|
I have been using a very fast, proprietary Java system for many years, well before .NET came to be. The average Java implementation being slow is due to the fact that anyone can create a JVM (just as anyone can write a compiler); it takes a professional, performance-oriented approach to create a good one. "It works, let's ship it" isn't good enough. Not here, not anywhere performance matters.
Luc Pattyn
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
No: nothing can even approach plain C performance (well, assembly can be better).
I'm talking about well-written Java applications vs well-written C ones.
Believe me, there's a reason why the speed of light is c.
On the other hand, Java has many, many good features, but, you know, it is the "compile once, slow down everywhere" language (well, everywhere but the proprietary implementation you experienced...)
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
-- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
-- Iain Clarke
[My articles]
modified on Friday, September 25, 2009 3:45 PM
|
|
|
|
|
Unmanaged code kills managed code for this type of thing.
We do all of our stuff in C++, with a bit of assembly for the MMX/SSE optimizations. Sometimes we use the MMX/SSE intrinsics (which are essentially macros), but that's just being lazy; hand-coded assembly can beat the intrinsics.
Nothing will beat an unmanaged pointer zipping across the image data for sheer speed.
Of course, a decent algorithm can do wonders, too.
|
|
|
|
|
Why isn't anyone talking about MATLAB?
|
|
|
|
|
Hello good people. I am not sure if this should go here. I am a student who is interested in developing a search engine that indexes pages from my country. I have been doing research on which algorithm to use for some time now, and I have found HITS and PageRank to be the best out there. I have chosen to go with PageRank since it is more stable than the HITS algorithm (so I read).
I have found countless articles and university research papers about PageRank, but my problem is that I do not understand most of the mathematical symbols that form the algorithm in these papers. Currently, I cannot understand how the Google matrix (the irreducible, stochastic matrix) is calculated, and I do not seem to understand the algorithm used.
I did my reading from the articles below:
PDF 1
PDF 2
Please, I need you to help me go through it. I need a basic explanation (examples would be nice) with fewer mathematical symbols.
Thanks in advance.
|
|
|
|
|
kentipsy wrote: Please, I need you to help me go through it. I need a basic explanation (examples would be nice) with fewer mathematical symbols.
I don't think you will get a tutorial on statistics and matrix arithmetic here. I suspect that you will need to study the mathematics quite closely if you want any possibility of implementing this sort of product. However, if you want a ready-made solution, you could always try one of these[^].
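That said, a tiny example may take some of the mystery out of the symbols. Here is a sketch of PageRank's power iteration on a made-up three-page graph; the graph, the 0.85 damping factor, and all names are illustrative assumptions, not taken from the linked papers:

```c
#define N 3           /* pages in a made-up toy web graph */
#define DAMPING 0.85  /* the damping factor d from the PageRank papers */
#define ITERS 50

/* Column-stochastic link matrix for the toy graph:
   page 0 links to pages 1 and 2, page 1 links to page 2, page 2 links to page 0.
   G[i][j] = probability of following a link from page j to page i. */
static const double G[N][N] = {
    {0.0, 0.0, 1.0},  /* page 0 is reached only from page 2           */
    {0.5, 0.0, 0.0},  /* page 1 gets one of page 0's two links        */
    {0.5, 1.0, 0.0},  /* page 2 is linked from both page 0 and page 1 */
};

/* Power iteration on the Google matrix: r_new = d*G*r + (1-d)/N.
   The (1-d)/N "random jump" term is what makes the matrix irreducible
   and stochastic, so the iteration converges. */
void pagerank(double r[N]) {
    for (int i = 0; i < N; i++)
        r[i] = 1.0 / N;                 /* start with a uniform rank */
    for (int it = 0; it < ITERS; it++) {
        double next[N];
        for (int i = 0; i < N; i++) {
            double sum = 0.0;
            for (int j = 0; j < N; j++)
                sum += G[i][j] * r[j];
            next[i] = DAMPING * sum + (1.0 - DAMPING) / N;
        }
        for (int i = 0; i < N; i++)
            r[i] = next[i];
    }
}
```

After 50 iterations the ranks settle to roughly 0.39, 0.21, 0.40 for pages 0, 1, 2: page 2 ranks highest because two pages link to it.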
|
|
|
|
|
Hi, I have a problem.
It's like this:
This is a multi-threaded program: one master thread processes the messages generated by child threads (a typical producer-consumer model). I create a large shared buffer; all the children put their messages into this buffer if they can get the buffer lock. When a child has put its message into the buffer completely, it notifies the master that there is a message waiting to be processed, so the master acquires the buffer lock, holds it, and begins to deal with the messages.
The problem is that when a lot of child threads start up, there are a lot of messages waiting to be processed, so the master thread holds the buffer lock for a while, which is not acceptable, as no child thread can put a message into it during that time. This is apparently what causes the program's poor performance.
So what I am trying to do is separate the large shared buffer into an array of small buffers, so that children can put their messages into any available buffer and the master can process the messages in each small buffer one by one.
But I don't know whether there is a good, mature algorithm or technique for managing such a buffer array.
Any good idea is welcome! Thanks.
BTW, I'm not sure whether I have made the problem clear. (English is not my mother tongue; forgive me if it is not clear enough.) If you are interested in this problem, please feel free to contact me and ask for more details by email or
MSN: donniehan@live.cn
Thanks again!
|
|
|
|
|
Hi,
if you have multiple producers and a single consumer, then the following seems logical:
1.
The normal way of communication would be through a single queue; all producers enqueue messages, and the consumer dequeues them. Other means of communication could be used if there is a need for them, e.g. when message priorities have to be established.
2.
The queue (or other communication means) should be locked for as short a time as possible; the consumer only needs to lock the queue while dequeueing, and the processing of a dequeued message should happen with the queue unlocked (otherwise the queue would be locked all the time).
3.
On average the consumer should be faster than each of the producers, as it has to handle all the messages, whereas each producer only needs to handle its share of them. This can be implemented by giving the consumer a higher thread priority (tricky on Windows), or by limiting the size of the queue (or other communication means), so that when the queue gets full, the producers get blocked.
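The three points above can be sketched in C with a bounded queue (a sketch only, assuming POSIX threads; the queue size and all names are illustrative). The lock is held just for the enqueue/dequeue itself, and a full queue blocks the producers as point 3 describes:

```c
#include <pthread.h>
#include <stddef.h>

#define QSIZE 64  /* bounding the queue blocks producers when the consumer lags */

typedef struct {
    void *items[QSIZE];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_full, not_empty;
} queue_t;

void queue_init(queue_t *q) {
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer side: blocks only while the queue is full. */
void enqueue(queue_t *q, void *msg) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QSIZE)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[q->tail] = msg;
    q->tail = (q->tail + 1) % QSIZE;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: the lock is held only for the dequeue itself;
   the caller processes the message with the queue unlocked. */
void *dequeue(queue_t *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    void *msg = q->items[q->head];
    q->head = (q->head + 1) % QSIZE;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return msg;
}
```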
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
Thanks for your reply and the idea.
But it doesn't seem to solve my problem. Firstly, I can't raise the master thread's priority; the child threads are actually more important to me, because each thread serves a client during its life cycle. So, as you said, when the master holds the queue lock, it keeps the buffer locked for a while, and that makes the client threads wait for a long time, which is exactly my problem.
Anyway, thanks for your answer! I will vote for you when I know how to vote!
|
|
|
|
|
donniehan wrote: the child threads are more important for me
that may well be, but then you MUST solve the locking problem.
donniehan wrote: Because each thread serves a client
that doesn't tell me much. If it were, say, serial ports where new data comes in all the time, risking data loss, then yes.
Whatever the case, if your consumer can't cope, the overall system will fail in the end, unless the lifespan of the app is limited, so you can just buffer sufficient data and have the consumer catch up after all the producers are done.
Could you show some code? I've got a hunch something is not the way it should be.
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
My code is based on the PostgreSQL source. Do you know the PostgreSQL source well?
The child thread serves the client;
that means it has a lot of other work to do and has to respond to the client. It is not acceptable for a child thread to halt and give no response for a long time. What I am trying to do is allocate several small shared buffers for all the children to use. When a child needs a buffer, it finds an available one in the buffer array, puts its data into it, releases it quickly, and gets on with its other jobs. The master thread processes all the messages in the buffer array one by one. It takes a very short time for each message to be handled by the master, and the buffers are large enough for their use. That's the situation.
|
|
|
|
|
Hi,
no, I have no experience with PostgreSQL, but I don't think that matters much.
donniehan wrote: It takes very short time for each msg to be handled by master.
That is very good. It will be key to your system working well.
My simple design would consist of:
- a single Message class that will be instantiated a number of times; the instances will be recycled without loading the garbage collector.
- an "emptyCollection" holding a number of empty message objects; think of it as a queue, although other kinds of collections could work too.
- an "emptyLock" object used to guard operations on emptyCollection;
- a "filledCollection" holding a number of filled message objects; again think of it as a queue.
- a "filledLock" object used to guard operations on filledCollection.
- emptyCollection+emptyLock and filledCollection+filledLock could be two instances of a single class MySafeCollection.
The initialization code would create a sufficient number of message instances and put them in the emptyCollection; the filledCollection is still empty.
Then the system really starts and performs these steps:
1. a producer obtains a message object quickly (getting one from emptyCollection; this must be protected by locking on emptyLock, which will never take long);
2. the producer fills the message object;
3. the producer puts its filled message into the filledCollection, using the filledLock, which again will not take long;
4. the consumer obtains a filled message object from the filledCollection (again briefly using the filledLock);
5. the consumer processes the message; as this is a short operation and should never become a bottleneck, I would consider doing this on a thread with slightly higher priority;
6. the consumer returns the message to the emptyCollection, using the emptyLock.
Possible refinements:
A. You still need a signaling mechanism: each time a producer adds something to the filledCollection, it should send a signal to the single consumer, so it goes and fetches one (or more) messages from the filledCollection. This avoids the need for a polling loop in which the consumer would waste CPU cycles when no messages got filled.
B. A producer needing a message object and not finding one in the "emptyCollection" could create a new one; that would make the initialization unnecessary, however when things go terribly wrong, you may end up with a huge number of messages in the filledCollection.
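A minimal C sketch of the empty/filled recycling scheme (illustrative only: it uses simple LIFO free lists rather than queues, omits the signaling mechanism of refinement A, and all names are made up):

```c
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 16

typedef struct message {
    struct message *next;
    int payload;              /* stand-in for the real message body */
} message_t;

/* One lock-guarded list; used once for "empty" and once for "filled". */
typedef struct {
    message_t *head;
    pthread_mutex_t lock;
} safe_list_t;

static message_t pool[POOL_SIZE];
static safe_list_t empty_list, filled_list;

static void list_push(safe_list_t *l, message_t *m) {
    pthread_mutex_lock(&l->lock);
    m->next = l->head;
    l->head = m;
    pthread_mutex_unlock(&l->lock);
}

static message_t *list_pop(safe_list_t *l) {
    pthread_mutex_lock(&l->lock);
    message_t *m = l->head;
    if (m) l->head = m->next;
    pthread_mutex_unlock(&l->lock);
    return m;                 /* NULL when the list is empty */
}

/* Initialization: all messages start on the empty list. */
void pool_init(void) {
    pthread_mutex_init(&empty_list.lock, NULL);
    pthread_mutex_init(&filled_list.lock, NULL);
    empty_list.head = filled_list.head = NULL;
    for (int i = 0; i < POOL_SIZE; i++)
        list_push(&empty_list, &pool[i]);
}

/* Producer: grab an empty message, fill it, hand it to the consumer. */
int produce(int payload) {
    message_t *m = list_pop(&empty_list);
    if (!m) return 0;         /* pool exhausted; caller must retry or wait */
    m->payload = payload;     /* step 2: fill with no lock held */
    list_push(&filled_list, m);
    return 1;
}

/* Consumer: take a filled message, process it, recycle it. */
int consume(int *out) {
    message_t *m = list_pop(&filled_list);
    if (!m) return 0;
    *out = m->payload;        /* step 5: process with no lock held */
    list_push(&empty_list, m);
    return 1;
}
```

A producer that gets nothing back from the empty list would wait or retry, which is exactly the back-pressure described in point 3 earlier.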
I have implemented the above scheme numerous times inside embedded systems (i.e.using a proprietary operating system, not .NET or Windows). I know it works very well and is very performant.
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
Hi,
thanks for your great idea.
But I still have a few problems:
1st, I can only use pure C for this (well, that won't be the real problem; I can use structs instead).
2nd, the size of each message is not fixed: a message may be very small (16 B) or very big (>10 KB). If I apply your idea to my system, it will fragment the shared memory into tiny pieces, so I must implement a function to recycle those pieces, and that also costs time. When the system is very busy and the producers create many messages, the system must call the recycle function frequently to avoid fragmentation, and that costs a lot of time. ----------> That's the big problem for me!
The shared memory is a block of contiguous memory allocated at system startup. The maximum size of this shared memory is fixed; allocating more shared memory after system startup is not allowed.
It is a quite annoying problem, isn't it?
|
|
|
|
|
Hi,
1.
language does not really matter; I've done what I suggested in C many times.
2.
when the message size isn't fixed, I find it easiest to use a few different fixed sizes; i.e. I would define appropriate values size1 < size2 < size3 (with size3 large enough to hold any message), preallocate a number of buffers for each size, then use as many emptyCollections as there are sizes, and only one filledCollection.
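The size-class selection in point 2 can be sketched as a small helper (the three sizes here are made-up examples, not recommendations):

```c
#include <stddef.h>

/* Three fixed buffer sizes; SIZE3 must hold the largest possible message. */
enum { SIZE1 = 64, SIZE2 = 1024, SIZE3 = 16384 };

/* Map a message length to the index of the emptyCollection to draw from. */
int bucket_for(size_t len) {
    if (len <= SIZE1) return 0;
    if (len <= SIZE2) return 1;
    if (len <= SIZE3) return 2;
    return -1;   /* oversized: reject, or split at a higher level */
}
```

Since every buffer in a bucket has the same size, recycling one never fragments the shared memory.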
3.
Not sure why you would need contiguous memory, unless some DMA-based peripheral is involved.
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
modified on Monday, September 21, 2009 8:06 AM
|
|
|
|
|
hehe~~~
Thanks again for your help!
I have almost solved this problem.
|
|
|
|
|
I fixed an HTML problem in my previous post.
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
Can anyone guide me on how to write the circular Hough transform?
|
|
|
|
|
This is used to recognize a circle in an image with incomplete data and/or noise. Every white pixel (assuming a black background) is evidence for a circle that goes through it. You go through all the white pixels and accumulate evidence for all the possible circles. The circle with the most evidence is the result of the Hough transform.
When you find a white pixel, it could be on many circles of radius r. In fact, every point that's distance r away from the white pixel COULD be the center of the circle. In other words, a white pixel is evidence for the center of the circle at all points r away from the pixel, i.e. a new circle with center at the white pixel and radius r.
So you start with an array with all cells set to zero. For every white pixel, increment all array cells at distance r from the white pixel. When you're done, the cell with the highest score is probably the center of the circle.
|
|
|
|
|
Great explanation. What would you suggest for finding circles of arbitrary size though, so radius r is unknown?
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
In that case, you have a three-dimensional space for the circle parameters: center X, center Y, and r.
So instead of a two-dimensional array, you have a three-dimensional array where the extra dimension is for the radius.
When you hit a white pixel, you increment the cells in a circle at EACH level in the r dimension (with the radius given by the r level). As in the two-dimensional case, the cell with the highest score gives the parameters for the circle with the most evidence. But here the r level also gives you the radius.
|
|
|
|
|
I understand that; however, you are now using a lot of memory (the array has one more dimension than the original image) and incrementing a huge number of counters. It seems like it will take forever; I would hope there is something with better performance...
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
This is a bit of a hack, but it worked fine for me and was quicker than a Hough transform.
I was supposed to detect black-framed circles in a microscopy image.
What I did was generate a circle template, starting at 4 pixels in diameter and increasing to the width of the image. At each step I correlated my circle template with the image, and the highest-scoring correlation output was stored. So in the end you know where the circles are and what size they are.
I did this in Matlab, where it was actually easy to do; it was quicker than the Hough transform detection and normally more accurate as well.
Cheers
You have the thought that modern physics just relies on assumptions, which somehow depend on the smile of a cat that isn't there. (Albert Einstein)
|
|
|
|
|
These days memory is cheap, and it's possible to draw a circle quickly using only fast integer operations: http://en.wikipedia.org/wiki/Midpoint_circle_algorithm[^].
If you have bounds on the circle's radius, there are a couple of optimizations:
1. It limits the size of the 3D array in the r dimension, and
2. You can preprocess the image to remove convex blobs with a radius smaller than the minimum radius. These blobs are just noise, and removing them eliminates lots of white pixels, which reduces processing time.
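The midpoint circle algorithm behind that link can be sketched with integer arithmetic only (illustrative grid size and names; a real Hough implementation would increment accumulator cells instead of marking a grid):

```c
#define GRID 32   /* enough for the radii used here */

/* Midpoint (Bresenham) circle: mark all pixels of a circle of radius r
   centered at (cx, cy), using only integer adds and compares. */
void midpoint_circle(int cx, int cy, int r, unsigned char grid[GRID][GRID]) {
    int x = r, y = 0;
    int err = 1 - r;                 /* decision variable */
    while (x >= y) {
        /* mirror the first octant into all eight */
        grid[cy + y][cx + x] = 1; grid[cy + y][cx - x] = 1;
        grid[cy - y][cx + x] = 1; grid[cy - y][cx - x] = 1;
        grid[cy + x][cx + y] = 1; grid[cy + x][cx - y] = 1;
        grid[cy - x][cx + y] = 1; grid[cy - x][cx - y] = 1;
        y++;
        if (err < 0) err += 2 * y + 1;
        else { x--; err += 2 * (y - x) + 1; }
    }
}
```

With this as the inner loop, each white pixel's vote circle costs a few dozen integer operations instead of 360 sin/cos evaluations.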
|
|
|
|
|
Yep. Actually we did the circle algorithm once in hardware, eons ago, using simple TTL logic. A priority encoder (a 74148, IIRC) works wonders here (today I would use FPGAs, not TTL).
And another reply hinted at radius estimation. So things are starting to look better and better.
Luc Pattyn
Have a look at my entry for the lean-and-mean competition; please provide comments, feedback, discussion, and don’t forget to vote for it! Thank you.
Local announcement (Antwerp region): Lange Wapper? Neen!
|
|
|
|
|
Luc,
That was also my immediate question. What might be done is:
- Scan the image for the white (or black) pixels and set the array values to 1 at each pixel position.
- Look for adjacent pixels in that array (horizontally, vertically, or diagonally), either immediately adjacent or separated by 1 pixel (or two or three or ... if a very sparse image).
- Once such a pair is found, walk the adjacent pixels at both ends to extend the string, then create the equation of a line through the endpoints (a chord) and, more importantly, the equation of the perpendicular bisector of the chord, saving this bisector's equation (y = mx + b, slope-intercept form).
- Watch out for a joined line (a complete circular arc; on a clear disk you can seek forever): if found, take the lowest (or highest) 1/4 of the pixels as one arc and the leftmost (or rightmost) 1/4 of the pixels as the other arc (saving these two arcs in a special array of pairs).
- Erase the found points in the array, and look for other arcs (connected or not).
- Now for the fun: calculate the intersections of the bisectors, starting with the circular pairs (giving the center of the circle, and the distance from the center to all the ends, averaged, as the radius), then move on to the disjointed arcs.
- For the disjointed arcs, calculate a possible center for all combinations and choose the closest groupings as a center; eliminate these arcs and repeat (there may be more than one circle in the image).
Caution: since this is an image, a circular disk could appear as an ellipse rather than a circle.
Think this might work?
Dave.
Edit: all of the other answers came in while I was writing this up!
modified on Wednesday, September 16, 2009 11:29 AM
|
|
|
|
|