
I am trying to process live stock market data and insert or update database records with the results. I'm using the producer/consumer queue design pattern, which I have threaded.

Some of the calculations are VERY intensive and are degrading the performance of the database. I can't seem to figure out how to go about processing the data and inserting/updating the database with the results.

Can someone please give me advice on how to set this up properly?

Albin Abel 28-Apr-11 16:03pm
Good question. My 5
Nish Nishant 28-Apr-11 16:07pm
My 5 too.
AspDotNetDev 28-Apr-11 16:10pm
Take it step by step. Give us a specific example of something that is too slow. In general, make sure you have the right indexes and use the query plan to figure out problem areas.
Monjurul Habib 28-Apr-11 18:49pm
my 5.

Solution 1

This is a very general idea I am throwing in. If, and it's an important "if", part of the slowness is due to the managed code, you may want to move some of the more intensive calculations into a fast library written in C++. You could call into it via COM or C++/CLI (among other options).
Albin Abel 28-Apr-11 16:16pm
My 5, good alternative
Nish Nishant 28-Apr-11 16:23pm
Thanks (comment threading is all messed up)
Nish Nishant 28-Apr-11 16:23pm
Thank you, Albin.
Sergey Alexandrovich Kryukov 28-Apr-11 17:01pm
Makes sense, a 5.
What do you think about my idea? Something tells me it can be more effective. It depends on those calculations and the rest of the architecture and business logic, though.
Please see my answer.
Nish Nishant 28-Apr-11 17:03pm
Already saw it, voted 5 too. Up to the OP to think of these approaches though.
Monjurul Habib 28-Apr-11 18:48pm
my 5.

Solution 2

I can see that your heavy calculation part could compromise the total throughput of the system, but I don't see why it has to degrade the performance of the database. What is the bottleneck: the calculations themselves or the additional transactions for intermediate results? If the transactions are the bottleneck, you need to cache the data. I cannot believe you can do a correct calculation on an ever-changing database anyway.

If you are already developing the consumer/producer queue approach, you can more or less easily move a big part of the processing onto another machine. I would suggest you dedicate a separate tier just to your calculation part. It can run on a separate machine and increase parallelism.
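
To make the tier separation concrete, here is a minimal sketch of a two-stage pipeline: one worker does the calculation, another collects results and hands them to the database in batches, so the database sees a few large writes instead of many small ones. It is written in Python for brevity and is not the OP's code; `calculate`, `run_pipeline`, the batch size and the queue sizes are all hypothetical. Note that Python threads do not run CPU-bound work in parallel, so this only illustrates the shape of the design.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def calculate(tick):
    # Placeholder for the expensive per-tick calculation (hypothetical).
    symbol, price = tick
    return symbol, price * 2


def logic_worker(ticks, results):
    """Consume raw ticks, run the calculation, pass results downstream."""
    while True:
        tick = ticks.get()
        if tick is SENTINEL:
            results.put(SENTINEL)  # tell the writer we are done
            return
        results.put(calculate(tick))


def db_writer(results, write_batch, batch_size=3):
    """Collect results and hand them to the database in batches."""
    batch = []
    while True:
        item = results.get()
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            write_batch(batch)
            batch = []
    if batch:
        write_batch(batch)  # flush the remainder


def run_pipeline(source, write_batch):
    # Bounded queues: a slow stage blocks the stage before it
    # instead of letting the buffer grow without limit.
    ticks = queue.Queue(maxsize=100)
    results = queue.Queue(maxsize=100)
    workers = [
        threading.Thread(target=logic_worker, args=(ticks, results)),
        threading.Thread(target=db_writer, args=(results, write_batch)),
    ]
    for w in workers:
        w.start()
    for tick in source:
        ticks.put(tick)
    ticks.put(SENTINEL)
    for w in workers:
        w.join()


# Usage: the "database" here is just a list that records each batch.
written = []
run_pipeline([("A", 1), ("B", 2), ("C", 3), ("D", 4)], written.append)
```

Because the two stages only talk through queues, the logic worker could later be moved to another process or machine without touching the writer.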

Nish Nishant 28-Apr-11 17:00pm
Voted 5!
Sergey Alexandrovich Kryukov 28-Apr-11 17:01pm
Thank you, Nishant.
How could you be so fast?
Nish Nishant 28-Apr-11 17:02pm
yesotaso 28-Apr-11 17:17pm
Voted 5. I was thinking the same: "Intense calculation <-?-> Degrade database performance" :) Anyway, observing a producer filling a bottomless buffer or a consumer eating endless data may show where the performance problem lies.
Sergey Alexandrovich Kryukov 28-Apr-11 20:57pm
Thank you very much. Agree with you.
Actually, observing/profiling how much CPU is used by each tier is not enough. A workflow can be badly unbalanced, which defeats parallelism. I guess you're describing a case like that.
yesotaso 30-Apr-11 11:39am
Indeed I am. For instance, you have some horses, carriages and a loading dock. To solve low performance you need to know how good your horses are, how balanced your carriages are, and how good your crane operators are. If your operator is drinking at work, or a horse is running itself to death with an empty carriage, or a huge carriage is killing horses, you have a problem...
Absolutely right. I can clearly see what you are talking about, especially after this morning when I worked with real horses a bit, no problems though... :-)
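
The point about watching the buffer can be sketched as a small helper: sample the depth of a bounded queue over time and see whether it sits near full or near empty. This is only an illustration in Python; the function name `diagnose` and the 0.8 / 0.2 thresholds are arbitrary assumptions, not anything measured in this thread.

```python
def diagnose(depth_samples, capacity):
    """Guess which side of a bounded queue is the bottleneck from sampled depths."""
    if not depth_samples or capacity <= 0:
        raise ValueError("need at least one sample and a positive capacity")
    # Average fill ratio across all samples, between 0.0 and 1.0.
    fill = sum(depth_samples) / (len(depth_samples) * capacity)
    if fill > 0.8:   # queue mostly full: the consumer cannot keep up
        return "consumer is the bottleneck"
    if fill < 0.2:   # queue mostly empty: the consumer is starved
        return "producer is the bottleneck"
    return "roughly balanced"
```

In a threaded Python program the samples could come from periodic calls to `queue.Queue.qsize()`, which is approximate but good enough for this kind of trend.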
d.allen101 28-Apr-11 17:26pm
Nishant, I'm actually using your blocking queue class. I have 2 tiers that use the blocking queue class: logic and database. The problem (bottleneck) is in the logic tier, which runs on its own thread. The processing in this thread drives my CPU usage above 90%.
Sergey Alexandrovich Kryukov 28-Apr-11 20:54pm
Donald, are you talking to Nishant or to me? What you say confirms my idea. A blocking queue is a very good way of synchronizing with the data flow, which I hope you use, but it works between threads of the same process. You can do the same between processes, on the same machine or a different one (so, in a scalable way), using sockets or remoting/WCF.
How many CPUs/cores are you using? You can improve that, too. Are you close to the memory limit? In that case a lot of CPU time could be burned on memory swapping...
Monjurul Habib 28-Apr-11 18:47pm
my 5.
Sergey Alexandrovich Kryukov 28-Apr-11 20:48pm
Thank you, Monjurul.
Reza Ahmadi 25-Apr-12 9:09am
my 5!
Sergey Alexandrovich Kryukov 25-Apr-12 10:40am
Thank you, Reza.

Solution 3

Since I have done similar things at university, I think I know where your problem is.
For instance, I did some testing (C#) on just a few hundred thousand data sets on a SQL developer machine. The performance was damn slow compared with a Perl solution using in-memory and simple file-based storage.

I remember one weekend my multithreaded app was blocking the whole multicore system and the university backbone. This Perl program, which I wrote some time ago, was fetching stock data from servers around the world, comparing terabytes of data again and again, extracting, filtering, completing and extrapolating data, and even processing some images for visualization. One thing I can tell you is that a well-designed program with no database at all, run by a well-chosen script interpreter like Perl (which is known for its fast parsing capability), can easily outperform a precompiled high-level managed code application. It's like choosing the right tool for a certain task.

From my current point of view, for this kind of application (high data, high access, complex operations - I call it hidaco - and in my case image processing), a standard approach of database programming is a NO-GO! Personally, I think database performance is greatly overestimated. Though financial matters are most often put into transaction models because of reliability, this is a fatal choice when it comes to performance considerations. Well, my approach was to reduce database activity to the minimum (meaning zero, I wrote my own). For you, that means doing some caching and maybe creating your own kind of database, or better, consider using an in-memory database (see Google). Since recursive computations like neural networks and AI (as in AForge or OpenCV) are much more intense than (well-defined and deterministic) financial math, computation is (IMHO) not a bottleneck, nor is managed code. Any SQL may become a bottleneck very easily. Try at least two in-memory databases (see IMDB on the wiki for a list). If your performance increases, you should redesign your SQL statements to get to the max. Well, I bet it will increase tremendously, but if it does not, take the C++ way (use an external financial math library with a C# wrapper) for performance testing.
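
As a quick way to try the in-memory database idea, here is a minimal sketch using Python's built-in `sqlite3` module with a `:memory:` database. SQLite is just a convenient stand-in, not one of the products meant above, and the `results` table, its columns and the `store_results` helper are hypothetical. The sketch also writes each batch in a single transaction, which is often where much of the gain comes from.

```python
import sqlite3


def store_results(conn, rows):
    """Insert or update many (symbol, value) rows in a single transaction."""
    with conn:  # commits once at the end, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO results (symbol, value) VALUES (?, ?)",
            rows,
        )


# ":memory:" keeps the whole database in RAM; nothing touches the disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (symbol TEXT PRIMARY KEY, value REAL)")

store_results(conn, [("A", 1.0), ("B", 2.0)])  # first batch: two inserts
store_results(conn, [("A", 1.5)])              # second batch: updates "A"

rows = conn.execute(
    "SELECT symbol, value FROM results ORDER BY symbol"
).fetchall()
```

Timing the same workload against the real database and against a setup like this gives a rough idea of how much of the slowdown is the database itself.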

Another approach would be to expand your SQL Server / database capabilities. There is a YouTube video out there about YouTube's sizing problems during different periods of growth - just a hint, but it takes me to the last point ;-)

One last word on common pitfalls. I assume that processing live stock data means fetching data over some kind of network!? Please be aware of any limits on connection handling, starting with maximum simultaneous connections/sockets/ports, bandwidth issues, packet/session timeouts and misconfiguration (even on the physical side -> network), and whatever else may come.

And last but not least, let us know.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
