See more: C#
Hi,
 
I'm trying to process live stock market data and insert and update a database with the results. I'm using the producer/consumer queue design pattern, which I have threaded.
 
Some of the calculations are VERY intensive and are degrading the performance of the database. I can't seem to figure out how to go about processing the data and inserting/updating the database with the results.
 
Can someone please give me advice on how to go about setting this up properly?
 
Thanks,
-Donald
Posted 28-Apr-11 10:01am
Comments
Albin Abel at 28-Apr-11 16:03pm
   
Good question. My 5
Nishant Sivakumar at 28-Apr-11 16:07pm
   
My 5 too.
AspDotNetDev at 28-Apr-11 16:10pm
   
Take it step by step. Give us a specific example of something that is too slow. In general, make sure you have the right indexes and use the query plan to figure out problem areas.
Monjurul Habib at 28-Apr-11 18:49pm
   
my 5.

Solution 1

This is a very general idea I am throwing in. If, and it's an important "if", part of the slowness is due to the managed code, you may want to move some of the more intensive calculations into a fast library written in C++. You could call into it via COM or C++/CLI (among other options).
Comments
Albin Abel at 28-Apr-11 16:16pm
   
My 5, good alternative
Nishant Sivakumar at 28-Apr-11 16:23pm
   
Thanks (comment threading is all messed up)
Nishant Sivakumar at 28-Apr-11 16:23pm
   
Thank you, Albin.
SAKryukov at 28-Apr-11 17:01pm
   
Makes sense, a 5. What do you think about my idea? Something tells me it can be more effective. It depends on those calculations and the rest of the architecture and business logic, though. Please see my answer. --SA
Nishant Sivakumar at 28-Apr-11 17:03pm
   
Already saw it, voted 5 too. Up to the OP to think of these approaches though.
Monjurul Habib at 28-Apr-11 18:48pm
   
my 5.

Solution 2

I can see that your heavy calculation part could compromise the total throughput of the system, but I don't see why it has to degrade the performance of the database. What is the bottleneck: the calculations themselves, or additional transactions for intermediate results? If the transactions are the bottleneck, you need to cache the data. I cannot believe you can do correct calculations on an ever-changing database anyway.
 
If you are already developing the consumer/producer queue approach, you can more or less easily move a big part of the processing onto another machine. I would suggest you dedicate a separate tier just to your calculation part. It can run on a separate machine and increase parallelism.
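As a rough sketch of that separation (not the OP's actual code), the calculation and database stages can each get their own bounded queue, so the heavy math never runs on the thread that talks to the database. `Tick`, `Result`, `Pipeline` and the two delegates are hypothetical names made up for this illustration; only `BlockingCollection<T>` and the Task calls are real .NET 4 APIs. Error handling is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical message types; the original post does not show its data model.
public sealed class Tick
{
    public string Symbol { get; set; }
    public decimal Price { get; set; }
}

public sealed class Result
{
    public string Symbol { get; set; }
    public decimal Value { get; set; }
}

public static class Pipeline
{
    // 'calculate' stands in for the expensive logic, 'store' for the database write.
    public static void Run(IEnumerable<Tick> source,
                           Func<Tick, Result> calculate,
                           Action<Result> store,
                           int workers)
    {
        // Bounded queues: a full queue blocks its producer, so a slow stage
        // shows up as back-pressure instead of unbounded memory growth.
        using (var ticks = new BlockingCollection<Tick>(1000))
        using (var results = new BlockingCollection<Result>(1000))
        {
            // Calculation stage: several workers share the tick queue.
            Task[] calculators = Enumerable.Range(0, workers)
                .Select(n => Task.Factory.StartNew(() =>
                {
                    foreach (Tick t in ticks.GetConsumingEnumerable())
                        results.Add(calculate(t));
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            // Storage stage: a single consumer keeps database writes serialized.
            Task writer = Task.Factory.StartNew(() =>
            {
                foreach (Result r in results.GetConsumingEnumerable())
                    store(r);
            }, TaskCreationOptions.LongRunning);

            foreach (Tick tick in source)
                ticks.Add(tick);
            ticks.CompleteAdding();      // no more input

            Task.WaitAll(calculators);
            results.CompleteAdding();    // no more results
            writer.Wait();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var feed = Enumerable.Range(1, 5)
            .Select(i => new Tick { Symbol = "SYM" + i, Price = i });

        // With several workers the output order is not guaranteed.
        Pipeline.Run(feed,
            t => new Result { Symbol = t.Symbol, Value = t.Price * 2 },
            r => Console.WriteLine(r.Symbol + " " + r.Value),
            Environment.ProcessorCount);
    }
}
```

If the tick queue stays full, the calculation stage is the slow one; if the result queue stays full, the database is. Watching those two queue lengths is a cheap first measurement before changing the architecture.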
 
—SA
Comments
Nishant Sivakumar at 28-Apr-11 17:00pm
   
Voted 5!
SAKryukov at 28-Apr-11 17:01pm
   
Thank you, Nishant. How could you be so fast? --SA
Nishant Sivakumar at 28-Apr-11 17:02pm
   
:-)
yesotaso at 28-Apr-11 17:17pm
   
Voted 5. I was thinking the same: "Intense calculation <-?-> Degrade database performance" :) Anyway, observing a producer filling a bottomless buffer or a consumer eating endless data may show where the performance problem lies.
SAKryukov at 28-Apr-11 20:57pm
   
Thank you very much. I agree with you. Actually, observing/profiling how much CPU is used by each tier is not enough. A workflow can be badly unbalanced, which defeats parallelism. I guess you're describing a case like that. --SA
yesotaso at 30-Apr-11 11:39am
   
Indeed I am. For instance, say you have some horses, carriages and a loading dock. To solve low performance you need to know how good your horses are, how balanced your carriages are, and how good your crane operators are. If your operator is drinking at work, or a horse is running itself to death with an empty carriage, or a huge carriage is killing the horses, you have a problem...
SAKryukov at 1-May-11 1:31am
   
Absolutely right. I can clearly see what you are talking about, especially after this morning, when I worked with real horses a bit; no problems though... :-) --SA
Donald Allen at 28-Apr-11 17:26pm
   
Nishant, I'm actually using your blocking queue class. I have 2 tiers that are using the blocking queue class: logic and database. The problem (bottleneck) is in the logic tier, which is running on its own thread. The processing in this thread causes my CPU usage to go above 90%.
SAKryukov at 28-Apr-11 20:54pm
   
Donald, are you talking to Nishant or to me? What you say confirms my idea. A blocking queue is a very good way of synchronizing with the data flow, which I hope you use, but it works between threads of the same process. You can do the same between processes, on the same machine or on a different one (so, in a scalable way), using sockets or remoting/WCF. How many CPUs/cores are you using? You can improve that, too. Are you close to the memory limit? In that case a lot of time could be burned on memory swapping... --SA
Monjurul Habib at 28-Apr-11 18:47pm
   
my 5.
SAKryukov at 28-Apr-11 20:48pm
   
Thank you, Monjurul. --SA
Reza Ahmadi at 25-Apr-12 9:09am
   
my 5!
SAKryukov at 25-Apr-12 10:40am
   
Thank you, Reza. --SA

Solution 3

Since I have done similar things at university, I think I know where your problem is.
For instance, I did some testing (C#) on just a few hundred thousand datasets on a SQL developer machine. The performance was damn slow compared with a Perl solution using in-memory and simple file-based storage.
 
I remember one weekend when my multithreaded app was blocking the whole multicore system and the university backbone. This Perl program, which I wrote some time ago, fetched stock data from servers around the world, comparing terabytes of data again and again, extracting, filtering, completing and extrapolating data, and even processing some images for visualization. One thing I can tell you is that a well-designed program with no database at all, run by a well-chosen scripting language like Perl (which is known for its fast parsing), can easily outperform a precompiled, high-level managed code application. It's about choosing the right tool for the task.
 
From my current point of view, for this kind of application (high data, high access, complex operations; I call it hidaco, and in my case image processing as well), the standard approach of database programming is a NO-GO! Personally, I think database performance is widely overestimated. Financial matters are most often put into transaction models for the sake of reliability, but this is a fatal choice when it comes to performance. My approach was to reduce database activity to the minimum (meaning zero; I wrote my own storage). For you, that means doing some caching and maybe creating your own kind of database or, better, using an in-memory database (see Google). Since recursive computations such as neural networks and AI (e.g. AForge or OpenCV) are much more intense than (well-defined and deterministic) financial math, computation is (IMHO) not the bottleneck, nor is managed code. Any SQL can become a bottleneck very easily. Try at least two in-memory databases (see the in-memory database (IMDB) article on Wikipedia for a list). If your performance increases, you should redesign your SQL statements to get the most out of them. I bet it will increase tremendously, but if it does not, take the C++ way (use an external financial math library with a C# wrapper) for performance testing.
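To make the caching suggestion concrete, here is a minimal write-behind sketch, assuming that only the latest value per key needs to reach the database. `WriteBehindCache` and its `saveBatch` delegate are hypothetical names invented for this example; the delegate stands in for whatever bulk insert/update is actually used.

```csharp
using System;
using System.Collections.Generic;

// Keeps only the latest value per key in memory and hands the database
// one batch per flush instead of one statement per update.
public sealed class WriteBehindCache
{
    private readonly object _gate = new object();
    private Dictionary<string, decimal> _pending = new Dictionary<string, decimal>();
    private readonly Action<IDictionary<string, decimal>> _saveBatch;

    // 'saveBatch' is a placeholder for the real bulk insert/update.
    public WriteBehindCache(Action<IDictionary<string, decimal>> saveBatch)
    {
        _saveBatch = saveBatch;
    }

    // Called by the calculation thread(s); never touches the database.
    public void Put(string key, decimal value)
    {
        lock (_gate) { _pending[key] = value; }
    }

    // Called from a timer or the database thread. Returns the batch size.
    public int Flush()
    {
        Dictionary<string, decimal> batch;
        lock (_gate)
        {
            if (_pending.Count == 0) return 0;
            batch = _pending;                               // take the current batch
            _pending = new Dictionary<string, decimal>();   // start a fresh one
        }
        _saveBatch(batch);   // slow I/O happens outside the lock
        return batch.Count;
    }
}

public static class CacheDemo
{
    public static void Main()
    {
        var cache = new WriteBehindCache(batch =>
            Console.WriteLine("saving " + batch.Count + " rows"));

        cache.Put("MSFT", 25.10m);
        cache.Put("MSFT", 25.12m);   // overwrites the pending MSFT value
        cache.Put("IBM", 168.40m);
        cache.Flush();               // prints "saving 2 rows"
    }
}
```

The trade-off is durability: anything still pending is lost if the process dies before the next flush, which may or may not be acceptable for the data in question.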
 
Another approach would be to expand your SQL Server / database capacity. There is a YouTube video about YouTube's sizing problems during different periods of growth out there. Just a hint, but it takes me to the last point ;-)
 
One last word on common pitfalls. I assume that processing live stock data means fetching data over some kind of network? Please be aware of any limits on connection handling, starting with the maximum number of simultaneous connections/sockets/ports, bandwidth issues, packet/session timeouts and misconfiguration (even on the physical side of the network), and whatever else may come.
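One way to stay under a connection limit, sketched here under the assumption that each fetch is a blocking call, is to gate the fetches with a semaphore. `ConnectionThrottle` is a hypothetical helper, and the limit itself has to come from the data provider or the network setup.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Lets at most 'maxConcurrent' fetches run at the same time.
public sealed class ConnectionThrottle
{
    private readonly SemaphoreSlim _slots;

    public ConnectionThrottle(int maxConcurrent)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    public T Run<T>(Func<T> fetch)
    {
        _slots.Wait();                  // block until a slot is free
        try { return fetch(); }
        finally { _slots.Release(); }   // always give the slot back
    }
}

public static class ThrottleDemo
{
    public static void Main()
    {
        var throttle = new ConnectionThrottle(2);

        Parallel.For(0, 6, i =>
        {
            throttle.Run(() =>
            {
                Thread.Sleep(100);      // stands in for a network request
                Console.WriteLine("fetched feed " + i);
                return i;
            });
        });
    }
}
```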
 
And last but not least, let us know.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 25 Apr 2012
Copyright © CodeProject, 1999-2014
All Rights Reserved.