Introduction
Multithreading has always impressed me. Being able to do many things at once is really impressive, but we can't do it without the proper hardware. Until now, all we could do was move the heavy CPU work to a background thread and thus leave the user interface unblocked. I wanted to go further than this, exploiting the capabilities of the newest CPUs (now in users’ hands), and build a real working multithreading example, that is, one with more than one thread running in the background.
That is what this article is all about, and I have to say that the results have impressed me. I hope you will find it interesting. On a multi-CPU server with 4 CPUs, the improvement is about 280% (for a CPU-intensive job), and on normal machines with non-CPU-intensive jobs, the performance improvement can range from 500% to 1000%…
Background
There are a lot of introductory articles on multithreading for .NET 2.0 and, I have to say, they have helped me a lot. What I have used is the .NET 2.0 BackgroundWorker component (but there are code implementations of it for .NET 1.1 that do the job).
Here I put some links to these articles:
These are must-read articles if you're new to the threading world, or if you're not but want to get up to date with .NET 2.0’s BackgroundWorker component...
Problems Addressed
Any kind of problem, be it a processor-intensive task or a normal one:
- Processor-intensive task: It can be divided into two or more threads, each of which will be run by its own CPU (ideally multiplying the performance by the number of processors).
- Normal task: Each “normal” task, if done sequentially, has a “delay” whenever it reads from or writes to the file system, accesses data storage or uses a web service. All of this is time that normally goes unused and counts as a delay for the user response or for the task itself. With multithreaded job management, this waiting time is not lost: it is assigned to and used by parallel tasks. For example, on a batch of 100 jobs with 100 ms of delay each, going from a 1-thread model to a 20-thread model improved performance by about 1000%.
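To get a feeling for these numbers, here is a minimal sketch (not the article's demo code) that simulates latency-bound jobs with Thread.Sleep and spreads them over a given number of plain threads. The LatencyDemo class and its Run method are names made up for this illustration. In the ideal case, 100 jobs of 100 ms each take 100 × 100 ms = 10 s on one thread and about 5 × 100 ms = 0.5 s on 20 threads; a real measurement will show a smaller gain because of scheduling overhead.

```csharp
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, for illustration only.
public static class LatencyDemo
{
    // Runs `jobs` simulated waits of `delayMs` each, spread over `threads` threads,
    // and returns the elapsed wall-clock time in milliseconds.
    public static long Run(int jobs, int delayMs, int threads)
    {
        int next = 0; // number of jobs claimed so far
        Stopwatch watch = Stopwatch.StartNew();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            pool[t] = new Thread(() =>
            {
                // Each thread keeps claiming jobs until none are left.
                while (Interlocked.Increment(ref next) <= jobs)
                {
                    // Stands in for a disk, database or web service wait.
                    Thread.Sleep(delayMs);
                }
            });
            pool[t].Start();
        }
        foreach (Thread th in pool)
        {
            th.Join();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

Calling LatencyDemo.Run(100, 100, 1) and then LatencyDemo.Run(100, 100, 20) should show the difference on your own machine.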
Let’s say the problem is building the blocks of a web page. Instead of building them sequentially, taking 1-4 seconds to have all the sections built (the banner, the users online, the latest articles, the most voted tools, etc.), what if we could build all of them asynchronously and send them to the user once they are built? We would overlap the waiting time of the web service calls and the database calls, saving a lot of precious time… and more! These calls would be serviced faster, which would reduce the chance of calls coinciding and improve response times substantially. Interesting?
The Solution is Already Here...
It is called BackgroundWorker and, for our purposes, we will subclass it. BackgroundWorker helps us set up a “Worker” that does a job asynchronously.
What we want to do is set up a Factory (oops, no design pattern meaning intended here) where one kind of job will be done. That means we will have one kind of job, some process, and some workers that know how to do this job.
Of course, we will need a manager to assign the jobs to the workers and to decide what to do when they reach a step of the job and when they finish it. And yes, we also want the manager to be able to tell the workers to stop. They have to rest too! And when the manager says so, they must stop!
We will explain things from the bottom up, beginning with the Worker, and then we will see the Manager.
The Worker
It is a subclass of BackgroundWorker. In the constructor, we set two BackgroundWorker properties to true, WorkerReportsProgress and WorkerSupportsCancellation, which enable us to do what the names say: report progress (normally to a UI) and cancel the job (and subsequently all jobs) if it takes too long. We also assign an ID number to each worker, because the manager needs to keep track of them. Here’s the code:
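using System.ComponentModel;

public class MTWorker : BackgroundWorker
{
    #region Private members
    private int _idxLWorker = 0;
    // Assumed member: not part of the original excerpt, but AssignWorkers sets W.JobId.
    private int _jobId = 0;
    #endregion

    #region Properties
    public int IdxLWorker
    {
        get { return _idxLWorker; }
        set { _idxLWorker = value; }
    }

    public int JobId
    {
        get { return _jobId; }
        set { _jobId = value; }
    }
    #endregion

    #region Constructor
    public MTWorker()
    {
        // Allow progress reports and cancellation requests.
        WorkerReportsProgress = true;
        WorkerSupportsCancellation = true;
    }

    public MTWorker(int idxWorker)
        : this()
    {
        _idxLWorker = idxWorker;
    }
    #endregion

    // The OnDoWork override shown further on also belongs to this class.
}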
We also override another of BackgroundWorker’s methods, in fact the most interesting one, which does the real job. Its name is OnDoWork, and it is the method that is called when we launch the multithreaded task. Here we manage the start-up of the task, its progress, its cancellation and its completion. I have added two possible jobs: a “Normal” one, which uses delays to emulate the waiting time of a non-CPU-intensive task that has to wait for file system, network, database or web service calls, and a CPU-intensive one: calculating the digits of pi. You can play with it and see the results of changing the delay and increasing the number of threads (oops, I meant the number of workers…).
Here is the OnDoWork code:
// Requires: using System.Threading; (for Thread.Sleep)
protected override void OnDoWork(DoWorkEventArgs e)
{
    int digits = (int)e.Argument;
    double tmpProgress = 0;
    int Progress = 0;
    String pi = "3";
    this.ReportProgress(0, pi);
    Boolean bJobFinished = false;
    int percentCompleteCalc = 0;
    // Any value other than "NORMAL" selects the CPU-intensive pi calculation.
    String TypeOfProcess = "NORMAL";
    while (!bJobFinished)
    {
        if (TypeOfProcess == "NORMAL")
        {
            #region Normal Process simulation, putting a time delay to emulate a wait-for-something
            while (!bJobFinished)
            {
                // Honor a cancellation request before each step.
                if (CancellationPending)
                {
                    e.Cancel = true;
                    return;
                }
                Thread.Sleep(250);
                percentCompleteCalc = percentCompleteCalc + 10;
                if (percentCompleteCalc >= 100)
                    bJobFinished = true;
                else
                    ReportProgress(percentCompleteCalc, pi);
            }
            #endregion
        }
        else
        {
            #region Pi Calculation - CPU intensive job, beware of it if not using threading ;) !!
            if (digits > 0)
            {
                pi += ".";
                for (int i = 0; i < digits; i += 9)
                {
                    // NineDigitsOfPi is a helper class that is not shown in this article.
                    int nineDigits = NineDigitsOfPi.StartingAt(i + 1);
                    int digitCount = System.Math.Min(digits - i, 9);
                    string ds = System.String.Format("{0:D9}", nineDigits);
                    pi += ds.Substring(0, digitCount);
                    // Progress as a percentage of the requested digits.
                    tmpProgress = (i + digitCount);
                    tmpProgress = (tmpProgress / digits);
                    tmpProgress = tmpProgress * 100;
                    Progress = Convert.ToInt32(tmpProgress);
                    ReportProgress(Progress, pi);
                    if (CancellationPending)
                    {
                        bJobFinished = true;
                        e.Cancel = true;
                        return;
                    }
                }
            }
            bJobFinished = true;
            #endregion
        }
    }
    ReportProgress(100, pi);
    e.Result = pi;
}
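As a rough sketch of how such a worker is driven from the outside, the following console example starts one worker, listens to its events and optionally cancels it. DemoWorker is a hypothetical, simplified stand-in for MTWorker (ten short sleeps instead of the two jobs above), and WorkerDemo.Run is a name invented for this illustration. Note that in a console program the events arrive on thread pool threads, while in a Windows Forms application started from the UI thread they arrive on the UI thread.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical stand-in for MTWorker: ten short sleeps, reporting progress after each.
public class DemoWorker : BackgroundWorker
{
    public DemoWorker()
    {
        WorkerReportsProgress = true;
        WorkerSupportsCancellation = true;
    }

    protected override void OnDoWork(DoWorkEventArgs e)
    {
        int delayMs = (int)e.Argument;
        for (int step = 1; step <= 10; step++)
        {
            if (CancellationPending)
            {
                e.Cancel = true;
                return;
            }
            Thread.Sleep(delayMs);
            ReportProgress(step * 10);
        }
        e.Result = "done";
    }
}

public static class WorkerDemo
{
    // Starts one worker, optionally cancels it right away,
    // and returns "done", "cancelled" or "error".
    public static string Run(int delayMs, bool cancel)
    {
        string outcome = null;
        using (ManualResetEvent finished = new ManualResetEvent(false))
        using (DemoWorker worker = new DemoWorker())
        {
            worker.ProgressChanged += (s, e) =>
                Console.WriteLine("progress: " + e.ProgressPercentage + "%");
            worker.RunWorkerCompleted += (s, e) =>
            {
                // e.Result must not be read when the job was cancelled or failed.
                if (e.Cancelled) outcome = "cancelled";
                else if (e.Error != null) outcome = "error";
                else outcome = (string)e.Result;
                finished.Set();
            };
            worker.RunWorkerAsync(delayMs);
            if (cancel) worker.CancelAsync();
            finished.WaitOne(); // block until RunWorkerCompleted has fired
        }
        return outcome;
    }
}
```

The important detail is the order of the checks in the completion handler: reading e.Result after a cancellation or an exception throws, so e.Cancelled and e.Error are tested first.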
The Manager
This is the most fun part, and I am pretty sure it can be improved a lot; any comments or improvements are welcome! What it does is create and configure a Worker for each thread and then assign the jobs to them. For now, the only parameter it passes to the worker is a number, but it could pass a class or struct with the whole job definition… A possible upgrade would be to implement a Strategy pattern here for choosing how to do the internal job.
We then call InitManager, which configures the jobs (their number and the specs of the jobs to do) and then creates an array of MultiThread Workers and configures them. The configuration code follows:
private void ConfigureWorker(MTWorker MTW)
{
    MTW.ProgressChanged += MTWorker_ProgressChanged;
    MTW.RunWorkerCompleted += MTWorker_RunWorkerCompleted;
}
This way, the Worker’s progress and completion events are linked to handler methods held by the Manager. Note that with a Strategy pattern implemented, we could assign these to the proper manager for these methods.
Then we have the most important method, AssignWorkers. It checks all the workers and, if any of them is not working, assigns it a job, provided there are jobs left to process. When it finishes checking the workers, if no worker is working (and thus we have not assigned any job either), that means the end of the jobs. No more to do, everything’s done!
Here’s the code:
public void AssignWorkers()
{
    Boolean ThereAreWorkersWorking = false;
    foreach (MTWorker W in _arrLWorker)
    {
        if (W.IsBusy == false)
        {
            // Idle worker: hand it the next job, if any are left.
            if (_iNumJobs > _LastSentThread)
            {
                _LastSentThread = _LastSentThread + 1;
                W.JobId = _LastSentThread;
                W.RunWorkerAsync(_iPiNumbers);
                ThereAreWorkersWorking = true;
            }
        }
        else
        {
            ThereAreWorkersWorking = true;
        }
    }
    // Nobody is working and nothing was assigned: all jobs are done.
    if (ThereAreWorkersWorking == false)
    {
        Button BtnStart = (Button)FormManager.Controls["btnStart"];
        Button BtnCancel = (Button)FormManager.Controls["btnCancel"];
        BtnStart.Enabled = true;
        BtnCancel.Enabled = false;
        MessageBox.Show("Hi, I'm the manager to the boss (user): " +
            "All Jobs have finished, boss!!");
    }
}
We call this method whenever a job finishes. This ensures that all jobs get completed.
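The assign-on-completion loop can be sketched in a console program as follows. This is an illustration under assumptions, not the article's MTManager: MiniManager and its members are invented names, the jobs are simulated with a short sleep, and plain BackgroundWorker instances are used instead of MTWorker. Because a console program has no UI thread to serialize the completion events, the sketch protects its counters with a lock and detects the end by counting completed jobs rather than by checking that no worker is busy.

```csharp
using System.ComponentModel;
using System.Threading;

// Hypothetical console-only manager, for illustration.
public class MiniManager
{
    private readonly object _gate = new object();
    private readonly BackgroundWorker[] _workers;
    private readonly ManualResetEvent _allDone = new ManualResetEvent(false);
    private readonly int _numJobs;
    private int _lastSent;   // number of jobs handed out so far
    private int _completed;  // number of jobs finished so far

    public MiniManager(int numWorkers, int numJobs)
    {
        _numJobs = numJobs;
        _workers = new BackgroundWorker[numWorkers];
        for (int i = 0; i < numWorkers; i++)
        {
            BackgroundWorker w = new BackgroundWorker();
            w.DoWork += (s, e) => Thread.Sleep((int)e.Argument); // simulated job
            w.RunWorkerCompleted += OnCompleted;
            _workers[i] = w;
        }
    }

    // Every finished job triggers a new round of assignments.
    private void OnCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        lock (_gate)
        {
            _completed++;
            AssignWorkers();
        }
    }

    // Must be called while holding _gate.
    private void AssignWorkers()
    {
        foreach (BackgroundWorker w in _workers)
        {
            if (!w.IsBusy && _lastSent < _numJobs)
            {
                _lastSent++;
                w.RunWorkerAsync(10); // 10 ms simulated delay
            }
        }
        if (_completed == _numJobs)
        {
            _allDone.Set();
        }
    }

    // Starts the jobs, blocks until all have completed, and returns the completed count.
    public int Run()
    {
        lock (_gate)
        {
            AssignWorkers();
        }
        _allDone.WaitOne();
        return _completed;
    }
}
```

For example, new MiniManager(3, 7).Run() processes seven jobs with three workers. A worker can be restarted from inside its own completion handler because IsBusy is already false by the time RunWorkerCompleted is raised.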
We also link the form through a property of this class, so we can associate the manager with any form we want. We might want to link it to another class instead, but this is the most common case.
Well… with some improvements, we could get a BackgroundManager for all our application needs.
The UI
Last but not least, we link all this to a form. The code is minimal and pretty simple: we add a reference to the Manager, we configure it in the form’s constructor and, on a start button, we call the Manager’s LaunchManagedProcess.
private MTManager LM;

public Form1()
{
    InitializeComponent();
    LM = new MTManager(this, 25);
    LM.InitManager();
}

private void btnStart_Click(object sender, EventArgs e)
{
    btnStart.Enabled = false;
    btnCancel.Enabled = true;
    LM.LaunchManagedProcess();
}

private void btnCancel_Click(object sender, EventArgs e)
{
    LM.StopManagedProcess();
    btnCancel.Enabled = false;
    btnStart.Enabled = true;
}
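The body of StopManagedProcess is not shown in this article. One plausible shape for it, given purely as an assumption, is to loop over the workers and ask each busy one to cancel. The StopSketch class, CancelAll and Demo below are hypothetical names, and the demo uses plain BackgroundWorker instances that spin until they are cancelled.

```csharp
using System.ComponentModel;
using System.Threading;

// Hypothetical sketch of a "stop everything" helper.
public static class StopSketch
{
    // Asks every busy worker to stop; returns how many requests were sent.
    public static int CancelAll(BackgroundWorker[] workers)
    {
        int requested = 0;
        foreach (BackgroundWorker w in workers)
        {
            // CancelAsync throws if WorkerSupportsCancellation is false, so check first.
            if (w.IsBusy && w.WorkerSupportsCancellation)
            {
                w.CancelAsync();
                requested++;
            }
        }
        return requested;
    }

    // Demo: starts `count` workers that wait until cancelled, stops them all,
    // and returns how many completed with e.Cancelled set.
    public static int Demo(int count)
    {
        int cancelled = 0;
        using (CountdownEvent done = new CountdownEvent(count))
        {
            BackgroundWorker[] workers = new BackgroundWorker[count];
            for (int i = 0; i < count; i++)
            {
                BackgroundWorker w = new BackgroundWorker();
                w.WorkerSupportsCancellation = true;
                w.DoWork += (s, e) =>
                {
                    BackgroundWorker self = (BackgroundWorker)s;
                    while (!self.CancellationPending)
                    {
                        Thread.Sleep(10);
                    }
                    e.Cancel = true; // report the cancellation to the completion handler
                };
                w.RunWorkerCompleted += (s, e) =>
                {
                    if (e.Cancelled) Interlocked.Increment(ref cancelled);
                    done.Signal();
                };
                workers[i] = w;
                w.RunWorkerAsync();
            }
            CancelAll(workers);
            done.Wait(); // wait for every worker to acknowledge the request
        }
        return cancelled;
    }
}
```

CancelAsync only sets a flag; the worker stops when its own code next checks CancellationPending. A real manager would also have to stop handing out the remaining jobs, since each completion triggers a new assignment round.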
Trying it!
This is the most fun part: changing how many threads run simultaneously and how many jobs are processed, and then trying it on different CPUs… ah, and of course, changing the calculation method from a CPU-intensive task to a normal task with an operation delay...
I would love to know your results and what you have done with this. Any feedback would be great!!
Exercises For You…
This is not done! It could become a MultiThreadJob Framework if the following were done:
- Implement a Strategy pattern that determines the kind of Worker to produce (with a Factory pattern), so we will be able to do different kinds of jobs inside the same factory… what about migrating a database and processing each table in a different way… or integrating systems with this engine…
- Implement, or extend, the Strategy pattern for determining the treatment of the input data and the result data of the jobs. We could set up a factory too, for getting the classes into an operating environment.
- Optimize the AssignWorkers engine; I am pretty sure it can be improved.
- Improve the WorkerManager class so that it can be attached to another class instead of only to a form.
- Send me the code! I would love to hear from you and what you have done.
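As a starting point for the first exercise, here is one possible sketch of a worker that receives its job as a strategy object instead of switching on a string. Everything in it is an assumption made for illustration: IJobStrategy, DelayJob, SumJob, StrategyWorker and StrategyDemo do not exist in the article's code, and SumJob is just a trivial placeholder for a CPU-bound job.

```csharp
using System.ComponentModel;
using System.Threading;

// Hypothetical interface: one implementation per kind of job.
public interface IJobStrategy
{
    object Execute(BackgroundWorker worker, DoWorkEventArgs e);
}

// Latency-bound job: ten waits, with progress reports and cancellation checks.
public class DelayJob : IJobStrategy
{
    public object Execute(BackgroundWorker worker, DoWorkEventArgs e)
    {
        for (int pct = 10; pct <= 100; pct += 10)
        {
            if (worker.CancellationPending)
            {
                e.Cancel = true;
                return null;
            }
            Thread.Sleep((int)e.Argument);
            worker.ReportProgress(pct);
        }
        return "waited";
    }
}

// Placeholder CPU-bound job: sums the integers 1..n.
public class SumJob : IJobStrategy
{
    public object Execute(BackgroundWorker worker, DoWorkEventArgs e)
    {
        long sum = 0;
        int n = (int)e.Argument;
        for (int i = 1; i <= n; i++)
        {
            sum += i;
        }
        return sum;
    }
}

// The worker no longer knows what the job is; it delegates to the strategy.
public class StrategyWorker : BackgroundWorker
{
    private readonly IJobStrategy _strategy;

    public StrategyWorker(IJobStrategy strategy)
    {
        _strategy = strategy;
        WorkerReportsProgress = true;
        WorkerSupportsCancellation = true;
    }

    protected override void OnDoWork(DoWorkEventArgs e)
    {
        object result = _strategy.Execute(this, e);
        if (!e.Cancel)
        {
            e.Result = result;
        }
    }
}

public static class StrategyDemo
{
    // Runs one job to completion and returns its result (null if cancelled or failed).
    public static object Run(IJobStrategy strategy, int argument)
    {
        object result = null;
        using (ManualResetEvent finished = new ManualResetEvent(false))
        using (StrategyWorker worker = new StrategyWorker(strategy))
        {
            worker.RunWorkerCompleted += (s, e) =>
            {
                if (!e.Cancelled && e.Error == null) result = e.Result;
                finished.Set();
            };
            worker.RunWorkerAsync(argument);
            finished.WaitOne();
        }
        return result;
    }
}
```

With this shape, a manager could pick the strategy per job (for example, one per database table) while the assignment logic stays unchanged.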
History
- 5th November, 2006: Initial post