In this article, I give a brief overview of the TPL in .NET 4, just enough to get you started.
TPL stands for Task Parallel Library. It was introduced in .NET Framework 4.
Today, the devices our code runs on are powerful yet compact, and they come with multiple processors or multiple cores. To harness that power, we need to write code that uses those cores efficiently.
In the .NET world, we have had threads for years. So why do we need TPL at all?
Well, we needed a much simpler programming interface so that multi-threaded code is easier to read. Threads are also not lightweight objects, and we needed a better way of harnessing concurrency than before.
Earlier, you needed in-depth knowledge of both the multi-threading concepts supported by the language and the machine architecture to write code that used the full power of the hardware.
Before TPL, it was really hard to write code that distributed work evenly across processors or cores. Trying to influence how the OS assigned jobs to processors/cores was risky, and things often ended in chaos.
TPL lets developers write such code in a new and much better way, achieving better results on the same hardware with cleaner, simpler code.
So basically, you can think of TPL as a wrapper library over the System.Threading APIs, but with much more intelligence added. TPL has made writing multi-threaded code in .NET considerably easier.
So, let me show you some basic code written with the traditional API versus the TPL API.
Here is a small example: your application needs to check the status of a service running on 5 different machines on your LAN. How would we code it?
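A minimal sketch of the traditional approach might look like the following. It assumes one thread per machine; the machine names and the CheckServiceStatus helper are hypothetical stand-ins (the helper only simulates a network call), not part of any real API.

```csharp
using System;
using System.Threading;

static class ThreadDemo
{
    // Hypothetical stand-in for a real status check; it only simulates the wait.
    public static bool CheckServiceStatus(string machine)
    {
        Thread.Sleep(100); // pretend to wait for a reply from the machine
        return true;
    }

    public static bool[] CheckAll(string[] machines)
    {
        var results = new bool[machines.Length];
        var threads = new Thread[machines.Length];

        for (int i = 0; i < machines.Length; i++)
        {
            int index = i; // copy the loop variable so each thread captures its own value
            threads[index] = new Thread(() => results[index] = CheckServiceStatus(machines[index]));
            threads[index].Start();
        }

        // Block until every thread has finished its check.
        foreach (Thread t in threads)
            t.Join();

        return results;
    }

    static void Main()
    {
        // Made-up machine names, for illustration only.
        string[] machines = { "machine1", "machine2", "machine3", "machine4", "machine5" };
        bool[] results = CheckAll(machines);
        for (int i = 0; i < machines.Length; i++)
            Console.WriteLine("{0}: {1}", machines[i], results[i] ? "running" : "down");
    }
}
```

Note that this creates five dedicated threads regardless of how many cores the machine actually has.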
With this way of coding, it is not clear whether the code is really optimized and uses the cores of your processor efficiently, or whether it is lightweight in terms of threads. To make sure, you would need to write more code to find out how many cores/processors the machine has and which of them are free, and then assign jobs to them. Basically, the developer has to write extra algorithms to exploit all the processors and get the best out of them.
Thankfully, TPL removes this overhead. So how do you code it in TPL, you ask?
To use the TPL APIs, we use the System.Threading.Tasks namespace. This namespace has many APIs; the one we use to solve the above problem is Parallel.For.
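With Parallel.For, a sketch of the same status check could look like this. As before, the machine names and CheckServiceStatus are hypothetical placeholders that only simulate the real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ParallelForDemo
{
    // Hypothetical stand-in for a real status check; it only simulates the wait.
    public static bool CheckServiceStatus(string machine)
    {
        Thread.Sleep(100);
        return true;
    }

    public static bool[] CheckAll(string[] machines)
    {
        var results = new bool[machines.Length];

        // Arguments: inclusive start index, exclusive end index,
        // and the body to run for each index.
        Parallel.For(0, machines.Length, i =>
        {
            results[i] = CheckServiceStatus(machines[i]);
        });

        // Parallel.For returns only after every iteration has finished.
        return results;
    }

    static void Main()
    {
        // Made-up machine names, for illustration only.
        string[] machines = { "machine1", "machine2", "machine3", "machine4", "machine5" };
        bool[] results = CheckAll(machines);
        for (int i = 0; i < machines.Length; i++)
            Console.WriteLine("{0}: {1}", machines[i], results[i] ? "running" : "down");
    }
}
```

Each iteration writes to its own array slot, so no locking is needed in this sketch.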
See how simple the code is. As said above, TPL is smart enough to distribute the work evenly across the cores of a powerful machine, so developers need not worry about it. It also works with lightweight objects and uses the ThreadPool internally, which is more efficient than creating traditional threads.
In the above code, I have passed an anonymous delegate, written as a lambda expression, as the third argument. This is because the Parallel.For API accepts an Action<T> delegate, so the compiler converts the lambda into a delegate in the IL, as shown in the image below:
Note that TPL cannot speed things up if the hardware doesn’t have multiple cores/processors. On a single-core, single-processor machine, the tasks cannot run truly in parallel; they simply take turns on the one core.
The beauty of TPL, unlike the traditional threading API, is that a new thread isn't created each time you create a task. Rather, the Task object is queued to the ThreadPool and picked up by an existing worker thread, which improves overall efficiency. The ThreadPool is smart enough to figure out when it has to add a new worker thread, for example when it detects that all the pooled threads are in use. Everything happens behind the scenes, so TPL lets the developer breathe easy.
If you take a look at the above Parallel.For code, we are not explicitly using any ThreadPool. Instead, the runtime internally uses a task scheduler and pooled threads to get the job done. But I must warn you that the tasks will not run in any particular order, and you cannot control that order. This is because in TPL each Task is an independent unit of work, which the scheduler may run on any available thread.
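To see the ordering caveat in practice, here is a small sketch that records the order in which Parallel.For iterations happen to run. The class and method names are made up for illustration. Every index is processed exactly once, but the recorded order can differ from run to run.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class OrderingDemo
{
    // Records each index as its iteration runs; the queue is safe to use from many threads.
    public static int[] CollectIndexes(int count)
    {
        var seen = new ConcurrentQueue<int>();
        Parallel.For(0, count, i => seen.Enqueue(i));
        return seen.ToArray();
    }

    static void Main()
    {
        int[] order = CollectIndexes(10);

        // The first line may change between runs; the second is always 0..9.
        Console.WriteLine("Execution order: " + string.Join(", ", order));
        Console.WriteLine("Sorted:          " + string.Join(", ", order.OrderBy(x => x)));
    }
}
```

If your work depends on a specific sequence, a plain for loop (or an explicit chain of tasks) is the safer choice.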
As I already said, TPL is a layer on top of what the framework already had rather than a new threading model. Internally, all the existing concepts such as worker threads are still used, but in a much more efficient way.
Okay, so the above code shows a better way of running tasks in a loop. But in day-to-day coding, not every problem looks like that, so Parallel.For alone is not enough. We need to be able to control and handle tasks individually.
So how do we do it, you ask?
We will use the Task class for that. The Task class provides various APIs to get the job done easily and cleanly.
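As a minimal sketch, starting a single task with Task.Factory.StartNew could look like this. The MainClass and MyTask names follow the ones that appear later in this article; the body of MyTask and the StartMyTask helper are assumptions made for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class MainClass
{
    // Assumed body: the real work of the task would go here.
    static void MyTask()
    {
        Console.WriteLine("MyTask is running on a pool thread");
    }

    // Hypothetical helper that starts the task and hands it back to the caller.
    public static Task StartMyTask()
    {
        // StartNew creates the task and schedules it right away.
        Task t = Task.Factory.StartNew(MyTask);
        return t;
    }

    static void Main()
    {
        Task t = StartMyTask();

        // The main thread is free to do other work here.
        Console.WriteLine("Main thread keeps going");

        t.Wait(); // block until the task has finished
        Console.WriteLine("Task status: " + t.Status);
    }
}
```

Holding on to the returned Task object is what lets you wait on it, inspect its status, or combine it with other tasks later.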
As you can see from the above code, it's much simpler and nicer, and all the spaghetti scheduling logic is left to the framework to take care of.
Now if you take a closer look at the above code, we are using Task.Factory. This is because, as already said, Task internally uses ThreadPool threads to get the job done. You might wonder how a task gets started: as soon as you call StartNew() on Task.Factory, the task starts running in the background.
You may ask: can you create a task now and start it later? Yes, you can, although it is not recommended. Just to show you:
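A sketch of the deferred style, using the Task constructor followed by Start(), might look like this. The class and method names are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class DeferredStartDemo
{
    // Returns the task status before Start() and after completion.
    public static TaskStatus[] Run()
    {
        // The constructor only creates the task; nothing runs yet.
        var t = new Task(() => Console.WriteLine("Deferred task is running"));
        TaskStatus before = t.Status;

        t.Start(); // schedule the task now
        t.Wait();  // block until it has finished
        TaskStatus after = t.Status;

        return new[] { before, after };
    }

    static void Main()
    {
        TaskStatus[] statuses = Run();
        Console.WriteLine("Before Start(): " + statuses[0]); // Created
        Console.WriteLine("After Wait():   " + statuses[1]); // RanToCompletion
    }
}
```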
As you can see from the above code, the Task still uses the pool internally, even though no factory is specified explicitly. The tricky part is choosing the TaskScheduler that will run it. TaskScheduler provides two static properties, Current and Default, and the two can behave differently. Hence, it is recommended to use Task.Factory.StartNew rather than this approach. A TPL team engineer's blog covers Task() in more depth.
Now if you remember, with the Thread class of the traditional threading API, you could assign your own name to the threads created in your code. With TPL's Task style, this is not possible, since the threads come from the pool. Instead, each task gets an ID that is unique to it. You can read the Task.CurrentId value from inside the running task.
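Here is a small sketch of reading the ID, assuming a made-up TaskIdDemo class. Task.CurrentId is a static, nullable property: it holds the ID of the task that is currently executing, and it is null when the calling code is not running inside a task.

```csharp
using System;
using System.Threading.Tasks;

static class TaskIdDemo
{
    // Returns true when the ID seen inside the task equals the task's own Id.
    public static bool IdsMatch()
    {
        int? idSeenInside = null;

        Task t = Task.Factory.StartNew(() =>
        {
            // Inside the task body, CurrentId identifies this very task.
            idSeenInside = Task.CurrentId;
        });
        t.Wait();

        return idSeenInside == t.Id;
    }

    static void Main()
    {
        // Main is not running inside a task, so CurrentId has no value here.
        Console.WriteLine("CurrentId has a value outside a task: " + Task.CurrentId.HasValue);
        Console.WriteLine("ID inside the task matches t.Id: " + IdsMatch());
    }
}
```

The ID is handy for logging which task produced a given message.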
Now let's get under the hood and see what's really happening at the IL level. The compiler converts the same Task line to Task t = Task.Factory.StartNew(new Action(MainClass.MyTask));.
Disassembled, the code looks like the image:
Do not worry if you don’t understand anything in the above image. It is Intermediate Language (IL), which every .NET language compiles its high-level code into.
As you can see in line 15, Task.Factory is a property, so the compiler first calls the get_Factory() method. Why? Because at the IL level all properties, i.e., their set and get accessors, are converted to equivalent methods. At lines 17 and 18, it loads the MyTask() method via an Action delegate. Then, in line 19, it calls the StartNew method with this delegate to start the task.
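The same three steps can be spelled out in C# source, one statement per step. This is only an illustrative sketch of the sequence, not output produced by the compiler; the class name and the body of MyTask are assumptions.

```csharp
using System;
using System.Threading.Tasks;

static class ExplicitStartDemo
{
    // Assumed body, standing in for the real work.
    static void MyTask()
    {
        Console.WriteLine("MyTask is running");
    }

    public static Task StartExplicitly()
    {
        // Step 1: reading the Factory property is a call to its get accessor.
        TaskFactory factory = Task.Factory;

        // Step 2: wrap the MyTask method in an Action delegate.
        Action work = new Action(MyTask);

        // Step 3: pass the delegate to StartNew, which starts the task.
        return factory.StartNew(work);
    }

    static void Main()
    {
        StartExplicitly().Wait();
    }
}
```

Writing Task.Factory.StartNew(MyTask) on a single line is simply the compact form of these steps.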
Well, that’s all for now, friends. There’s no use going into more depth here, since it would repeat what others have already written on the web. To dig deeper into TPL, you can read an excellent article by Sacha Barber here.