Introduction
Microsoft recently released the Visual Studio 2010 CTP. One of the most interesting features in this release, apart from Dynamic C#, was the Task Parallel Library. TPL and PFX (Parallel Framework) are also available as a CTP for VS2008 and .NET Framework 3.5. The Task Parallel Library is designed to target multi-core and multi-processor systems, letting code exploit the advanced hardware available without the developer having to manage it explicitly. The article below discusses the APIs available in PFX and explores their performance aspects.
Concurrent Programming and Parallel Programming
Concurrent and Parallel Programming are very old buzzwords, and the two are closely related. Concurrent Programming is the simultaneous execution of multiple interacting computational tasks, implemented as separate programs or as a set of processes or threads created by a single program. These tasks can execute on a single processor, on multiple processors, or on a distributed network. Parallel Programming is breaking a large problem into smaller ones and executing each chunk simultaneously. Parallel computing comes in different forms, such as bit-level, instruction-level, data, and task parallelism. Of these, the two that are important in the context of the Task Parallel Library are Data Parallelism and Task Parallelism.
Data Parallelism
In data parallelism, the focus is on distributing data across different parallel computing nodes. On a multiprocessor system executing a single set of instructions, data parallelism is achieved by having each processor perform the same task on a different piece of the distributed data. A single thread may control this, or there may be multiple threads; either way, the processing applied to the data is the same for each thread.
Task Parallelism
Task parallelism is the execution of a different thread by each processor, on the same or different data, and the threads may execute the same or different code. As a simple example, on a two-processor machine with two tasks to do, CPU ‘a’ does task ‘A’ while CPU ‘b’ does task ‘B’.
Task parallelism emphasizes the distributed (parallelized) nature of the processing (i.e., threads), as opposed to the data (data parallelism).
Overview of Parallel Extensions
Parallel Extensions is a set of APIs provided by Microsoft, mainly designed to take advantage of systems with multiple cores or multiple processors using data parallelism, task parallelism, and coordination on the parallel hardware.
It lets developers take advantage of parallel hardware without worrying about scalability and performance. The best thing about it is that, as the number of processors increases, the code does not need to be rewritten.
The extensions mainly target three areas of Concurrent and Parallel Programming:
- Declarative Data Parallelism
Declarative means stating what you want done with the data rather than how you want it done. LINQ introduced a new set of APIs based on this declarative programming model; PLINQ extends it with a new set of APIs for Declarative Data Parallelism.
- Imperative Data Parallelism
Parallel Extensions provides an API for imperative data operations such as loops. It introduces methods like Parallel.For and Parallel.ForEach that automatically divide the work in a loop to run on parallel hardware.
- Imperative Task Parallelism
Parallel Extensions introduces the Task Parallel Library, an API for expressing parallelism in the form of tasks rather than data. The extension schedules the tasks to run on parallel hardware, and provides us with ways to wait on or cancel them.
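As a rough sketch of both imperative models, the console program below uses the CTP-era APIs the article describes. Note that the exact names are assumptions based on the June 2008 CTP: Parallel.For lives in System.Threading there, and tasks are created with the Task.Create factory, both of which were renamed in later releases.

```csharp
using System;
using System.Threading;        // Parallel lives here in the CTP
using System.Threading.Tasks;  // Task lives here in the CTP

class ImperativeSamples
{
    static void Main()
    {
        int[] data = new int[100];

        // Imperative data parallelism: Parallel.For divides the
        // iteration range among the available processors automatically.
        Parallel.For(0, data.Length, i =>
        {
            data[i] = i * i;
        });

        // Imperative task parallelism: express the work as tasks and let
        // the scheduler place them on parallel hardware. Task.Create is
        // the CTP-era factory; we can wait on the tasks before continuing.
        Task t1 = Task.Create(delegate { Console.WriteLine("Task A"); });
        Task t2 = Task.Create(delegate { Console.WriteLine("Task B"); });
        Task.WaitAll(t1, t2);
    }
}
```

The loop body and the two task delegates here are placeholders; the point is only how the work is handed to the extensions rather than to raw threads.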
The extensions include the Task Parallel Library and PLINQ, both built on top of a scheduler. This scheduler is a robust, efficient, and scalable engine that uses cooperative scheduling and work-stealing algorithms to achieve fast, efficient, and maximal CPU utilization. We can also write our own extensions on top of this scheduler.
Declarative Data Parallelism (PLINQ)
Declarative Data Parallelism is done using Parallel LINQ. PLINQ is an implementation of LINQ to Objects that executes queries in parallel, scaling to utilize the number of processors available. Executing queries with PLINQ is similar to executing queries with LINQ, and PLINQ supports all the query operators supported by System.Linq.Enumerable.
However, LINQ to SQL and LINQ to Entities are not supported, because those queries are executed by the database rather than by the query operators.
Converting your existing LINQ code to PLINQ is very easy. You need to add a reference to the System.Threading assembly. The next thing you need is to wrap your data source in the IParallelEnumerable<T> interface using a call to the AsParallel extension method.
LINQ:
var q = from file in fileList
        where file.Length > 10000
        select file;
PLINQ:
var q = from file in fileList.AsParallel()
        where file.Length > 10000
        select file;
Using the AsParallel method makes sure that the compiler binds the query to ParallelEnumerable<T> rather than Enumerable<T>. However, just like LINQ, PLINQ has deferred execution, i.e., the query does not get evaluated until a foreach loop iterates over it, GetEnumerator is called, or one of the ToList, ToArray, or ToDictionary methods is called. This is a very important thing to take note of from the performance point of view.
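Deferred execution can be seen in a small sketch like the following, which reuses the file-list query from above. The directory path and the use of FileInfo here are illustrative assumptions, not part of the original sample.

```csharp
using System;
using System.IO;
using System.Linq;

class DeferredExecutionDemo
{
    static void Main()
    {
        FileInfo[] fileList = new DirectoryInfo(@"C:\Windows").GetFiles();

        // Defining the query does not execute it; AsParallel only
        // marks the source for parallel execution.
        var q = from file in fileList.AsParallel()
                where file.Length > 10000
                select file;

        // The query actually runs here, when ToList materializes it.
        var results = q.ToList();
        Console.WriteLine(results.Count);
    }
}
```

Any timing comparison between LINQ and PLINQ must therefore measure the materializing call, not the line that merely defines the query.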
So far, wrapping a data source in the IParallelEnumerable interface only ensures that, when execution takes place, it uses a data-parallel algorithm. How you process the output is another matter entirely. When a data source is wrapped with IParallelEnumerable<T>, the resulting collection is also of the IParallelEnumerable<T> type. PLINQ offers the following three ways to process the output:
- Pipeline Processing Model
As the name implies, a thread is added for enumerating the query output in addition to the threads processing the data. In simple words, there is one dedicated thread for enumeration along with several threads executing the query.
Pipeline processing is the default output-processing model, so nothing needs to be done differently from LINQ:
foreach(var e in q)
{
Console.WriteLine(e);
}
One thread manages the foreach whereas the others execute the query. The foreach loop uses System.Collections.Generic.IEnumerator<T> for iterating. Hence, when MoveNext() is called, a set of worker threads executes the query, and results are returned from this and all the subsequent calls to MoveNext(). IParallelEnumerable<T> also has its own GetEnumerator(bool usePipelining) method.
So, an explicit way of processing the output would be:
using (var e = fileGroup.GetEnumerator(true))
{
    while (e.MoveNext())
    {
        Console.WriteLine(e.Current);
    }
}
Passing false instead selects Stop and Get processing. Pipeline processing has the benefit of reducing the memory required to hold the results. However, it has the disadvantage that there are many producer threads but only one consumer thread, which can cause an uneven distribution of work, leading to processor inefficiency.
- Stop and Get Processing Model
In Stop and Get processing, as the name implies, the thread that starts the enumeration stops and joins the other threads executing the query. Once the complete results are obtained, it resumes iterating over them. To use this model, we pass false to the GetEnumerator method of IParallelEnumerable<T>.
using (var e = fileGroup.GetEnumerator(false))
{
    while (e.MoveNext())
    {
        Console.WriteLine(e.Current);
    }
}
When MoveNext() is called for the first time, the thread that started the enumeration joins the other threads and finishes executing the query; i.e., the query finishes executing on the first call to MoveNext(), and all subsequent calls simply iterate through the results held in an in-memory collection.
This model is slightly more efficient than the pipeline model for two reasons. First, there is less synchronization overhead between the producer and consumer threads. Second, PLINQ knows better where the data is going and how it will be accessed. ToArray, ToList, and sorting are, by default, based on the Stop and Get model.
- Inverted Enumeration Model
In this model, a lambda function is provided to PLINQ stating what processing needs to be done, and this function is passed to all the threads executing the query. IParallelEnumerable<T> exposes a ForAll method for this. Unlike the Pipeline and Stop and Get models, it does not collect results and then process them; instead, the final function is passed to each worker thread via the ForAll extension method. This is the fastest model, but it requires that the function passed to ForAll be thread-safe and preferably free of locks and side effects.
fileGroup.ForAll(q => Console.WriteLine(q));
Concurrent Exceptions
Exception handling in PLINQ is different from exception handling in LINQ. In LINQ, queries are executed sequentially, so when an exception occurs, execution is aborted. In PLINQ, however, queries are executed in parallel, so PLINQ handles exceptions differently. For example:
string[] arrStr = new string[]{ "1", "2", "3", "v",
                                "1", "2", "3", "'",
                                "1", "2", "3", "12222222222" };
var query = from a in arrStr.AsParallel()
            select int.Parse(a);
query.ForAll(q => Console.WriteLine(q));
Every time we run this query, we will get an exception. However, since the PLINQ query executes on multiple threads, we might get different exceptions each time. I say “might” because it depends on how PLINQ has partitioned the work and when the exception occurred. The example above contains two parsing errors ("v" and "'") and one value-too-large error ("12222222222"). Since PLINQ runs multiple threads, I might get only the two format errors, or one format error and the overflow error, or all three. Behind the scenes, when an exception occurs on any thread, PLINQ tries to stop all the other threads as quickly as possible, which may or may not happen in time. Once all the threads have stopped, PLINQ aggregates all the exceptions into a new System.Threading.AggregateException and throws it. The original exceptions are available as inner exceptions, along with their stack traces, so we can examine them and do the proper handling.
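A minimal sketch of that handling pattern is shown below, catching the aggregated exception and walking its inner exceptions. Note that the System.Threading namespace for AggregateException and its InnerExceptions property follow the CTP described in this article; the type moved to the System namespace in later releases.

```csharp
using System;
using System.Linq;
using System.Threading; // AggregateException lives here in the CTP

class PlinqExceptionDemo
{
    static void Main()
    {
        // "v" will fail int.Parse on one of the worker threads.
        string[] arrStr = { "1", "2", "3", "v" };

        var query = from a in arrStr.AsParallel()
                    select int.Parse(a);

        try
        {
            query.ForAll(q => Console.WriteLine(q));
        }
        catch (AggregateException ae)
        {
            // Whatever failed on the worker threads is collected here,
            // each with its original stack trace preserved.
            foreach (Exception inner in ae.InnerExceptions)
            {
                Console.WriteLine(inner.Message);
            }
        }
    }
}
```

The catch block is where per-exception handling belongs; swallowing the AggregateException without inspecting InnerExceptions would hide which of the parallel failures actually occurred.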
In my next article, I will discuss Imperative Data Parallelism and Imperative Task Parallelism.
Using the Sample Application
I have attached a sample application that compares the execution times of LINQ and PLINQ using the different output-processing models. You can edit the query by taking a look at the FileInformation class and playing around with it. It’s not too robust, but it can easily compare LINQ and PLINQ.
Further Reading