1) A thread pool is just a convenient thread manager. It grows and shrinks the number of threads it uses based on load, and it re-uses threads to avoid some thread-creation overhead. It also keeps idle threads alive to further improve performance for repeated quick-running tasks. The maximum number of threads is set on the
object. The absolute limit depends on the hardware you're running on, mostly on the stack space available. For most 32-bit systems you could probably squeeze out around 2,000.
2) Each process has a default thread pool. You can make more if you need pools with different attributes. No situation immediately comes to mind when this would be useful though. In fact
in C# because unless you know how to write your own custom thread pool you probably shouldn't be using more. If you need more idle threads you can set that on the
3-4) The global queue is for the process. Each worker thread has its own local queue. For example, the process queues up Task1 in the global queue, and WorkerThread1 picks it up and executes it. Task1 then spawns Task2, which goes into WorkerThread1's local queue. When WorkerThread1 finishes its current task, it pulls the most recently added task (LIFO) out of its local queue, because that task's data has the highest chance of still being in that worker's cache. Only when its local queue is empty does the worker thread check the global queue.
5) The process described above isn't work-stealing. That's just normal operation. Work-stealing is when a worker thread, after finding both its local queue and the global queue empty, checks other worker threads' local queues. It takes tasks from the end of the list first (FIFO-style, oldest task first), which reduces contention with the worker thread that owns that local queue, since the owner pulls from the front of the list (LIFO-style). If cache optimization actually comes into play, this also means the stolen tasks are the ones least likely to still be cached by the thread that owns the queue.
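If it helps to see the lookup order from 3-5 in one place, here's a toy, single-threaded sketch in Python. To be clear, this is not how the .NET thread pool is implemented: `ToyPool`, `Worker`, and `next_task` are names I made up for illustration, and there is no real threading or locking here. It only models which queue a worker looks at, and from which end.

```python
from collections import deque


class Worker:
    """Hypothetical worker: just a name plus a local queue."""

    def __init__(self, name):
        self.name = name
        # The owner pushes and pops at the right end (LIFO).
        self.local = deque()


class ToyPool:
    """Hypothetical pool: one global queue, several workers."""

    def __init__(self, worker_names):
        self.global_queue = deque()  # consumed oldest-first (FIFO)
        self.workers = [Worker(n) for n in worker_names]

    def next_task(self, worker):
        """Return (task, source) for `worker`, or (None, None) if idle."""
        # 1. Own local queue, newest task first (LIFO).
        if worker.local:
            return worker.local.pop(), "local"
        # 2. Global queue, oldest task first (FIFO).
        if self.global_queue:
            return self.global_queue.popleft(), "global"
        # 3. Work-stealing: take the OLDEST task from another worker's
        #    local queue, i.e. the opposite end from the one its owner uses.
        for other in self.workers:
            if other is not worker and other.local:
                return other.local.popleft(), "stolen from " + other.name
        return None, None


pool = ToyPool(["W1", "W2"])
w1, w2 = pool.workers

pool.global_queue.append("Task1")   # the process queues Task1 globally
first = pool.next_task(w1)          # W1 picks it up from the global queue

w1.local.append("Task2")            # Task1 spawns Task2 and Task3,
w1.local.append("Task3")            # which land in W1's local queue

stolen = pool.next_task(w2)         # W2 has nothing, so it steals from W1
own = pool.next_task(w1)            # W1 takes its newest local task
idle = pool.next_task(w1)           # nothing left anywhere
```

In this run W2 ends up stealing Task2 (the oldest item in W1's local queue) while W1 keeps Task3 (the newest), which is the "opposite ends" behaviour described in 5.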
Other notes: One of the big advantages of the TPL is that you don't need to know all this stuff. That being said, I can understand curiosity - that's the only reason I know this stuff, haha =)
EDIT: I forgot to answer one of your questions! The
class is what controls the global queue, local queues, and enables work-stealing.
Your picture would more accurately be:
| Process | -> Queues -> | ThreadPool |
| Global Queue | <- ThreadPool has one
| Worker | <- ThreadPool can have multiple
|| Local Queue|| <- Each Worker has one