Here is my opinion: threading is a brutally difficult topic, and unfortunately I have to agree with Sergey that you don't get it yet. But don't be disappointed: it takes years even for the best programmers to grasp it, and some (most) programmers never do (and depending on their specialty and the language/paradigm of their choice, they may never need to know about threading at all...). As if threading weren't hard enough on its own, you will find a lot of material/tutorials on it, and 99% of them are terrible, ugly, and full of antipatterns.
- A well written multithreaded program always joins the threads it creates, and usually a thread is created and joined by the same "owner" thread (typically the main thread). It is totally pointless to create a thread on one thread and then join it on another. It's better to leave thread lifecycle management to a "manager" thread (usually the main thread), or to hide it inside a thread pool object.
- Most programs need only a few threads, and these threads can be created at program startup and joined/destroyed at program exit.
- Every multithreaded program can be written this way; if you need detached threads, then your threading design is not good. Post some multithreaded problems and people will help you put together a good design/architecture to solve them. If someone uses thread manipulation functions other than "create" and "join", then their design is also bad.
BTW, if you are using a new compiler then you can use the threading classes from the standard library instead of Boost. Unfortunately the standard library also contains totally unnecessary threading functions/operations that confuse people and encourage antipatterns.
EDIT: About your actual problem:
You are creating one thread per client and you simply don't know when to join the threads... Actually you should neither create nor join those threads yourself... The following statement will be a very big game changer here:
You shouldn't have more **active**/performance-critical threads than the number of cores in your CPU. Even if your client threads are not CPU-grinding ones, you MUST somehow limit the number of clients served in parallel to protect your server from blowing up from lack of resources (for example by running out of socket handles, or out of memory to create new threads even if they are idle). How to do this???
Just create a thread pool with a fixed number of threads at program startup and stop/destroy it at program exit. The number of threads can usually be tweaked at thread pool creation. A thread pool consists of several threads and a common job queue. Every thread does the following in a loop: it pulls a job out of the queue and executes it; if the queue is empty, it waits until a job arrives. (Offtopic: I usually ask the thread pool threads to terminate by putting NUM_THREADS NULL jobs into the queue...)
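The loop described above can be sketched in C++. This is a minimal sketch, not a production pool: the names (ThreadPool, AddJob, StopAndJoin) are assumptions chosen for this illustration, an empty std::function plays the role of the NULL job, and there is no error handling.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    using Job = std::function<void()>;  // an empty Job plays the role of the NULL job

    explicit ThreadPool(std::size_t num_threads) {
        for (std::size_t i = 0; i < num_threads; ++i)
            workers_.emplace_back([this] { WorkerLoop(); });
    }

    void AddJob(Job job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(job));
        }
        cv_.notify_one();
    }

    // Push one NULL job per worker so each worker terminates, then join them all.
    void StopAndJoin() {
        for (std::size_t i = 0; i < workers_.size(); ++i)
            AddJob(Job());
        for (std::thread& t : workers_)
            t.join();
        workers_.clear();
    }

private:
    void WorkerLoop() {
        for (;;) {
            Job job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return !queue_.empty(); });
                job = std::move(queue_.front());
                queue_.pop();
            }
            if (!job) return;  // NULL job: this worker exits its loop
            job();             // execute the submitted job
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<Job> queue_;
    std::vector<std::thread> workers_;
};
```

Because the queue is FIFO, the NULL jobs pushed by StopAndJoin() are only reached after every real job already in the queue has been executed.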
OK, a thread pool solves how/when to create/join the threads, but how can we utilize it for your problem? We define our job as a "ServeClient" job. You create this serve-client job on your accept (maybe the main) thread and pass all parameters of the client (e.g. socket, remote address, ...) to the constructor of this job. Then all you do on the main thread is put this "ServeClient" job into the queue of the thread pool. At some later point a pool thread will grab this "ServeClient" job from the queue and call its "Execute" method. In this Execute method you can perform everything that has to be done for this client. When the client has been served, as a last step the Execute method can destroy the job object, and the thread can return from Execute to grab the next ServeClient job from the queue.
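A rough sketch of that job pattern: the names Job, Execute, and ServeClientJob are made up for this illustration, and the served_counter parameter exists only so the sketch has something observable; a real job would carry a real socket handle and do real IO.

```cpp
#include <string>
#include <utility>

// Base class for anything that can be placed into the thread pool queue.
struct Job {
    virtual ~Job() = default;
    virtual void Execute() = 0;  // called by a pool thread
};

// One job per accepted connection; all client parameters go into the constructor.
class ServeClientJob : public Job {
public:
    ServeClientJob(int socket, std::string remote_addr, int* served_counter)
        : socket_(socket),
          remote_addr_(std::move(remote_addr)),
          served_counter_(served_counter) {}

    void Execute() override {
        // ... read the request from socket_, run the server-side logic,
        //     write the response back to the client ...
        ++*served_counter_;  // stand-in for the real work in this sketch
        delete this;         // last step: the job destroys itself
    }

private:
    int socket_;              // placeholder for a real socket handle
    std::string remote_addr_;
    int* served_counter_;     // illustration only: lets us observe Execute()
};
```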
- OK, but what if serving a client takes a long time while other ServeClient jobs are waiting in the queue and maybe timing out? Then you can create more threads.
- What if I have even more clients than the number of threads??? You have to draw the line somewhere... You cannot accept/serve more than MAX_CLIENTS clients if you don't want your server to blow up as a result of too many connections/threads/resources. By the way, the multithreaded one-thread-per-client approach doesn't really work effectively beyond a few hundred clients; with thousands of clients you have to perform the network IO asynchronously (IOCP/epoll/kqueue) on one or a few threads (async IO doesn't need many threads, and the OS does the necessary threading for you in a way optimized for the current platform), and when a request has fully arrived via async IO, you perform only the actual server-side logic on a thread pool, in order to avoid blocking the async IO thread... There are countless setups for different scenarios, and it is impossible to list all of them. Start writing servers, and after a few dozen servers you will start to grasp what I'm talking about.
For now limit the max number of clients, for example to 100. If you want to be able to serve 100 clients in parallel, then create a thread pool with a max thread count of 100. A normal, nice, healthy thread pool has only the following operations available:

- Create(max_threads)
- AddJob(job)
- StopAndJoin()
Any more operations are redundant. Don't use other thread pool operations; everything can be solved with these, and this thread pool interface can be implemented even on the dumbest platform that supports threads in any way. If you need any other operations, it simply means your design is bad. The same is true for thread operations: if you need anything other than create or join, you are doing something wrong. You will see a lot of redundant, ugly/evil functions in threading APIs (like std::thread/pthread implementations/winapi), for example TerminateThread, "trylock", ... All of these are totally unnecessary and sometimes very dangerous calls in an application, and they are evil because they make attractive some antipatterns and solutions that look good at first glance, maybe even at second glance for the inexperienced. If you can't solve a problem with the basic threading operations, then ask for a solution to your problem on a forum. Normally you should be able to solve any problem just with thread pools and by placing jobs into them. In the worst case you need 2 thread pools: one that performs short-running jobs on at most as many threads as the number of cores, and another thread pool that runs long-running operations; optionally this second pool can be a special one that creates a thread only when you add a long-running job into it...
In my opinion you will never need to increase the number of active threads in a pool; that is simply useless. You can, however, temporarily decrease the number of active threads by N by putting N blocking jobs into the queue of the pool; what these special jobs do is block the thread until you set an event in the job...
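A sketch of such a blocking job: BlockingJob and Release are made-up names for this illustration, and the "event" is modeled with a mutex, a flag, and a condition variable.

```cpp
#include <condition_variable>
#include <mutex>

// A job that parks the pool thread executing it until someone calls Release().
class BlockingJob {
public:
    // Run by a pool thread: blocks that thread until the event is set.
    void Execute() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return released_; });
    }

    // Called from any other thread to give the pool thread back.
    void Release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            released_ = true;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool released_ = false;
};
```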
What to do if you don't want to pre-create 100 threads for 100 clients? Then you can still write a special thread pool that creates a thread only when you call AddJob(), but in my opinion this is not a good idea. It's usually better to pre-create the threads at startup and keep them until program exit. Thread creation is an expensive operation (especially if you don't limit the stack sizes) and is usually impossible to handle gracefully if it fails in the middle of program (for example server program) execution. It's better to fail either at program startup or never...
I have 100 threads and I'm just placing ServeClient jobs into the thread pool queue as connections arrive... How do I limit the number of accepted clients if I know that serving a client takes a long time on a thread, so there is no point in putting any more ServeClient jobs into the queue?
In this case you can manage a counter from the destructor of the ServeClient jobs and from your client acceptor (main) thread, using atomic operations (InterlockedIncrement/Decrement or std::atomic stuff). In the destructor of your ServeClient job you atomically decrement this counter; on your main thread you atomically increment it. An atomic increment operation gives you back the result of the increment. On your main thread, when you accept a connection, you increment this counter, and if the result is >100 then you just close the socket (maybe sending a "too many clients" message if you are doing it gracefully), atomically decrement the counter back, and accept the next client. (With a slightly more complex solution you can also sleep while you have 100 active clients...) If your main thread increments the counter and the result is <=100, then just create the ServeClient job for the connection and put it into the queue.
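The counter logic above, sketched with std::atomic. MAX_CLIENTS and the function names are assumptions for illustration; note that std::atomic's fetch_add returns the OLD value, so the result of the increment is the return value plus one (unlike InterlockedIncrement, which returns the new value directly).

```cpp
#include <atomic>

constexpr int MAX_CLIENTS = 100;
std::atomic<int> g_active_clients{0};

// Accept (main) thread: returns true if the new connection may be served,
// false if it must be rejected (close the socket, optionally after a
// "too many clients" message).
bool TryAdmitClient() {
    int result = g_active_clients.fetch_add(1) + 1;  // result of the increment
    if (result > MAX_CLIENTS) {
        g_active_clients.fetch_sub(1);  // decrement back: over the limit
        return false;
    }
    return true;
}

// Called from the ServeClient job's destructor: one client slot freed.
void ClientFinished() {
    g_active_clients.fetch_sub(1);
}
```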
Do you get it now?
EDIT #2: Detached threads: Using detached threads the way you are using them (creating them and letting them go into the wild without supervision...) is definitely a wrong solution. I can still tell you a case where I use them. I mentioned that sometimes I use a special thread pool that creates a thread for a job exactly when you add the job to the pool; the new thread executes the job and then terminates/disappears by itself. Even so, when you try to terminate/delete the thread pool itself, you must be able to join/wait for all currently running threads (in the StopAndJoinAllThreads() method of the thread pool).
Note: If you create a joinable posix thread (and std::thread has this same rule) then you MUST join it exactly once. If you don't, you leak a small piece of memory that holds the exit code (and maybe some other info) of the thread. In a thread pool where the thread terminates by itself after executing the job, the thread can't simply join itself to avoid this leak, and you don't have another thread in the pool that could do this thread-handle cleanup (OK, some dumb thread pool implementations have a so-called "manager thread"; avoid such dumb implementations). For this reason it is better to create this thread as detached from the start (with the pthread API you can create it detached immediately; with std::thread and winapi you detach/close the handle right after thread creation).

OK, but now how do we handle the case when the main thread wants to delete the thread pool before program exit? It must somehow join the still-running threads in this special thread pool! For this you can keep an atomic counter in the thread pool, initialized to zero, along with a triggerable event that is initially unset. You must also make sure that when you exit your main thread, no live threads are still using the thread pool when you are about to call StopAndJoinAllThreads() on it.
What your threadpool.StopAndJoin() does is the following:
- AtomicDecrement() the counter; if the result is -1, then there are no running threads.
- If the result of AtomicDecrement() is zero or more, then at least one thread is still running. In this case you have to wait for the event to be fired.
Every time a new job is added you have to do the following:
- AtomicIncrement() the counter, create the thread and give it the job.
On thread exit the thread should do the following before actually returning from its low-level thread function:
- AtomicDecrement() the counter; if the result is -1, it means that threadpool.StopAndJoin() is waiting and this is the last terminating thread, so set the event object on which StopAndJoin() is waiting.
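The three steps above can be sketched together as follows. All names here are assumptions for illustration; the "event" is modeled with a mutex, a flag, and a condition variable, and AtomicIncrement/AtomicDecrement map to std::atomic fetch_add/fetch_sub (which return the old value, so the new value is the return value minus or plus one).

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class DetachedThreadPool {
public:
    // Step "job added": increment the counter, then spawn a detached thread.
    void AddJob(std::function<void()> job) {
        counter_.fetch_add(1);
        std::thread([this, job = std::move(job)] {
            job();
            OnThreadExit();
        }).detach();
    }

    // Step "StopAndJoin": decrement; -1 means no thread was running,
    // zero or more means we must wait for the last thread to fire the event.
    void StopAndJoin() {
        if (counter_.fetch_sub(1) - 1 >= 0) {
            std::unique_lock<std::mutex> lock(mutex_);
            cv_.wait(lock, [this] { return done_; });
        }
    }

private:
    // Step "thread exit": decrement; -1 means StopAndJoin() is already
    // waiting and we are the last thread, so fire the event.
    void OnThreadExit() {
        if (counter_.fetch_sub(1) - 1 == -1) {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
            cv_.notify_one();
        }
    }

    std::atomic<int> counter_{0};
    std::mutex mutex_;
    std::condition_variable cv_;
    bool done_ = false;
};
```

The counter starts at zero rather than at the thread count, which is what makes -1 the unambiguous "last one out" value for both StopAndJoin() and the exiting threads.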
As you can see, I used a detached thread here, but I couldn't tell you another scenario where I actually utilize detached threads, and even in this case they are hidden inside a thread pool, and all of them are guaranteed to be joined before thread pool destruction.
Some other bad news came to my mind about dumb thread pools with manager threads: the thread pools that come with a manager thread usually make use of it not for handle cleanup, as in the previous problem, but because they are trying to create an overcomplicated, hyper-super-all-in-one general thread pool that can execute both short- and long-running jobs. In this case you have a problem: if a user adds a lot of long-running jobs, they can exhaust the thread count with long-running jobs, and this causes some short-running jobs in the queue to starve/time out. For this reason the super-intelligent manager thread detects this (actually, it detects that no thread has finished in the last X milliseconds) and spawns some new threads. Why is this bad? As I said, a thread pool has to know only these: Create(maxthreads), AddJob(job), StopAndJoin(). Thread programming is complicated enough even with this little functionality; putting a lot of super-intelligent AI code with manager threads into a thread pool increases complexity a lot, and almost all of these implementations suffer from race conditions and bugs that will be found by no one, because those who know what thread programming is will never look at those implementations, and those who choose them do so precisely because they don't know much about multithreading. OK, how to solve the previous problem elegantly? As I told you, in such a scenario you have to use two pools: one for short tasks (with at most numcores threads, because short-task pools usually grind the cores) and another thread pool for long tasks (in special cases you can have even more pools if you have a good reason). You don't need artificial intelligence to decide when to create a thread. You can decide which jobs are short and which are long while you are writing your program. If you can't, then: 1.) don't write threaded code; 2.) learn multithreading before writing multithreaded code in a production env.