Hello, good people of CodeProject,
I have a .NET client-server application running with a few hundred clients. The project was migrated from VB6 to .NET about a year ago, and it's a platform for card/board games. I'll give as much detail as I can below, but in short the problem is that a channel freezes when there are 40-70 players inside.
1. Server (.NET 4.0)
* Divided into three projects: ServerNET, Listener, Channel
* Listener acts as a login server that clients connect to first. It is responsible for checks such as version and account info, and it lets the client choose which channel to connect to. It's basically a TcpListener in a do-while loop, listening forever for anyone trying to connect. It is not the reason both sides freeze.
* Channel represents a single port; clients connect to a Channel once they are done with Listener. This is the main part of the system. Similar to an mIRC channel, it binds together all users inside it, and most data is sent to people within the same channel, such as chat and the games you can join (created by other players, hosted by the server). This is a console application and serves as a hub for players. Player info is held in a "Client" class, which includes a TcpClient and some other properties. Each client runs on its own thread and makes async calls, which are handled by the server. These "Client" objects are held in a collection class named "ClientCollection". The Channel freezes when there are roughly 40-70 players inside. At most 100 players are permitted per channel.
* ServerNET is the body and does all the other general work related to the whole system rather than to a specific channel. This is a Windows Forms application and handles things like server options.
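To make the Channel structure described above a bit more concrete, here is a rough sketch of it. This is not our actual code (that is VB.NET and I can't post it); it is written in Java purely for illustration. `synchronized` stands in for SyncLock, plain streams stand in for the TcpClient's network stream, and everything except the names Client and ClientCollection is hypothetical. In particular, holding the lock for the whole broadcast loop is an assumption made for the sketch.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Stand-in for our "Client" class: one player plus its connection.
class Client {
    final String name;
    private final BufferedReader in;  // stands in for reading from the TcpClient
    private final OutputStream out;   // stands in for writing to the TcpClient

    Client(String name, InputStream in, OutputStream out) {
        this.name = name;
        this.in = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        this.out = out;
    }

    String readLine() throws IOException { return in.readLine(); }

    void send(String message) throws IOException {
        out.write((message + "\n").getBytes(StandardCharsets.UTF_8));
        out.flush();
    }
}

// Stand-in for "ClientCollection": every member takes the same lock.
class ClientCollection {
    static final int MAX_PLAYERS = 100; // per-channel limit
    private final List<Client> clients = new ArrayList<>();

    synchronized boolean add(Client c) {
        return clients.size() < MAX_PLAYERS && clients.add(c);
    }

    synchronized void remove(Client c) { clients.remove(c); }

    // Assumption for this sketch: the whole send loop runs while the lock is held.
    synchronized void broadcast(String message) {
        for (Client c : clients) {
            try { c.send(message); } catch (IOException e) { /* real code would drop c */ }
        }
    }
}

class Channel {
    private final ClientCollection clients = new ClientCollection();

    // Registers the client and starts its dedicated thread (null if the channel is full).
    Thread join(Client c) {
        if (!clients.add(c)) return null;
        Thread t = new Thread(() -> serve(c), "client-" + c.name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Per-client loop: relay every line the player sends to the whole channel.
    private void serve(Client c) {
        try {
            String line;
            while ((line = c.readLine()) != null) clients.broadcast(c.name + ": " + line);
        } catch (IOException e) {
            // treated as a disconnect
        } finally {
            clients.remove(c);
        }
    }
}

class ChannelSketch {
    // Bob idles in the channel while Alice says "hi"; returns what Bob received.
    static String demo() {
        try {
            Channel channel = new Channel();
            PipedOutputStream bobKeyboard = new PipedOutputStream();
            ByteArrayOutputStream bobScreen = new ByteArrayOutputStream();
            Thread bob = channel.join(
                new Client("bob", new PipedInputStream(bobKeyboard), bobScreen));
            byte[] aliceInput = "hi\n".getBytes(StandardCharsets.UTF_8);
            Thread alice = channel.join(
                new Client("alice", new ByteArrayInputStream(aliceInput), new ByteArrayOutputStream()));
            alice.join();          // Alice's input is exhausted, so her thread ends
            bobKeyboard.close();   // Bob disconnects
            bob.join();
            return new String(bobScreen.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

The real server has many more message types and uses async calls rather than a blocking read loop, so treat this only as a picture of how the pieces relate: one thread per client, one shared collection, one lock.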
2. Client (.NET 2.0)
* Runs on TcpClient and is mostly single-threaded, whereas the server is multithreaded.
* Must use .NET 2.0.
* Consists mostly of visuals and other non-essential stuff.
When 40+ clients are connected to a single channel, it starts to freeze at seemingly random times (at least that's how it looks right now; we don't have evidence or enough data to pinpoint what's actually wrong). We don't think network traffic is the issue (though we're not completely sure yet), since we have tried different server machines with various setups. All of the server machines we have used can handle that much load hardware-wise, so the problem seems to be in our approach and in what's going on on the code side.
The reason why we are struggling to address the issue is we are not exactly sure what could be causing it. Please check out the following example:
System A has 55 people online in its Channel #1 and it never freezes. System A uses IP A1 and the channel is on port 16xxx.
System B has 25 people online in its Channel #4 and it freezes at random for one or two minutes at a time. System B uses IP B1 and port 18xxx for the channel. It's on the same machine as System A, which doesn't freeze.
In conclusion, the freezes don't seem to be tied strictly to the number of people online, but they occur more often as the numbers rise.
We tried calling Application.DoEvents() in an endless do-loop in the Channel project, thinking that some process X was putting the channel into a frozen state for a few minutes and thus pausing it. After a freeze, the channel performs every action that was queued while it was frozen within a few seconds. With this change, CPU usage averages 7%-20% per channel and things looked like they were getting better, but it was not a permanent or effective solution.
Things we suspect:
* ClientCollection, which holds the players and their TcpClients, inherits from CollectionBase. Maybe this is causing some chaos during synchronization. It used to be an array back in the day, and we had fewer of these problems then. Maybe it shouldn't inherit from CollectionBase, but from something else?
* We are using SyncLock (lock in C#) to synchronize access to ClientCollection (although we had this problem before we started using locks).
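To spell out the locking suspicion: if the lock on ClientCollection is held while we write to each player's connection, a single stalled player could block every other thread that needs the collection, which would look like a frozen channel followed by a burst of queued actions. We have not confirmed that this is what happens in our server. The sketch below is hypothetical and written in Java only for illustration (`synchronized` plays the role of SyncLock; the method and class names are made up). It compares sending while holding the lock with copying the list under the lock and sending from the copy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical client whose send() may block, like a write to a stalled connection.
interface Client {
    void send(String message);
}

class ClientCollection {
    private final List<Client> clients = new ArrayList<>();
    private final Object lock = new Object();

    void add(Client c) {
        synchronized (lock) { clients.add(c); }
    }

    // Variant 1: the lock is held for the whole loop, including every blocking send.
    void broadcastHoldingLock(String message) {
        synchronized (lock) {
            for (Client c : clients) c.send(message);
        }
    }

    // Variant 2: copy the list under the lock, then send without holding it.
    void broadcastFromSnapshot(String message) {
        List<Client> snapshot;
        synchronized (lock) { snapshot = new ArrayList<>(clients); }
        for (Client c : snapshot) c.send(message);
    }
}

class LockStallDemo {
    // Returns true if a new player could be added while a broadcast
    // was stuck sending to a stalled client.
    static boolean canJoinDuringStalledSend(boolean useSnapshot) {
        ClientCollection clients = new ClientCollection();
        CountDownLatch sendStarted = new CountDownLatch(1);
        CountDownLatch unstall = new CountDownLatch(1);

        // A stalled client: send() blocks until 'unstall' is released.
        clients.add(message -> {
            sendStarted.countDown();
            try { unstall.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        Thread broadcaster = new Thread(() -> {
            if (useSnapshot) clients.broadcastFromSnapshot("chat");
            else clients.broadcastHoldingLock("chat");
        });
        CountDownLatch joined = new CountDownLatch(1);
        Thread joiner = new Thread(() -> {
            clients.add(message -> { });   // a new player entering the channel
            joined.countDown();
        });

        try {
            broadcaster.start();
            sendStarted.await();           // the broadcast is now stuck inside send()
            joiner.start();
            return joined.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            unstall.countDown();           // release the stalled send so all threads finish
        }
    }

    public static void main(String[] args) {
        System.out.println("join while lock is held during send: " + canJoinDuringStalledSend(false));
        System.out.println("join while sending from a snapshot:  " + canJoinDuringStalledSend(true));
    }
}
```

In this toy setup the first variant reports false (the join waits until the stalled send is released) and the second reports true. Whether our real server behaves like the first variant is exactly the kind of thing I'd like opinions on.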
Server machine:
* Intel Xeon X3460 2.80GHz
* 16 GB RAM
* 64-bit Windows Server 2008 Enterprise
I know it is impossible to pinpoint the issue without seeing the whole code, and I regret that I'm unable to post the code. Instead, I'm looking for ideas that could point me in the right direction. We are happy to share any other info that helps resolve this problem.
Thanks to everyone helping!