I am in the design phase of writing a new Windows Service application that accepts TCP/IP connections for long running connections (i.e. this is not like HTTP where there are many short connections, but rather a client connects and stays connected for hours or days or even weeks).
I'm looking for ideas for the best way to design the network architecture. I'm going to need to start at least one thread for the service. I am considering using the Asynch API (BeginRecieve, etc..) since I don't know how many clients I will have connected at any given time (possibly hundreds). I definitely do not want to start a thread for each connection.
Data will primarily flow out to the clients from my server, but there will be some commands sent from the clients on occasion. This is primarily a monitoring applicaiton in which my server sends status data periodically to the clients.
Any suggestions on the best way to make this as scalable as possible? Basic workflow? Thanks.
EDIT: To be clear, i'm looking for .net based solutions (C# if possible, but any .net language will work)
BOUNTY NOTE: To be awarded the bounty, I expect more than a simple answer. I would need a working example of a solution, either as a pointer to something I could download or a short example in-line. And it must be .net and Windows based (any .net language is acceptable)
EDIT: I want to thank everyone that gave good answers. Unfortunately, I could only accept one, and I chose to accept the more well known Begin/End method. Esac's solution may well be better, but it's still new enough that I don't know for sure how it will work out.
I have upvoted all the answers I thought were good, I wish I could do more for you guys. Thanks again.
Have you considered just using a WCF net TCP binding and a publish/subscribe pattern ? WCF would allow you to focus [mostly] on your domain instead of plumbing..
There are lots of WCF samples & even a publish/subscribe framework available on IDesign's download section which may be useful : http://www.idesign.net
Using .NET's integrated Async IO (
BeginRead
, etc) is a good idea if you can get all the details right. When you properly set up your socket/file handles it will use the OS's underlying IOCP implementation, allowing your operations to complete without using any threads (or, in the worst case, using a thread that I believe comes from the kernel's IO thread pool instead of .NET's thread pool, which helps alleviate threadpool congestion.)The main gotcha is to make sure that you open your sockets/files in non-blocking mode. Most of the default convenience functions (like
File.OpenRead
) don't do this, so you'll need to write your own.One of the other main concerns is error handling - properly handling errors when writing asynchronous I/O code is much, much harder than doing it in synchronous code. It's also very easy to end up with race conditions and deadlocks even though you may not be using threads directly, so you need to be aware of this.
If possible, you should try and use a convenience library to ease the process of doing scalable asynchronous IO.
Microsoft's Concurrency Coordination Runtime is one example of a .NET library designed to ease the difficulty of doing this kind of programming. It looks great, but as I haven't used it, I can't comment on how well it would scale.
For my personal projects that need to do asynchronous network or disk I/O, I use a set of .NET concurrency/IO tools that I've built over the past year, called Squared.Task. It's inspired by libraries like imvu.task and twisted, and I've included some working examples in the repository that do network I/O. I also have used it in a few applications I've written - the largest publicly released one being NDexer (which uses it for threadless disk I/O). The library was written based on my experience with imvu.task and has a set of fairly comprehensive unit tests, so I strongly encourage you to try it out. If you have any issues with it, I'd be glad to offer you some assistance.
In my opinion, based on my experience using asynchronous/threadless IO instead of threads is a worthwhile endeavor on the .NET platform, as long as you're ready to deal with the learning curve. It allows you to avoid the scalability hassles imposed by the cost of Thread objects, and in many cases, you can completely avoid the use of locks and mutexes by making careful use of concurrency primitives like Futures/Promises.
You could try using a framework called ACE (Adaptive Communications Environment) which is a generic C++ framework for network servers. It's a very solid, mature product and is designed to support high-reliability, high-volume applications up to telco-grade.
The framework deals with quite a wide range of concurrency models and probably has one suitable for your applciation out of the box. This should make the system easier to debug as most of the nasty concurrency issues have already been sorted out. The trade-off here is that the framework is written in C++ and is not the most warm and fluffy of code bases. On the other hand, you get tested, industrial grade network infrastructure and a highly scalable architecture out of the box.
I used Kevin's solution but he says that solution lacks code for reassembly of messages. Developers can use this code for reassembly of messages:
Well, .NET sockets seem to provide select() - that's best for handling input. For output I'd have a pool of socket-writer threads listening on a work queue, accepting socket descriptor/object as part of the work item, so you don't need a thread per socket.
I would use the AcceptAsync/ConnectAsync/ReceiveAsync/SendAsync methods that were added in .Net 3.5. I have done a benchmark and they are approximately 35% faster (response time and bitrate) with 100 users constantly sending and receiving data.