We have developed some long-running C# console applications that will be run by Windows Scheduled Tasks.
These applications might run on many different server machines on the intranet/extranet. We cannot confine them to one machine because each application might need access to resources that are available only on a certain machine.
Still, all these applications are using a common WCF service to access the database.
We need to ensure that only one instance of one of our applications is running at any moment. As the apps might be on different extranet computers, we cannot use per-machine mutexes or MSMQ.
I have thought about the following solution: a WCF mutex service with a timeout. When an app starts, it asks the service whether an instance is already running (maybe on another machine). If not, it then pings the WCF mutex service periodically (in a dedicated thread) to update a timestamp; if a ping fails, the app exits immediately. If the timestamp expires, the application must have crashed, so it can be run again.
I would like to know whether this "WCF mutex" is the optimal solution for my problem. Maybe there are already some third-party libraries that implement such functionality?
WCF is usually hosted in IIS. So when a call is made to your WCF mutex service and the client waits for the lock to be released (until the call times out), the IIS thread servicing that call will essentially be blocked. This will limit the throughput of that IIS server, and it will probably start returning errors when more clients request the lock than it can service. Moral of the story: if you're going to use WCF, don't host it in IIS for that application, or write your own server app to handle locking and unlocking of the mutex.
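One way to avoid tying up server threads is to make the lock call return immediately and let the client do the waiting. A minimal sketch of that idea in Python, where `acquire_with_polling` and the `try_acquire` callable are hypothetical names standing in for a client wrapper around a non-blocking service call:

```python
import time


def acquire_with_polling(try_acquire, attempts, delay_seconds, sleep=time.sleep):
    """Call a non-blocking try_acquire() until it succeeds or attempts run out.

    try_acquire is assumed to be a quick remote call that returns True or
    False right away, so no server thread is held while the client waits.
    """
    for attempt in range(attempts):
        if try_acquire():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # the waiting happens on the client side
    return False
```

Each request then occupies the server only for the duration of a quick check, at the cost of some latency between the lock becoming free and a client noticing.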
How about a file lock on a network location?
If you can create/open the file with exclusive read/write access, then yours is the only app running. If the running app subsequently crashes, the lock is released automatically by the OS.
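A rough illustration of the idea, in Python and using POSIX `flock` as a stand-in (on Windows the equivalent would be opening the file without sharing, e.g. `FileShare.None` in .NET). The function name `try_lock_file` is made up for this sketch, and how reliably locks behave on a given network share is something you would have to verify yourself:

```python
import fcntl


def try_lock_file(path):
    """Return an open, exclusively locked file, or None if someone holds it.

    Assumption: POSIX flock semantics. The lock lives as long as the handle
    stays open, and the OS drops it when the process exits or crashes.
    """
    handle = open(path, "a+")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle
```

The app would call this once at startup, exit if it gets `None`, and otherwise keep the returned handle open until it finishes.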
Tim
Oops, just re-read the question and saw "extranet", ignore me!
Your mutex solution has a race condition.
If an app on a different server checks the timestamp in the window after it has expired, but before the current holder has updated it, you will have two instances running.
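To make the failure mode concrete, here is a sketch (in Python, all names hypothetical) of what the service side could look like if checking and updating the timestamp happen as one atomic step, and if a holder whose lease has lapsed is told so on its next ping. It assumes a single service process and an in-memory lease; it does not remove the window in which a stalled holder keeps working until its next ping fails:

```python
import threading
import time


class LeaseMutex:
    """Hypothetical server-side lease: one holder at a time, with expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._guard = threading.Lock()  # makes check-and-set a single step
        self._holder = None
        self._expires_at = 0.0

    def try_acquire(self, client_id):
        """Take the lease if it is free or expired; never waits."""
        with self._guard:
            now = self._clock()
            held_by_other = (
                self._holder is not None
                and self._holder != client_id
                and now < self._expires_at
            )
            if held_by_other:
                return False
            self._holder = client_id
            self._expires_at = now + self._ttl
            return True

    def renew(self, client_id):
        """Ping: extend the lease only if this client still validly holds it."""
        with self._guard:
            now = self._clock()
            if self._holder != client_id or now >= self._expires_at:
                return False  # lease lost or expired; the caller should exit
            self._expires_at = now + self._ttl
            return True

    def release(self, client_id):
        """Give the lease back early on a clean shutdown."""
        with self._guard:
            if self._holder == client_id:
                self._holder = None
```

With this shape, a late ping from the old holder gets `False` instead of silently reviving the lease, so the old holder exits rather than running alongside the new one.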
I'd probably go the opposite route and have a central monitoring service. This service would continually monitor the health of the system. If it detects that a service has gone down, it would restart it on either that machine or a different one.
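Roughly what I have in mind, sketched in Python. The class `HealthMonitor`, the `restart` callback and the app names are invented for illustration; how you actually restart a process on a remote machine is left out entirely:

```python
import time


class HealthMonitor:
    """Hypothetical central monitor that tracks heartbeats per application."""

    def __init__(self, timeout_seconds, restart, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._restart = restart  # callback taking the application name
        self._clock = clock
        self._last_seen = {}

    def heartbeat(self, app_name):
        """Record that the application reported in just now."""
        self._last_seen[app_name] = self._clock()

    def check(self):
        """Restart every app whose last heartbeat is older than the timeout."""
        now = self._clock()
        restarted = []
        for app_name, seen in list(self._last_seen.items()):
            if now - seen > self._timeout:
                self._restart(app_name)
                # Reset the timestamp so the app gets a full timeout to come up.
                self._last_seen[app_name] = now
                restarted.append(app_name)
        return restarted
```

The monitor would call `check()` on a timer. Because only the monitor decides when something gets started, the apps themselves never have to race each other for a lock.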
You may want to bite the bullet and go with a full Enterprise Service Bus. Check the Wikipedia article for ESBs. It lists over a dozen commercial and open source systems.