How to properly stop a multi-threaded .NET windows

Posted 2019-01-31 17:32

Question:

I have a Windows service written in C# that creates a truckload of threads and makes many network connections (WMI, SNMP, simple TCP, HTTP). When I stop the service using the Services MSC snap-in, the call to stop the service returns relatively quickly, but the process continues to run for about 30 seconds or so.

The primary question is what could be the reason that it is taking 30+ seconds to stop. What can I look for and how do I go about looking for it?

The secondary question is why the Services MSC snap-in (service controller) returns even though the process is still running. Is there a way to make it return only when the process has actually exited?

Here is the code in the OnStop method of the service

protected override void OnStop()
{
   //doing some tracing
   //......

   //doing some minor single threaded cleanup here
   //......

   base.OnStop();

   //doing some tracing here
}

Edit in response to Thread cleanup answers

Many of you have answered that I should keep track of all my threads and then clean them up. I don't think that is a practical approach. Firstly, I don't have access to all managed threads in one location. The software is pretty big, with different components, projects and even 3rd-party DLLs that could all be creating threads. There is no way I can keep track of all of them in one location or have a flag that all threads check. Even if I could have all threads check a flag, many threads block on things like semaphores, and while they are blocked they can't check it. I would have to make them wait with a timeout, check this global flag, and then wait again.

The IsBackground flag is an interesting thing to check. Again though, how can I find out whether I have any foreground threads running around? I would have to check every section of the code that creates a thread. Is there any other way, maybe a tool that can help me find this out?

Ultimately though, the process does stop. It would seem that I only need to wait for something. However, if I wait in the OnStop method for X amount of time, then the process takes approximately 30 seconds + X to stop. No matter what I try, the process seems to need approximately 30 seconds (it's not always 30 seconds, it can vary) after OnStop returns before it actually stops.

Answer 1:

The call to stop the service returns as soon as your OnStop() callback returns. Based on what you've shown, your OnStop() method doesn't do much, which explains why it returns so fast.

There are a couple of ways to cause your service to exit.

First, you can rework the OnStop() method to signal all the threads to close and wait for them to close before exiting. As @DSO suggested, you could use a global bool flag to do this (make sure to mark it as volatile). I generally use a ManualResetEvent, but either would work. Signal the threads to exit, then join them with some kind of timeout (I usually use 3000 milliseconds). If the threads still haven't exited by then, you can call the Abort() method to terminate them. Generally, the Abort() method is frowned upon, but given that your process is exiting anyway, it's not a big deal. If you consistently have a thread that has to be aborted, you can rework that thread to be more responsive to your shutdown signal.

Second, mark your threads as background threads. It sounds like you are using the System.Threading.Thread class, which creates foreground threads by default. Marking them as background threads makes sure that they do not hold up the process from exiting. This will work fine if you are executing managed code only. If you have a thread that is waiting on unmanaged code, I'm not sure whether setting the IsBackground property will still cause the thread to exit automatically on shutdown, i.e., you may still have to rework your threading model to make this thread respond to your shutdown request.
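
A minimal sketch of the second option, assuming you control the code that creates the thread. The class and method names (BackgroundThreads, StartBackgroundWorker) are hypothetical; Thread.IsBackground is the real property.

```csharp
using System;
using System.Threading;

public static class BackgroundThreads
{
    // Hypothetical helper: creates and starts a thread that will not keep the process alive.
    public static Thread StartBackgroundWorker(ThreadStart work)
    {
        var thread = new Thread(work);
        // Threads created with new Thread(...) are foreground threads by default.
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}
```

This only helps for threads you create yourself; threads started inside 3rd-party code keep whatever setting that code gave them.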



Answer 2:

The service control manager (SCM) will return when you return from OnStop. So you need to fix your OnStop implementation to block until all the threads have finished.

The general approach is to have OnStop signal all your threads to stop, and then wait for them to stop. To avoid blocking indefinitely you can give the threads a time limit to stop, then abort them if they take too long.

Here is what I've done in the past:

  1. Create a global bool flag called Stop, set to false when the service is started.
  2. When OnStop method is called, set the Stop flag to true then do a Thread.Join on all the outstanding worker threads.
  3. Each worker thread is responsible for checking the Stop flag and exiting cleanly when it is true. This check should be done frequently, and always before a long-running operation, so that it does not delay the service shutdown for too long.
  4. In the OnStop method, also have a timeout on the Join calls, to give the threads a limited time to exit cleanly... after which you just abort it.

Note that in #4 you should give your threads adequate time to exit in the normal case. Abort should only happen in the unusual case where a thread is hung. In that case, doing an abort is no worse than the user or the system killing the process (the latter if the computer is shutting down).
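
The four steps above can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual code: the class name FlagStoppedWorkers, the 50 ms placeholder for work, and the choice to return the unfinished threads instead of aborting them are all made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical class; in a real service this logic would sit behind OnStart/OnStop.
public class FlagStoppedWorkers
{
    // Step 1: the shared stop flag, volatile so workers always read the latest value.
    private volatile bool _stop;
    private readonly List<Thread> _workers = new List<Thread>();

    public void Start(int workerCount)
    {
        _stop = false;
        for (int i = 0; i < workerCount; i++)
        {
            var t = new Thread(WorkLoop);
            _workers.Add(t);
            t.Start();
        }
    }

    // Step 3: each worker checks the flag frequently and exits cleanly.
    private void WorkLoop()
    {
        while (!_stop)
        {
            Thread.Sleep(50); // placeholder for one short unit of work
        }
    }

    // Steps 2 and 4: set the flag, then Join every worker against one shared deadline.
    // Returns the threads that were still running when the time limit expired.
    public List<Thread> Stop(TimeSpan limit)
    {
        _stop = true;
        DateTime deadline = DateTime.UtcNow + limit;
        var stuck = new List<Thread>();
        foreach (Thread t in _workers)
        {
            TimeSpan remaining = deadline - DateTime.UtcNow;
            if (remaining < TimeSpan.Zero) remaining = TimeSpan.Zero;
            if (!t.Join(remaining)) stuck.Add(t);
        }
        return stuck;
    }
}
```

The sketch leaves the decision about stuck threads to the caller. Step 4 of the answer aborts them; note that Thread.Abort is only available on .NET Framework and throws PlatformNotSupportedException on .NET Core and .NET 5+.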



Answer 3:

A simple way to do this may look like this:
-first create a global event

ManualResetEvent shutdownEvent;

-at service start, create the manual reset event with an initial state of unsignaled

shutdownEvent = new ManualResetEvent(false);

-at service stop, signal the event

shutdownEvent.Set();

-do not forget to wait for the threads to finish

do
{
 // ask the Service Control Manager for more time (ServiceBase.RequestAdditionalTime)
 // bound how long you wait for the threads to stop
}
while ( not_all_threads_stopped );

-each thread must test the event from time to time and stop when it is signaled

if ( shutdownEvent.WaitOne(delay, true) ) break;
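
For threads that block on a semaphore rather than looping, one possible way to test the event without polling is to wait on both handles at once with WaitHandle.WaitAny. The sketch below assumes the semaphore is a System.Threading.Semaphore (which is a WaitHandle; SemaphoreSlim is not); the class and method names are hypothetical.

```csharp
using System;
using System.Threading;

public static class InterruptibleWait
{
    // Blocks until either the shutdown event is signaled or the semaphore can be acquired.
    // Returns true if the semaphore was acquired, false if shutdown was signaled.
    public static bool TryAcquire(Semaphore semaphore, ManualResetEvent shutdownEvent)
    {
        // The shutdown event is listed first: when both handles are signaled,
        // WaitAny reports the smallest index, so shutdown takes priority.
        int index = WaitHandle.WaitAny(new WaitHandle[] { shutdownEvent, semaphore });
        return index == 1;
    }
}
```

A worker would call TryAcquire instead of semaphore.WaitOne() and leave its loop when the result is false.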


Answer 4:

Signal your threads' loops to exit, shut down cleanly, and Join the threads. Measure how long each Join takes (with a stopwatch, for example) to find out where the problems are. Avoid abortive shutdown, for various reasons.
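
A rough sketch of the measuring idea, assuming you hold references to the threads you want to inspect. JoinTimer and JoinAndMeasure are hypothetical names; Stopwatch and Thread.Join(TimeSpan) are the real APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class JoinTimer
{
    // Joins each thread with a per-thread timeout and records how long each Join took.
    // A value at or near the timeout points at a thread that is slow to react to shutdown.
    public static Dictionary<string, TimeSpan> JoinAndMeasure(
        IEnumerable<Thread> threads, TimeSpan perThreadTimeout)
    {
        var timings = new Dictionary<string, TimeSpan>();
        foreach (Thread t in threads)
        {
            var sw = Stopwatch.StartNew();
            t.Join(perThreadTimeout);
            sw.Stop();
            // Fall back to the managed thread id when the thread has no name.
            string key = t.Name ?? ("thread-" + t.ManagedThreadId);
            timings[key] = sw.Elapsed;
        }
        return timings;
    }
}
```

Because the threads are joined one after another, time spent waiting on an earlier thread is not counted against later ones, so the order of the list affects the numbers.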



Answer 5:

To answer the first question (why would the service continue to run for 30+ seconds): there are many possible reasons. For instance, when using WCF, stopping the host causes the process to stop accepting incoming requests, and it waits to finish processing all current requests before stopping.

The same would hold true for many other types of network operations: the operations would attempt to complete before terminating. This is why most network requests have a built-in timeout value for when the request may have "hung" (server gone down, network problems, etc.).

Without more information on what exactly you are doing, there is no way to tell you specifically why it's taking 30 seconds, but it's probably a timeout.

To answer the second question (why is the service controller returning): I'm not sure. I know that the ServiceController class has a WaitForStatus method that allows you to wait until the given status is reached. It is possible that the service controller is waiting for a predetermined time (another timeout) and then forcibly terminating your application.

It is also very possible that the base.OnStop method has been called and the OnStop method has returned, signalling to the ServiceController that the service has stopped, when in fact some threads have not stopped. You are responsible for terminating these threads.
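
For reference, waiting on the service status from code might look like the sketch below. It uses ServiceController.WaitForStatus from System.ServiceProcess; the class name ServiceStopper and the service name passed in are hypothetical, and the sketch assumes the service is in a state that accepts a stop request.

```csharp
using System;
using System.ServiceProcess;

public static class ServiceStopper
{
    // Sends a stop request and blocks until the service reports Stopped
    // or the timeout elapses.
    public static void StopAndWait(string serviceName, TimeSpan timeout)
    {
        using (var controller = new ServiceController(serviceName))
        {
            if (controller.Status != ServiceControllerStatus.Stopped)
            {
                controller.Stop();
                // Throws System.ServiceProcess.TimeoutException if the status
                // is not reached within the timeout.
                controller.WaitForStatus(ServiceControllerStatus.Stopped, timeout);
            }
        }
    }
}
```

This waits for the reported service status, not for the process to exit, so it would likely return early in the situation described in the question, where threads keep the process alive after OnStop has returned.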



Answer 6:

For people who, like me, are looking for a way to shorten the closing time: try setting the CloseTimeout of your ServiceHost.

Now I'm trying to understand why it takes so much time to stop without it, and I also think it's a thread problem. I attached Visual Studio to the service and stopped it: some threads launched by my service were still running.

Now the question is: is it really these threads that make my service stop so slowly? Didn't Microsoft think about this? Could it be a port-releasing problem or something else? It would be a waste of time to handle stopping the threads and still not end up with a shorter closing time.



Answer 7:

Matt Davis's answer is pretty complete.
A few points: if you have a thread that runs forever (because it has a near-infinite loop and a catch-all) and your service's job is to run that thread, you probably want it to be a foreground thread.

Also, if any of your tasks perform a longer operation, such as a sproc call, and your Join timeout therefore needs to be a little longer, you can ask the SCM for more time to shut down. See: https://msdn.microsoft.com/en-us/library/system.serviceprocess.servicebase.requestadditionaltime(v=vs.110).aspx This can be useful in avoiding the dreaded "marked for deletion" status. The maximum is set in the registry, so I usually request the longest time the thread normally takes to shut down (and never more than 12 s). See: what is the maximum time windows service wait to process stop request and how to request for additional time

My code looks something like:

private Thread _worker;       
private readonly CancellationTokenSource _cts = new CancellationTokenSource(); 

protected override void OnStart(string[] args)
{
    _worker = new Thread(() => ProcessBatch(_cts.Token));
    _worker.Start();             
}

protected override void OnStop()
{            
    RequestAdditionalTime(4000);
    _cts.Cancel();            
    if(_worker != null && _worker.IsAlive)
        if(!_worker.Join(3000))
            _worker.Abort(); 
}

private void ProcessBatch(CancellationToken cancelToken)
{
   while (true)
   {
       try
       {
           if(cancelToken.IsCancellationRequested)
                return;               
           // Do work
           if(cancelToken.IsCancellationRequested)
                return;
           // Do more work
           if(cancelToken.IsCancellationRequested)
                return;
           // Do even more work
       }
       catch(Exception ex)
       {
           // Log it
       }
   }
}