“A timeout was reached while waiting for the servi

Posted 2019-01-31 06:44

Question:

I have a custom-written Windows service that I run on a number of Hyper-V VMs. The VMs get rebooted a couple times an hour as part of some automated tests being run. The service is set to automatic start and almost all of the time, it starts up fine.

However, maybe 5% of the time, with no pattern that I can discern, the service fails to start. When it fails, I get an error in Event Viewer saying

A timeout was reached (30000 milliseconds) while waiting for the My Service Name service to connect.

When this occurs, I can start the service manually, or restart again, and the service will start fine.

The thing I can't figure out is that the 30 second timeout doesn't appear to be occurring in my code. The very first line of my service class's OnStart() method logs "Starting..." to its log4net log. When the service fails to start, I don't even get anything logged at all, which indicates to me that either log4net can't log for whatever reason, or the timeout is occurring before my OnStart() gets called.

The service runs on a variety of OSes, from XP all the way up to Win7 and 2008R2, and I know that setting the service to delayed start may solve this for Vista and later, but that seems like a hack.

I haven't been able to remote debug this because it happens so intermittently and only during system startup, and I'm at a loss as to how else to figure out what's going on. Any ideas?

Answer 1:

You may want to look at this post. It's not identical to your situation, but the solution does offer sound advice on Windows services and startup functionality.



Answer 2:

My guess - and that's all it is - is that the disk is thrashing hard during startup, to the point where the .NET Framework itself isn't starting in the 30 seconds that Windows allocates for services to start.

A kludgy workaround may be to set the service to start manually, then write a very small stub service in unmanaged code (e.g. C++, Delphi) to start the service.

Another approach may be to start the service remotely from another machine. The sc command should do the job nicely.
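As a rough illustration of the remote-start idea, here is a minimal Python sketch that builds the `sc \\HOST start "Service"` command line and retries a few times. The host name `TESTVM01`, the retry count, and the delay are placeholders rather than anything from the question, and the sketch assumes `sc` exits with code 0 when the start request is accepted.

```python
import subprocess
import sys
import time


def build_sc_start_command(host, service_name):
    # sc takes the remote machine as a UNC-style name, i.e. two backslashes + host.
    return ["sc", "\\\\" + host, "start", service_name]


def start_remote_service(host, service_name, attempts=3, delay_seconds=10,
                         runner=subprocess.run):
    # Ask the remote Service Control Manager to start the service,
    # retrying a few times in case the machine is still booting.
    command = build_sc_start_command(host, service_name)
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:  # assumption: 0 means the request was accepted
            return True
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False


if __name__ == "__main__" and sys.platform == "win32":
    # Hypothetical host name; the service name is the one from the question.
    ok = start_remote_service("TESTVM01", "My Service Name")
    print("started" if ok else "failed to start")
```

The `runner` parameter exists only so the retry logic can be exercised without a real Windows machine. The calling account also needs permission to control services on the target.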



Answer 3:

For what it's worth, I discovered that I received this message (almost immediately upon service startup) because I did not have version 4.5 of the .NET framework installed on the target machine. I rolled back the version I was using to version 4.0 (which was already installed on the target machine) and the service worked as expected.



Answer 4:

I was seeing this error in the Event Viewer when trying to install a service with PowerShell.

The problem was that the "Service Name" and "Service Display Name" values in my PowerShell script differed from the ones I had specified in the Program.cs file of my console application.



Answer 5:

I think I may have found another contributing factor to this kind of "does not start on reboot" error.

It appears that if the Windows Event Log is set to overwrite events older than 7 days, with a maximum size of 512 KB, and a lot of activity occurs within that window, then the Event Log is effectively full, because it can't overwrite the events generated inside that timeframe. If you set the event log to a much larger size, or to "Overwrite events as needed", you won't experience this issue.
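For anyone who wants to script that change instead of clicking through Event Viewer, here is a small Python sketch that builds a `wevtutil sl` command. It assumes Vista or later (`wevtutil` is not available on XP), an elevated prompt, and that the Application log is the one filling up. The 20 MB size is an arbitrary example value, not a recommendation from the answer.

```python
import subprocess
import sys


def build_event_log_command(log_name, max_size_bytes, overwrite_as_needed=True):
    # wevtutil sl (set-log) options used here:
    #   /ms:<bytes>       maximum log size in bytes
    #   /rt:<true|false>  retention; false lets the oldest events be overwritten
    retention = "false" if overwrite_as_needed else "true"
    return ["wevtutil", "sl", log_name,
            "/ms:%d" % max_size_bytes, "/rt:" + retention]


if __name__ == "__main__" and sys.platform == "win32":
    # Assumed values: Application log, 20 MB, overwrite as needed.
    command = build_event_log_command("Application", 20 * 1024 * 1024)
    subprocess.run(command, check=True)
```

Running `wevtutil gl Application` afterwards shows the resulting settings.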



Answer 6:

In my case, the same error occurred because the .NET installation on the server was not working correctly.

To figure this out:

I made a small console app with the same logic as the service, wrapped the whole thing in a try-catch, and dumped any exception to the console.

I'm not sure why the information didn't bubble up from the service itself, but this way we saw valuable messages about Framework errors that we would never have seen otherwise.



Answer 7:

We are having the same problem on Windows Server 2016.

A fix that seems to be working is changing the account the service runs under from the Local Service account to the local Administrator (I'm not sure what the underlying cause is).