W3WP.EXE using 100% CPU - where to start?

2020-05-11 10:15发布

问题:

An ASP.NET web app running on IIS6 periodically shoots the CPU up to 100%. It's the W3WP that's responsible for nearly all CPU usage during these episodes. The CPU stays pinned at 100% anywhere from a few minutes to over an hour.

This is on a staging server and the site is only getting very light traffic from testers at this point.

We've running ANTS profiler on the server, but it's been unenlightening.

Where can we start finding out what's causing these episodes and what code is keeping the CPU busy during all that time?

回答1:

  1. Standard Windows performance counters (look for other correlated activity, such as many GET requests, excessive network or disk I/O, etc); you can read them from code as well as from perfmon (to trigger data collection if CPU use exceeds a threshold, for example)
  2. Custom performance counters (particularly to time for off-box requests and other calls where execution time is uncertain)
  3. Load testing, using tools such as Visual Studio Team Test or WCAT
  4. If you can test on or upgrade to IIS 7, you can configure Failed Request Tracing to generate a trace if requests take more a certain amount of time
  5. Use logparser to see which requests arrived at the time of the CPU spike
  6. Code reviews / walk-throughs (in particular, look for loops that may not terminate properly, such as if an error happens, as well as locks and potential threading issues, such as the use of statics)
  7. CPU and memory profiling (can be difficult on a production system)
  8. Process Explorer
  9. Windows Resource Monitor
  10. Detailed error logging
  11. Custom trace logging, including execution time details (perhaps conditional, based on the CPU-use perf counter)
  12. Are the errors happening when the AppPool recycles? If so, it could be a clue.


回答2:

It's not much of an answer, but you might need to go old school and capture an image snapshot of the IIS process and debug it. You might also want to check out Tess Ferrandez's blog - she is a kick a** microsoft escalation engineer and her blog focuses on debugging windows ASP.NET, but the blog is relevant to windows debugging in general. If you select the ASP.NET tag (which is what I've linked to) then you'll see several items that are similar.



回答3:

If your CPU is spiking to 100% and staying there, it's quite likely that you either have a deadlock scenario or an infinite loop. A profiler seems like a good choice for finding an infinite loop. Deadlocks are much more difficult to track down, however.



回答4:

Process Explorer is an excellent tool for troubleshooting. You can try it for finding the problem of high CPU usage. It gives you an insight into the way your application works.

You can also try Procdump to dump the process and analyze what really happened on the CPU.



回答5:

Also, look at your perfmon counters. They can tell you where a lot of that cpu time is being spent. Here's a link to the most common counters to use:

  • http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/852720c8-7589-49c3-a9d1-73fdfc9126f0.mspx?mfr=true
  • http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/be425785-c1a4-432c-837c-a03345f3885e.mspx?mfr=true


回答6:

We had this on a recursive query that was dumping tons of data to the output - have you double checked everything does exit and no infinite loops exist?

Might try to narrow it down with a single page - we found ANTS to not be much help in that same case either - what we ended up doing was running the site hit a page watch the CPU - hit the next page watch CPU - very methodical and time consuming but if you cant find it with some code tracing you might be out of luck -

We were able to use IIS log files to track it to a set of pages that were suspect -

Hope that helps !



回答7:

This is a guess at best, but perhaps your development team is building and deploying the application in debug mode, in stead of release mode. This will cause the occurrence of .pdb files. The implication of this is that your application will take up additional resources to collect system state and debugging information during the execution of your system, causing more processor utilization.

So, it would be simple enough to ensure that they are building and deploying in release mode.



回答8:

This is a very old post, I know, but this is also a common problem. All of the suggested methods are very nice but they will always point to a process, and there are many chances that we already know that our site is making problems, but we just want to know what specific page is spending too much time in processing. The most precise and simple tool in my opinion is IIS itself.

  1. Just click on your server in the left pane of IIS.
  2. Click on 'Worker Processes' in the main pane. you already see what application pool is taking too much CPU.
  3. Double click on this line (eventually refresh by clicking 'Show All') to see what pages consume too much CPU time ('Time elapsed' column) in this pool


回答9:

If you identify a page that takes time to load, use SharePoint's Developer Dashboard to see which component takes time.