I have a highly trafficked application on one debian machine and apache has started acting strange.
Every time I start apache, tons of apache processes are spawned, the app doesn't load at all, and very quickly the whole machine freezes and must be powercycled to reboot.
Here is what I get for top immediately after starting apache:
top - 20:14:44 up 1:16, 2 users, load average: 0.48, 0.10, 0.03 Tasks: 330 total, 5 running, 325 sleeping, 0 stopped, 0 zombie Cpu(s): 12.0%us, 21.4%sy, 0.0%ni, 65.7%id, 0.2%wa, 0.1%hi, 0.7%si, 0.0%st Mem: 8179920k total, 404984k used, 7774936k free, 60716k buffers Swap: 2097136k total, 0k used, 2097136k free, 43424k cached 10251 www-data 15 0 467m 8100 4016 S 6 0.1 0:00.04 apache2 10262 www-data 15 0 467m 8092 4012 S 6 0.1 0:00.05 apache2 10360 www-data 15 0 468m 8296 4016 S 6 0.1 0:00.05 apache2 10428 www-data 15 0 468m 8272 3992 S 6 0.1 0:00.05 apache2 10241 www-data 15 0 467m 8256 4012 S 4 0.1 0:00.03 apache2 10259 www-data 15 0 467m 8092 4012 S 4 0.1 0:00.04 apache2 10274 www-data 15 0 467m 8056 4012 S 4 0.1 0:00.03 apache2 10291 www-data 15 0 468m 8292 4012 S 4 0.1 0:00.03 apache2 10293 www-data 15 0 468m 8292 4012 S 4 0.1 0:00.03 apache2 10308 www-data 15 0 468m 8296 4016 S 4 0.1 0:00.02 apache2 10317 www-data 15 0 468m 8292 4012 S 4 0.1 0:00.02 apache2 10320 www-data 15 0 468m 8292 4012 S 4 0.1 0:00.04 apache2 10325 www-data 15 0 468m 8292 4012 S 4 0.1 0:00.04 apache2
And so forth.. with more apache2 processes.
Less than a minute later, you can see below that the load has gone from 0.48 to 2.17. If I do not stop apache at this point, the load continues to rise over a few minutes or less until the machine dies.
top - 20:15:34 up 1:17, 2 users, load average: 2.17, 0.62, 0.21 Tasks: 1850 total, 5 running, 1845 sleeping, 0 stopped, 0 zombie Cpu(s): 0.3%us, 2.1%sy, 0.0%ni, 96.4%id, 0.0%wa, 0.1%hi, 1.0%si, 0.0%st Mem: 8179920k total, 1938524k used, 6241396k free, 60860k buffers Swap: 2097136k total, 0k used, 2097136k free, 44196k cached
We have a firewall where we whitelist the addresses we know are allowed to hit our site.
Any ideas about what the problem might be are very welcome.
Thanks!
Have you changed your configuration file recently? If yes, I trust you keep the old version for diffing?
If not, search for the "StartServers", "MaxSpareServers" and "MinSpareServers" directives. Generally you want to leave these at defaults, but it's possible that they were intentionally set high (bad idea) or accidentally set that way due to a bad config edit.
If this doesn't help, it's time to look outside Apache, for some process that's opening connections at a fast rate (could be that there's a testing process that's run amok).
First step is the access log. Second step is to run netstat, to see where the connections might be coming from. And if it's running on the same system, you can look in /proc/*/fd to find the two ends of the connection.
This question is ancient, but I feel compelled to add an answer here because all of the existing answers are overlooking a key piece of information from the OP: After the load has begun to rise for a few minutes,
top
reports that there are still ample CPU & memory resources available. There is usually one culprit remaining, and that's I/O.Check if there is a full partition with
df -h
. If not, see if your application is thrashing the disk usingvmstat 1 10
oriostat 1 10
(these are provided by the 'sysstat' package on Debian/Ubuntu). If you still don't see an issue there, perhaps you have device level I/O errors or network trouble for network-mounted storage. Check the system and daemon log files.Your 'top' output shows that you have plenty of free memory, so I don't think that MaxClients is an issue (unless there is some problem with Apache allocating more than 2GB of memory?) Your error log should show errors if it is having problems creating more children.
Most likely, your Apache processes really are using a lot of resources. If you are running PHP apps, try installing eAccelerator which does a good job of optimizing and caching PHP code. Other things might include heavy MySQL queries, a slow DNS resolver, etc. Beyond that, it gets more into understanding what programs are being hit and what they are doing.
You have probably made the error of configuring Apache to use far more than all of your ram. This is an easy mistake to make.
I am assuming you are using a Prefork Apache, and an in-process application server (such as PHP or mod_perl). In this model, you will end up with a maximum of (MaxClients * max memory usage of your application per process) memory used. If you don't have nearly that much, it's time to decrease one, the other or both.
In the general case, this means decreasing MaxClients to the point where your server has enough ram to cope.
The default values typically used for MaxClients (150 is typical) are not suitable for running an in-process heavyweight application server on a modest machine if you are using the Prefork model (Most application servers either don't support, or discourage, the use of threaded models).
However, decreasing MaxClients will eventually cause the application to become unavailable, particularly if you have keepalives on and the keepalive timeout too long. Processes which are just keeping a connection alive (state K in server-status) still use a lot of RAM, and that may be a problem - try to minimise keepalive timeout, or turn it off altogether.
You need to keep an eye on server-status (as provided by mod_status).
Of course you should only make ANY of these changes if you understand the consequences. Think twice, change the config once. If you have ANY ability to test the changes with simulated load on a similar spec non-production machine, do so.
As has been said (assuming Prefork Apache) - MaxClients = max processes at once.
If you find you are getting hammered with real traffic (and not a mis-configured StartServers/Min/MaxSpareServers), there are some other things you can do:
use ps -aux | grep apache to find out the number of processes that apache is running on. Look out for the "RSS" column which gives an estimate of the memory used by each process. Alternatively you can use "top", where you shift + f and then select the %MEM column to sort the processes by memory usage.
The number of processes is determined by "MaxClients" directive in your apache.conf file. The way you come to this figure is as described by this page;
sudo service apache2 stop
)The right value for "MaxClients" will ensure the right memory allocation for your apache server. That's how I solved it.
In Debian, apache conf file is at
/etc/apache2/apache2.conf