mod_fcgid + PHP + Apache lockup

Posted 2019-05-18 21:06

Question:

I'm running a fairly typical LAMP stack with PHP running through mod_fcgid. I'd consider the server to be under "high load" given the amount of traffic it receives.

There is an intermittent problem where Apache reports every connection as being in the "Sending Reply" state ("W" on the monitor) whenever sites that rely on PHP are accessed.

There are no PHP errors to speak of; it's as though PHP isn't actually being called during these "lockup" periods. However, in the Apache site logs I'm seeing the following:

(103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request function
[warn] mod_fcgid: can't apply process slot for /var/www/cgi-bin/php.fcgi

During this time I can still access sites that do not depend on PHP, such as the Apache status page and HTML-only virtual hosts (that don't have the PHP handler included).

The php.fcgi script has PHP_FCGI_MAX_REQUESTS=500 set, because I have read there is a race condition problem with PHP running in CGI mode. The fcgid.conf also has MaxProcessCount=15 set.

Has anyone else experienced this bug, and if so, how can it be resolved?

Answer 1:

I managed to fix this one myself.

To solve this problem, add stricter checks for process hangs to the FastCGI configuration and reduce the lifetime of your PHP instances:

IPCConnectTimeout 20
ProcessLifeTime 120
IdleTimeout 60
IdleScanInterval 30
MaxRequestsPerProcess 499
MaxProcessCount 100

Depending on your requirements, these settings can serve a well-configured server that receives in excess of 50k hits per hour.

You will find the number of recorded defunct / "zombie" PHP processes increases significantly. This is good, however, as previously the processes would have simply become unresponsive and the FastCGI manager would have continued to pipe requests to them!
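To watch that zombie count over time, something like the following might help. It is only a sketch: `count_zombies` is a hypothetical helper name, it assumes a Linux-style `ps` that accepts `-eo stat=,comm=`, and it assumes the PHP binary's process name contains "php" (for example php-cgi).

```shell
# Count defunct ("Z" state) processes whose command name contains "php".
# Reads "STAT COMMAND" lines on stdin so it can be fed from ps.
count_zombies() {
    awk '$1 ~ /^Z/ && $2 ~ /php/ { n++ } END { print n + 0 }'
}

# Usage:
#   ps -eo stat=,comm= | count_zombies
```

Running it from cron or under `watch` gives a rough before/after comparison when the timeouts are changed.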

I would also advise removing all override directives (such as PHP_FCGI_MAX_REQUESTS) from your php.fcgi script, as they can cause problems with your system. Try to manage as much as possible from the primary FastCGI configuration in Apache.



Answer 2:

We went with Nginx + PHP-FPM (http://php-fpm.org/) instead.

Try "strace -p <pid>" on one of the hung PHP processes to see which system call it is blocked in.
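A minimal sketch of how that could look in practice. `trace_pid` is a hypothetical wrapper name, and the process name php-cgi in the usage comment is an assumption; your PHP binary may be named differently.

```shell
# Attach strace to a running process.
# -f follows child processes, -tt prefixes each line with a timestamp.
trace_pid() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: trace_pid <pid>" >&2
            return 2
            ;;
    esac
    strace -f -tt -p "$1"
}

# Usage (as root or as the user that owns the process):
#   pgrep php-cgi      # list candidate PIDs
#   trace_pid 12345    # attach to one of them; Ctrl-C detaches
```

If the trace shows no new system calls after attaching, the last line printed is usually the call the process is stuck in.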

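I also saw lock-ups happen when some PHP software was trying to request a file from the same server it was running on (file_get_contents('http://localhost...')). Each such request needs a second PHP process while the first one still holds its slot, so a full process pool can end up waiting on itself.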
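One rough way to look for such self-requests is to grep the code base. This is a sketch only: `find_self_requests` is a hypothetical helper, it assumes a grep that supports `-r`, and it catches only literal localhost or 127.0.0.1 URLs passed directly to file_get_contents, not URLs built from variables or requests made through other functions.

```shell
# List lines where file_get_contents() is called with a literal
# http(s)://localhost or http(s)://127.0.0.1 URL.
# $1 is the directory to search (defaults to the current directory).
find_self_requests() {
    grep -rnE "file_get_contents\([[:space:]]*['\"]https?://(localhost|127\.0\.0\.1)" "${1:-.}"
}

# Usage:
#   find_self_requests /var/www
```

Matches are printed as file:line:content; no output means nothing matched this particular pattern.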