PHP-FPM pool status of process last-request-cpu

Posted 2019-07-23 00:43

Question:

I have installed PHP and enabled FPM, but I am unsure how to interpret some of the FPM status data (such as the per-process last-request-cpu value). My php-fpm.conf is below.

[www]
; Unix user/group of processes
user = www-data
group = www-data

; Chdir to this directory at the start.
chdir = /

; The address on which to accept FastCGI requests.
listen = /var/run/phpfpm/$pool_php5-fpm.sock

; Set listen(2) backlog. A value of '-1' means unlimited.
listen.backlog = -1

; Set permissions for unix socket.
listen.mode = 0666

; Pool configuration.
pm = dynamic
pm.max_children = 10
pm.start_servers = 4
pm.min_spare_servers = 2
pm.max_spare_servers = 6
pm.max_requests = 500

; The URI to view the FPM status page.
pm.status_path = /status

; The ping URI to call the monitoring page of FPM.
ping.path = /ping

; The access log file.
access.log = /var/log/phpfpm/$pool_php-fpm.access.log

; The access log format.
access.format = %R - %u %t "%m %r%Q%q" %s %f %{mili}d %{kilo}M %C%%

; The log file for slow requests.
slowlog = /var/log/phpfpm/$pool_php-fpm.log.slow

; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
request_slowlog_timeout = 5

; Limits the extensions of the main script FPM will allow to parse.
security.limit_extensions = .php

I have enabled pm.status_path = /status to view the FPM status page, which returns the following:

<?xml version="1.0" ?>
<status>
<pool>www</pool>
<process-manager>dynamic</process-manager>
<start-time>1418352728</start-time>
<start-since>21936</start-since>
<accepted-conn>20</accepted-conn>
<listen-queue>0</listen-queue>
<max-listen-queue>0</max-listen-queue>
<listen-queue-len>0</listen-queue-len>
<idle-processes>3</idle-processes>
<active-processes>1</active-processes>
<total-processes>4</total-processes>
<max-active-processes>1</max-active-processes>
<max-children-reached>0</max-children-reached>
<slow-requests>0</slow-requests>
<processes>
<process>
    <pid>11</pid>
    <state>Idle</state>
    <start-time>1418352728</start-time>
    <start-since>21936</start-since>
    <requests>5</requests>
    <request-duration>5391</request-duration>
    <request-method>GET</request-method>
    <request-uri>/status?xml&amp;full</request-uri>
    <content-length>0</content-length>
    <user>-</user><script>-</script>
    <last-request-cpu>0.00</last-request-cpu>
    <last-request-memory>262144</last-request-memory>
</process>
<process>
    <pid>12</pid>
    <state>Idle</state>
    <start-time>1418352728</start-time>
    <start-since>21936</start-since>
    <requests>5</requests>
    <request-duration>3365</request-duration>
    <request-method>GET</request-method>
    <request-uri>/status?xml&amp;full</request-uri>
    <content-length>0</content-length>
    <user>-</user><script>-</script>
    <last-request-cpu>297.18</last-request-cpu>
    <last-request-memory>262144</last-request-memory>
</process>
</processes>
</status>

I don't know why the last-request-cpu value of 297.18 is greater than 100. I would also like to know how to use this value for monitoring. Thanks.

Answer 1:

The metric tells you what percentage of the available CPU time the last request used: the CPU time charged to the request divided by the request's elapsed (wall-clock) time.

CPU time (or process time) is the amount of time for which a central processing unit (CPU) was used for processing instructions of a computer program or operating system, as opposed to, for example, waiting for input/output (I/O) operations or entering low-power (idle) mode. The CPU time is measured in clock ticks or seconds.

So it is not measured in milliseconds as suggested elsewhere on this page.

You can see the implementation at

  • http://lxr.php.net/xref/PHP_7_0/sapi/fpm/fpm/fpm_status.c#430

The relevant part is (reformatted for readability):

    if (proc.cpu_duration.tv_sec == 0 && proc.cpu_duration.tv_usec == 0) {
        cpu = 0.;
    } else {
        cpu = (proc.last_request_cpu.tms_utime
             + proc.last_request_cpu.tms_stime
             + proc.last_request_cpu.tms_cutime
             + proc.last_request_cpu.tms_cstime)
             / fpm_scoreboard_get_tick()
             / (proc.cpu_duration.tv_sec
              + proc.cpu_duration.tv_usec / 1000000.)
             * 100.;
    }

proc.last_request_cpu is a struct tms, whose members are defined as:

  • The tms_utime structure member is the CPU time charged for the execution of user instructions of the calling process.
  • The tms_stime structure member is the CPU time charged for execution by the system on behalf of the calling process.
  • The tms_cutime structure member is the sum of the tms_utime and tms_cutime times of the child processes.
  • The tms_cstime structure member is the sum of the tms_stime and tms_cstime times of the child processes.

So we are adding up every kind of CPU time charged during the last request. All of these times are measured in clock ticks.
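As a rough illustration of the same four counters, here is a minimal Python sketch. It assumes a POSIX system and uses os.times(), which exposes the user, system, children_user and children_system fields; note that Python reports them in seconds rather than raw ticks. The helper name cpu_seconds_charged is made up for this example and is not part of PHP-FPM.

```python
import os


def cpu_seconds_charged(before, after):
    """Sum user, system and waited-for child CPU time between two os.times() snapshots."""
    fields = ("user", "system", "children_user", "children_system")
    return sum(getattr(after, name) - getattr(before, name) for name in fields)


before = os.times()
# Burn a little CPU so there is something to measure.
checksum = sum(i * i for i in range(200_000))
after = os.times()

charged = cpu_seconds_charged(before, after)
print(f"CPU seconds charged to this process and its children: {charged:.4f}")
```

The child fields only grow once a child process has terminated and been waited for, so a long-running child does not show up until it exits.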

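The fpm_scoreboard_get_tick function simply returns the number of clock ticks per second, i.e. the unit in which the tms values above are counted. On POSIX systems this is normally the value of sysconf(_SC_CLK_TCK), commonly 100. It says nothing about how many instructions the CPU can execute per second, and it is the same no matter how many cores you have.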
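To see the tick rate on your own machine, the sketch below may help. It assumes that the rate FPM uses corresponds to the POSIX clock-tick rate, sysconf(_SC_CLK_TCK), and that you are on a Unix-like system where Python's os.sysconf knows the name SC_CLK_TCK. The helper ticks_to_seconds is a hypothetical name used only here.

```python
import os

# SC_CLK_TCK is the POSIX name for the number of clock ticks per second
# used by times(2). Assumption: this is the same rate FPM divides by.
ticks_per_second = os.sysconf("SC_CLK_TCK")


def ticks_to_seconds(ticks, tick_rate=ticks_per_second):
    """Convert a tick count, as found in struct tms, into seconds of CPU time."""
    return ticks / tick_rate


print(f"clock ticks per second on this system: {ticks_per_second}")
print(f"350 ticks at 100 ticks/s = {ticks_to_seconds(350, tick_rate=100)} s")
```

With a rate of 100, one tick represents 10 ms of CPU time, so anything shorter than that is either rounded away or charged as a whole tick.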

proc.cpu_duration is a struct timeval, whose members are defined as:

  • time_t tv_sec: This represents the number of whole seconds of elapsed time.
  • long int tv_usec: This is the rest of the elapsed time (a fraction of a second), represented as the number of microseconds. It is always less than one million.

Together they give the elapsed time in seconds, including the fractional part to microsecond precision, e.g. something like 2.456435.
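The conversion from the two timeval fields to a single number of seconds can be sketched as follows. The function name timeval_to_seconds is invented for this illustration; the arithmetic mirrors the tv_sec + tv_usec / 1000000. term in the C code above.

```python
def timeval_to_seconds(tv_sec, tv_usec):
    """Combine whole seconds and microseconds into fractional seconds."""
    if not 0 <= tv_usec < 1_000_000:
        # By definition tv_usec is always less than one million.
        raise ValueError("tv_usec must be in the range 0..999999")
    return tv_sec + tv_usec / 1_000_000


elapsed = timeval_to_seconds(2, 456435)
print(f"{elapsed:.6f}")  # prints 2.456435
```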

The value is then multiplied by 100 to get the percentage value.

Example:

Assume our last request burned a total of 350 ticks in 5 seconds. We also assume that the tick rate is 100 ticks per second. If we put these numbers into the equation above, we get

 (350 / 100 / 5) * 100 = 70

This means the last request kept the CPU busy for 70% of its 5-second duration, i.e. it used 70% of the CPU time one core had available in that period.
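The arithmetic of the example can be checked with a small Python transcription of the C expression. This is only a sketch for experimenting with numbers: the function name last_request_cpu_percent and its parameters are chosen for this illustration, and the duration is passed in as seconds rather than as a timeval.

```python
def last_request_cpu_percent(cpu_ticks, ticks_per_second, duration_seconds):
    """CPU time of a request as a percentage of its elapsed time."""
    if duration_seconds == 0:
        # Same guard as the C code: no measurable duration reports 0.
        return 0.0
    return cpu_ticks / ticks_per_second / duration_seconds * 100.0


# Numbers from the example: 350 ticks, 100 ticks per second, 5 seconds.
print(last_request_cpu_percent(350, 100, 5))  # prints 70.0
```

Nothing in the formula caps the result at 100: whenever the ticks charged amount to more CPU time than the elapsed time, the percentage goes above 100.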

Values above 100% are possible because the ticks-per-second value does not depend on the number of cores you have, whereas the proc.last_request_cpu values add up the ticks charged to the worker and to its child processes. For example, an external command that the PHP code launches runs in another process, possibly on another core, but its CPU time is still a direct consequence of the code PHP executes, so it is taken into account here.
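For monitoring, it can help to turn the percentage back into ticks. The Python sketch below parses a cut-down copy of the status XML from the question and inverts the formula. It rests on two assumptions that are not confirmed in this thread: that request-duration is reported in microseconds and that the tick rate is 100 per second. The function name implied_ticks is hypothetical.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the status output from the question.
STATUS_XML = """<status>
<processes>
<process>
    <pid>11</pid>
    <request-duration>5391</request-duration>
    <last-request-cpu>0.00</last-request-cpu>
</process>
<process>
    <pid>12</pid>
    <request-duration>3365</request-duration>
    <last-request-cpu>297.18</last-request-cpu>
</process>
</processes>
</status>"""

ASSUMED_TICKS_PER_SECOND = 100  # assumption, not read from the server


def implied_ticks(status_xml, ticks_per_second=ASSUMED_TICKS_PER_SECOND):
    """Map each pid to the tick count its last-request-cpu value implies."""
    result = {}
    for proc in ET.fromstring(status_xml).iter("process"):
        pid = int(proc.findtext("pid"))
        # Assumption: request-duration is given in microseconds.
        duration_s = int(proc.findtext("request-duration")) / 1_000_000
        cpu_percent = float(proc.findtext("last-request-cpu"))
        result[pid] = cpu_percent / 100 * duration_s * ticks_per_second
    return result


for pid, ticks in implied_ticks(STATUS_XML).items():
    print(f"pid {pid}: about {ticks:.2f} ticks")
```

Under those assumptions, the 297.18 reported for pid 12 corresponds to roughly one tick (10 ms of CPU time) charged against a request that lasted only 3.365 ms. So for very short requests the value is dominated by tick granularity, and a monitoring check is probably better off averaging over many requests, or ignoring requests shorter than a few ticks, than alerting on a single reading.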