I am using the Nagios check_logwarn plugin to detect new entries in log files.
To test my setup, I have been manually appending the following log line to the log file in question -
[Mon Mar 20 14:24:31 2017] [hphp] [12082:7f238d3ff700:32:000001] []
\nFatal error: entire web request took longer than 10 seconds and timed out in /var/cake_1.2.0.6311-beta
app/webroot/openx/www/delivery/postGetAd.php on line 483
The above should be caught by the following Nagios command, because it contains the keyword "Fatal" -
/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_hiphop_error -p /mnt/log/hiphop/error_`(date +'%Y%m%d')`.log "^.*Fatal*"
Output (as expected) -
Log errors: \nFatal error: entire web request took longer than 10 seconds and timed out in /var/cake_1.2.0.6311-beta
\nFatal error: entire web request took longer than 10 seconds and timed out in /var/cake_1.2.0.6311-beta
Running this command directly works (case 1), but invoking the same command through a PHP exec call, triggered by a Jenkins project, does not catch the error (case 2).
Following is the PHP code of case 2 -
$errorLogCommand = '/usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_hiphop_error -p /mnt/log/hiphop/error_'.$date.'.log "^.*Fatal*"';
$output = exec($errorLogCommand);
file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." Checked error key words in error_".$date.".log. command -> ".$errorLogCommand, FILE_APPEND);
if($output!="OK: No log errors found")
{
    file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." - Hiphop errors -> ".$output, FILE_APPEND);
    $failure=true;
    break;
}
else
{
    file_put_contents('/var/cake_1.2.0.6311-beta/deployment/deployment.log', "\n ".date("Y-m-d H:i:s")." - No Error found -> ".$output, FILE_APPEND);
}
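A side note on the check itself: PHP's exec() returns only the last line of the command's output, and it accepts two optional by-reference arguments that receive all output lines and the exit status. Nagios plugins conventionally report their result through that exit status (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), which may be a sturdier thing to test than an exact output string. The following is a minimal shell sketch of that idea, not the code used in the hook; `run_plugin` is a hypothetical helper name, and the `sh -c` command is only a stand-in for the real check_logwarn call.

```shell
#!/bin/sh
# Hypothetical helper: run a plugin command, keep all of its output
# (stdout and stderr), and map the exit status to a Nagios state name.
run_plugin() {
    plugin_output=$("$@" 2>&1)
    plugin_status=$?
    case "$plugin_status" in
        0) plugin_state=OK ;;
        1) plugin_state=WARNING ;;
        2) plugin_state=CRITICAL ;;
        *) plugin_state=UNKNOWN ;;
    esac
}

# Stand-in for the real check_logwarn invocation (assumption for this demo).
run_plugin sh -c 'echo "Log errors: example line"; exit 2'
printf '%s (exit %s): %s\n' "$plugin_state" "$plugin_status" "$plugin_output"
```

Capturing stderr as well means that a message such as a permission error ends up in the deployment log instead of being lost.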
Following is the output -
2017-03-20 14:16:45 Checked error key words in error_20170320.log. command -> /usr/local/nagios/libexec/check_logwarn -d /tmp/logwarn_hiphop_error -p /mnt/log/hiphop/error_20170320.log "Fatal"
2017-03-20 14:16:45 - No Error found -> OK: No log errors found
Note that with the same Nagios command (/usr/local/nagios/libexec/check_logwarn) as in case 1, the log error is unexpectedly not detected in this case.
Following are my observations of the contents of the internal tracker file which the plugin generates, /tmp/logwarn_hiphop_error/mnt_log_hiphop_error_20170320.log -
When the error is detected in case 1, the file changes as follows -
Before running command
# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="110"
POSITION="111627"
MATCHING="true"
After running command
# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="116"
POSITION="112087"
MATCHING="false"
Also, following are the changes to the same file in case 2 -
Before running php file
# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="102"
POSITION="109329"
MATCHING="true"
After
# logwarn 1.0.10 state for "/mnt/log/hiphop/error_20170320.log"
INODENUM="1208110246"
LINENUM="110"
POSITION="111627"
MATCHING="true"
I am not sure why the MATCHING parameter remains true in case 2, whereas in case 1 it becomes false, even though case 1 is the one in which the error was actually matched.
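To compare the two cases more systematically, the tracker file can be snapshotted before a run and diffed afterwards. The following is a minimal shell sketch under stated assumptions: `diff_state` is a hypothetical helper, and the demo uses a throwaway temporary file and a stand-in command rather than the real tracker file and check_logwarn.

```shell
#!/bin/sh
# Hypothetical helper: diff_state STATE_FILE COMMAND [ARGS...]
# Copies the state file, runs the command, then prints what changed.
diff_state() {
    state_file=$1
    shift
    snapshot=$(mktemp) || return 1
    cp "$state_file" "$snapshot" || { rm -f "$snapshot"; return 1; }
    "$@" >/dev/null 2>&1          # the command's own output is not needed here
    diff "$snapshot" "$state_file"
    diff_status=$?                # 0 = unchanged, 1 = changed
    rm -f "$snapshot"
    return "$diff_status"
}

# Demo with a throwaway file; the values mirror the case 2 listing.
demo=$(mktemp)
printf 'LINENUM="102"\nMATCHING="true"\n' > "$demo"
# The stand-in command rewrites the file the way a plugin run might.
changes=$(diff_state "$demo" sh -c "printf 'LINENUM=\"110\"\nMATCHING=\"true\"\n' > '$demo'")
printf '%s\n' "$changes"
rm -f "$demo"
```

In real use the first argument would be the tracker file path and the remaining arguments the full check_logwarn command, run once manually and once from the Jenkins-triggered hook.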
Update
I tried wrapping the command in an escapeshellcmd call, to ensure that the regex is not being stripped out -
$output = exec(escapeshellcmd($errorLogCommand));
but still no change in output.
Update 2
I found that the log line I was adding manually contained line breaks. Removing them fixed the problem consistently when running the PHP file from the command line. However, the problem is still consistently reproducible in case 2, where I trigger the project via Jenkins and this file gets called in one of the AWS CodeDeploy hooks.
So this is not solved yet: manual invocation of the PHP file now works, but invocation via Jenkins still fails consistently.
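Since the same command behaves differently only when triggered through Jenkins, one thing worth ruling out is that the hook runs as a different user or with different permissions, so that the log file is unreadable or the state directory is not writable. This is only a guess at a possible cause, not a confirmed diagnosis. A minimal shell sketch, with `check_access` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: check_access LOG_FILE STATE_DIR
# Prints who is running and whether both paths are usable;
# returns non-zero if a problem was found.
check_access() {
    log_file=$1
    state_dir=$2
    problems=0
    echo "user: $(id -un)  cwd: $(pwd)"
    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file"
        problems=1
    fi
    if [ ! -d "$state_dir" ] || [ ! -w "$state_dir" ]; then
        echo "state directory missing or not writable: $state_dir"
        problems=1
    fi
    return "$problems"
}

# Paths from the question; run this both manually and from the hook.
check_access "/mnt/log/hiphop/error_$(date +%Y%m%d).log" /tmp/logwarn_hiphop_error \
    || echo "access problem detected"
```

If the two runs print different users or different results, that difference is a more likely explanation than the pattern itself.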
The logwarn documentation mentions support for a negative checking expression.
Please try prepending an exclamation mark (!) to the pattern string to exclude rather than include these matches.