Why does this error handling function cause loadHTML() warnings?

Posted 2019-09-02 09:27

Question:

I include this simple error handling function to format errors:

date_default_timezone_set('America/New_York');

// Create the error handler.
function my_error_handler ($e_number, $e_message, $e_file, $e_line, $e_vars) {

    // Build the error message.
    $message = "An error occurred in script '$e_file' on line $e_line: \n<br />$e_message\n<br />";

    // Add the date and time.
    $message .= "Date/Time: " . date('n-j-Y H:i:s') . "\n<br />";

    // Append $e_vars to the $message.
    $message .= "<pre>" . print_r ($e_vars, 1) . "</pre>\n<br />";

    echo '<div id="Error">' . $message . '</div><br />';

} // End of my_error_handler() definition.

// Use my error handler.
set_error_handler ('my_error_handler');

When I include it in a script with the following

$dom = new DOMDocument();
$dom->loadHTML($output);
$xpath = new DOMXPath($dom);

and parse a web page (in this case, http://www.ssense.com/women/designers/all/all/page_1, which I do have permission to parse) I get errors like

AN ERROR OCCURRED IN SCRIPT '/HSPHERE/LOCAL/HOME/SITE.COM/SCRIPT.PHP' ON LINE 59: 
DOMDOCUMENT::LOADHTML(): HTMLPARSEENTITYREF: NO NAME IN ENTITY, LINE: 57

and

AN ERROR OCCURRED IN SCRIPT '/HSPHERE/LOCAL/HOME/SITE.COM/SCRIPT.PHP' ON LINE 59: 
DOMDOCUMENT::LOADHTML(): TAG NAV INVALID IN ENTITY, LINE: 58

There are many errors and the page never finishes loading. However, if I do not include this error handler, the line

$dom->loadHTML($output);

does not throw any errors, and I get the results I expect in a few seconds. I assume the error handler is catching warnings related to loadHTML() that are not otherwise reported. (Even if I use

@$dom->loadHTML($output);

it still reports the errors.) How might I modify the error handler to accommodate calls to loadHTML(), or otherwise fix this problem?

Answer 1:

It's not the custom error handler that is causing the error.

I ran the following code without a custom error handler:

$output = file_get_contents("http://www.ssense.com/women/designers/all/all/page_1");
$dom = new DOMDocument();
$dom->loadHTML($output);
$xpath = new DOMXPath($dom);

When I ran it, I got a ton of warning messages similar to the ones in your error handler.

I think the problem you're seeing is just that your error handler is reporting errors that PHP isn't reporting by default.

By default, the level of error reporting is determined by your php.ini settings, but can be overridden by using the error_reporting() function. When you set your own error handler, you have to determine for yourself what level of reporting you want to deal with. Your error handler will be called on every error and notice, and so you will output error messages for everything unless you explicitly check the error being generated against the current error_reporting() level.
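One way to limit what reaches a custom handler is the optional second argument of set_error_handler(), a bitmask of the levels the handler should receive. The following is a minimal sketch rather than the asker's code: the $seen array, the collecting closure, and the choice to exclude E_USER_NOTICE are all assumptions made for illustration.

```php
<?php
// Hypothetical collector: records messages instead of echoing them.
$seen = [];
$handler = function ($e_number, $e_message, $e_file, $e_line) use (&$seen) {
    $seen[] = $e_message;
    return true; // Tell PHP the error was handled.
};

// Report everything except user notices (an arbitrary choice for this demo).
error_reporting(E_ALL & ~E_USER_NOTICE);

// Register the handler only for the levels that are currently enabled.
set_error_handler($handler, error_reporting());

trigger_error('a warning', E_USER_WARNING); // Passed to the handler.
trigger_error('a notice', E_USER_NOTICE);   // Never reaches the handler.
```

The mask is fixed when the handler is registered, so later changes to error_reporting(), including the temporary change made by @, do not affect it. Handling those cases still requires a check inside the handler itself.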

Remember that the @ error suppression operator is essentially shorthand for lowering error_reporting() for that expression (to 0 before PHP 8.0; since PHP 8.0, fatal error levels remain enabled). For example, this line:

@$dom->loadHTML($output);

Is simply shorthand for the following:

$errorLevel = error_reporting(0);
$dom->loadHTML($output);
error_reporting($errorLevel);

Because a custom handler is called regardless of the current error_reporting() level, the @ operator has no effect on its own. Your handler has to check the current error_reporting() level itself and act accordingly, for example:

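function my_error_handler($e_number, $e_message, $e_file, $e_line) {
  // Do nothing when this error level is excluded by error_reporting(),
  // which includes calls suppressed with @.
  if (!(error_reporting() & $e_number)) {
    return;
  }

  // normal error handling here
}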
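To see the effect of such a check, here is a small self-contained sketch that uses trigger_error() in place of loadHTML(). The $reported array and the closure are hypothetical names for this demo. It tests the level with a bitmask instead of comparing against 0, because since PHP 8.0 error_reporting() no longer returns 0 inside a handler during an @-suppressed call.

```php
<?php
$reported = [];

$handler = function ($e_number, $e_message, $e_file, $e_line) use (&$reported) {
    // Skip any level that error_reporting() currently excludes. This also
    // covers @, which lowers the level for the duration of the expression.
    if (!(error_reporting() & $e_number)) {
        return true;
    }
    $reported[] = $e_message;
    return true;
};

error_reporting(E_ALL);
set_error_handler($handler);

trigger_error('visible', E_USER_WARNING);     // Recorded by the handler.
@trigger_error('suppressed', E_USER_WARNING); // Ignored by the handler.
```

With a check like this in place, a call such as @$dom->loadHTML($output) should stop producing handler output, while unsuppressed warnings are still reported.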

My assumption is that, when you are not using a custom error handler, PHP is simply defaulting to an error_reporting() level that does not include the warnings being produced.

If you add error_reporting(E_ALL | E_STRICT); to the top of your code, you will see those same errors even when you don't have your custom error handler enabled.



Answer 2:

The web page you're loading contains many errors. For instance, it uses a bare & instead of the &amp; entity in the HTML.

PHP DOM uses libxml, so to suppress these parsing warnings, insert this line before calling loadHTML():

libxml_use_internal_errors(true);

You can later get a list of the parsing errors with libxml_get_errors().
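Put together, the approach might look like the sketch below. The sample markup is an assumption standing in for the fetched page, and the loop that prints each error is only one way to inspect the collected list.

```php
<?php
// Sample markup with a bare "&"; a stand-in for the fetched page.
$output = '<html><body><p>Tom & Jerry</p></body></html>';

// Collect libxml errors internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($output);

// Inspect whatever the parser complained about, then empty the buffer.
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}
libxml_clear_errors();

// Restore the previous setting so other code is unaffected.
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);
$paragraphs = $xpath->query('//p');
```

Because the warnings never become PHP errors, a custom error handler is not invoked for them at all. Calling libxml_clear_errors() keeps the internal buffer from growing when many pages are parsed in one run.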