DOMDocument PHP Memory Leak

Posted 2019-01-12 00:21

I'm running PHP 5.3.6 under MAMP on a Mac. Memory usage increases every x calls (somewhere between 3 and 8) until the script dies from memory exhaustion. How do I fix this?

libxml_use_internal_errors(true);
while(true){
 $dom = new DOMDocument();
 $dom->loadHTML(file_get_contents('http://www.ebay.com/'));
 unset($dom);
 echo memory_get_peak_usage(true) . '<br>'; flush();
}

4 Answers
唯我独甜
#2 · 2019-01-12 00:56

Testing your script locally produces the same result. Pointing file_get_contents() at a local HTML file, however, produces consistent memory usage. It could be that the output from ebay.com changes every X calls.

乱世女痞
#3 · 2019-01-12 01:05

Based on @Tak's answer and @FrancisAvila's comment, I found that this snippet works better for me:

while (true)
{
    $dom = new DOMDocument();

    if (libxml_use_internal_errors(true) === true) // previous setting was true?
    {
        libxml_clear_errors();
    }

    $dom->loadHTML(file_get_contents('ebay.html'));
}

print_r(libxml_get_errors()); // errors from the last iteration are accessible

This has the added benefits of 1) not discarding the errors of the last parse if you ever need to access them via libxml_get_errors(), and 2) calling libxml_clear_errors() only when necessary, since libxml_use_internal_errors() returns the previous setting state.
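
For what it's worth, the return-value behaviour this relies on is easy to check in isolation. A minimal sketch, with variable names chosen only for illustration:

```php
<?php
// Start from a known state: internal error handling off.
libxml_use_internal_errors(false);

// The call returns the PREVIOUS setting, which was false here.
$first = libxml_use_internal_errors(true);

// Internal errors were enabled by the call above, so this returns true.
$second = libxml_use_internal_errors(true);

var_dump($first, $second);
```

In the loop above, that means libxml_clear_errors() is skipped on the first pass (nothing has been collected yet) and runs on every later pass, just before the next parse.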

啃猪蹄的小仙女
#4 · 2019-01-12 01:06

Using libxml_use_internal_errors(true); suppresses error output, but libxml then keeps an internal error log that grows on every loop iteration. Either disable the internal logging and suppress the PHP warnings some other way, or clear the internal log on each iteration, like this:

<?php
libxml_use_internal_errors(true);
while(true){
 $dom = new DOMDocument();
 $dom->loadHTML(file_get_contents('ebay.html'));
 unset($dom);
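 // Turning internal errors off discards the accumulated error log;
 // turning them back on keeps later warnings suppressed.
 libxml_use_internal_errors(false);
 libxml_use_internal_errors(true);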
 echo memory_get_peak_usage(true) . "\r\n"; flush();
}
?>
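
To see the accumulation directly, you can count the buffered errors after each parse. This is only a sketch: the markup in $html is a made-up malformed snippet (an assumption, not the eBay page), and how many errors libxml reports for it depends on the libxml2 version, so no particular numbers are claimed.

```php
<?php
// Hypothetical, deliberately malformed markup used only to provoke
// parse errors; the error count per parse depends on the libxml2 version.
$html = '<html><body><div><p>text</span></div></nosuchtag></body></html>';

libxml_use_internal_errors(true);
libxml_clear_errors(); // start with an empty buffer

$counts = array();
for ($i = 0; $i < 3; $i++) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    unset($dom);

    // Errors from earlier iterations are still in the buffer,
    // so this total keeps climbing while nothing clears it.
    $counts[] = count(libxml_get_errors());
}

libxml_clear_errors();
$afterClear = count(libxml_get_errors());

echo implode(', ', $counts) . ' then ' . $afterClear . " after clearing\n";
```

If the buffer is the culprit, the counts should climb by the same amount on every pass and fall back to zero after libxml_clear_errors().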
乱世女痞
#5 · 2019-01-12 01:09

You can try forcing the garbage collector to run with gc_collect_cycles(), but otherwise you're out of luck. PHP doesn't expose much of anything to control its internal memory usage, let alone memory used by a plugin library.
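
A rough sketch of what that would look like, in case you want to measure it yourself. The inline $html string is a placeholder I made up so the example does not depend on the network, and the loop is capped at five passes:

```php
<?php
// Placeholder document (an assumption for this sketch, not the eBay page).
$html = '<html><body><p>Hello</p></body></html>';

gc_enable(); // make sure the cycle collector is switched on

$readings = array();
for ($i = 0; $i < 5; $i++) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    unset($dom);

    // Force a collection pass; the return value is the number of
    // reference cycles that were collected.
    $collected = gc_collect_cycles();

    // Current (not peak) usage shows whether memory is actually released.
    $readings[] = memory_get_usage();
    echo $readings[$i] . " bytes, " . $collected . " cycles collected\n";
}
```

Note that this only reclaims PHP values caught in reference cycles, so whether it makes any difference here is something to measure rather than assume.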
