PHP Simple HTML Dom Memory Issue

Posted 2020-07-13 07:30

Question:

I'm running into memory issues with PHP Simple HTML DOM Parser. I'm parsing a fair-sized document and need to walk down the DOM tree...

1) I'm starting with the whole file:

$html = file_get_html($file);

2) then parsing out my table:

$table = $html->find('table.big'); 

3) then parsing out my rows:

$rows = $table[0]->find('tr');

What I'm ending up with are three GIANT objects... does anyone know how to discard an object once I've parsed it for the data I need? For example, $html is useless by step 3, yet it's the largest of all the objects.

Any ideas?

Is there a way to drill down to my table rows out of the original $html object?

Thanks in advance.

EDIT:

I've managed to skip step two with:

$rows = $this->html->find('table.big tr');

But am still running into memory issues...

Answer 1:

If memory is really a big concern, you may want to look into SAX instead of using DOM. You could also try unset() on $html after obtaining $table, but that only drops the variable's reference to the object; the memory is released only once nothing else refers to it, so it may not be freed immediately.

At the end of the day, it comes down to how memory-efficient Simple HTML DOM is, or whichever parser implementation you have chosen.
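
As a rough sketch of the SAX approach, PHP's bundled XML parser extension can stream the rows without ever building a tree. This assumes the document is well-formed XHTML; real-world HTML often is not, and the parser will then stop with an error. The inline sample markup and all variable names are made up for illustration, and nested tables are not handled:

```php
<?php
// Assumption: well-formed XHTML, fed to the parser in chunks
// (in practice the chunks would come from fread() on the file).
$chunks = [
    '<html><body><table class="big">',
    '<tr><td>a</td><td>b</td></tr>',
    '<tr><td>c</td><td>d</td></tr>',
    '</table></body></html>',
];

$rows  = [];     // extracted data: plain arrays of strings
$inBig = false;  // true while inside <table class="big">
$row   = null;   // row being built, or null outside a <tr>
$cell  = null;   // cell text being collected, or null outside a <td>

$parser = xml_parser_create('UTF-8');
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Called for every opening tag.
    function ($p, $name, $attrs) use (&$inBig, &$row, &$cell) {
        if ($name === 'table' && isset($attrs['class']) && $attrs['class'] === 'big') {
            $inBig = true;
        } elseif ($inBig && $name === 'tr') {
            $row = [];
        } elseif ($row !== null && $name === 'td') {
            $cell = '';
        }
    },
    // Called for every closing tag.
    function ($p, $name) use (&$rows, &$inBig, &$row, &$cell) {
        if ($name === 'td' && $cell !== null) {
            $row[] = $cell;
            $cell = null;
        } elseif ($name === 'tr' && $row !== null) {
            $rows[] = $row;
            $row = null;
        } elseif ($name === 'table') {
            $inBig = false;
        }
    }
);

// Called for text between tags; only kept while inside a <td>.
xml_set_character_data_handler($parser, function ($p, $text) use (&$cell) {
    if ($cell !== null) {
        $cell .= $text;
    }
});

foreach ($chunks as $i => $chunk) {
    $isFinal = ($i === count($chunks) - 1);
    if (!xml_parse($parser, $chunk, $isFinal)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
}

print_r($rows);
```

Only the current row and cell are held in memory at any time, plus whatever you decide to keep in $rows.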



Answer 2:

I may be a little late to answer, as I joined late, but the answers given above are not correct. unset() only unsets $html itself, not its properties. So to clean up the memory and get rid of the problem:

Use $html->clear();

I think you didn't read the class code before using it. The clear() method releases the memory held by the $html object. It is a built-in method of simple_html_dom and takes effect immediately, so you don't have to wait all day or until the program terminates for it to take effect.
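
To illustrate why an explicit cleanup method can matter, here is a minimal, self-contained sketch. The Node class is a hypothetical stand-in, not Simple HTML DOM's actual code. It assumes the parser's nodes point at each other in both directions (parent to child and back), which is the usual reason unset() alone does not release such a tree right away:

```php
<?php
// Hypothetical stand-in for a DOM node; NOT the library's real class.
class Node
{
    public $parent = null;
    public $children = [];
    public $payload;

    public function __construct($payload)
    {
        $this->payload = $payload;
    }

    public function append(Node $child)
    {
        $child->parent = $this;      // child points at parent
        $this->children[] = $child;  // parent points at child: a reference cycle
    }

    // Break every cycle so plain reference counting can free the tree.
    public function clear()
    {
        foreach ($this->children as $child) {
            $child->clear();
        }
        $this->children = [];
        $this->parent = null;
    }
}

function buildTree($n)
{
    $root = new Node('root');
    for ($i = 0; $i < $n; $i++) {
        $root->append(new Node(str_repeat('x', 1024)));
    }
    return $root;
}

// Switch off the cycle collector to mimic a runtime where cycles
// are not collected promptly (or at all, as in very old PHP versions).
gc_disable();

// Case 1: unset() only. The nodes still reference each other.
$before = memory_get_usage();
$tree = buildTree(2000);
unset($tree);
$leaked = memory_get_usage() - $before;

// Case 2: clear() first, then unset(). Nothing references the nodes any more.
$before = memory_get_usage();
$tree = buildTree(2000);
$tree->clear();
unset($tree);
$afterClear = memory_get_usage() - $before;

echo "still held after unset() only: {$leaked} bytes\n";
echo "still held after clear() + unset(): {$afterClear} bytes\n";
```

With the collector enabled, PHP 5.3 and later will eventually reclaim such cycles on their own, but only when the collector runs, so breaking the references explicitly is the more predictable option inside a long loop.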



Answer 3:

You can increase the memory limit.

ini_set('memory_limit', '64M');

Or clear the memory with this code:

$html->__destruct();
unset($html);
$html = null;


Answer 4:

...how to dump an object after I've parsed it for the data I need? Like $html...

unset($html) ?

Or $html = null; might work better, with a more immediate effect?



Tags: php dom parsing