I need a PHP library that parses HTML content into a DOM tree, like this:
html
|--head
| |---title--title_content
| |---meta--meta_content
|--body
| |---div
| | |--div--div_content
.. etc
It should also repair or clean up invalid HTML.
It is not only for HTML but for any XML-style markup language, basically anything with a parent-child structure.
Simple HTML DOM works great with HTML, even invalid HTML, but I am not sure how it handles XML. If you are looking for XML manipulation, the PHP documentation has a list of libraries.
I've just come across QueryPath on Delicious; it seems quite nice.
Is there any problem with PHP's built-in Document Object Model (DOM) extension? It is sometimes a bit clunky, yes, but it is built right in and runs quite quickly in my experience, whereas Simple HTML DOM is (again, in my experience) prone to lots of snags and slowdowns.
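For what it's worth, here is a minimal sketch of the built-in DOM extension doing roughly what the question asks: loading sloppy HTML and walking the resulting parent-child tree. The `outlineTree()` helper and the sample markup are made up for illustration. How the broken markup gets repaired depends on the underlying libxml2 parser, so treat the exact tree shape as an assumption rather than a guarantee.

```php
<?php
// Hypothetical helper (not part of the DOM extension): returns an indented
// outline of the element names below $node, one element per line.
function outlineTree(DOMNode $node, int $depth = 0): string
{
    $out = '';
    if ($node instanceof DOMElement) {
        $out .= str_repeat('  ', $depth) . $node->nodeName . "\n";
        $depth++;
    }
    // Guard against node types that have no child list.
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            $out .= outlineTree($child, $depth);
        }
    }
    return $out;
}

// Sample input with an unclosed <p> and an unclosed outer <div>.
$html = '<html><head><title>Demo</title></head>'
      . '<body><div><div>inner<p>unclosed</div></body></html>';

libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);              // lenient HTML parser; tries to repair the markup
libxml_clear_errors();

$outline = outlineTree($doc);
echo $outline;

// Serialize the repaired document back to markup.
$cleaned = $doc->saveHTML();
```

For well-formed XML, `DOMDocument::loadXML()` builds the same kind of tree, so the same walker should work there too. Note that the XML parser is strict by default and will not repair broken input the way `loadHTML()` attempts to.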