I want to parse some HTML in order to find the values of some attributes/tags etc.
What HTML parsers do you recommend? Any pros and cons?
I want to parse some HTML in order to find the values of some attributes/tags etc.
What HTML parsers do you recommend? Any pros and cons?
NekoHTML, TagSoup, and JTidy will allow you to parse HTML and then process with XML tools, like XPath.
Do you need to do a full parse of the HTML? If you're just looking for specific values within the contents (a specific tag/param), then a simple regular expression might be enough, and could very well be faster.
I have tried HTML Parser which is dead simple.