Using the Python documentation, I found the HTML parser, but I have no idea which module to import to use it. How do I find this out, bearing in mind that it doesn't say on the page?
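There's a link to an example at the bottom of the page (http://docs.python.org/2/library/htmlparser.html). It just doesn't work as written under Python 3: that page documents Python 2, as it says at the top. In Python 2 the module is called HTMLParser (from HTMLParser import HTMLParser); in Python 3 it was renamed to html.parser (from html.parser import HTMLParser).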
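To make the import question concrete, here is a minimal sketch that loads the standard-library parser under either version. It assumes the module is named html.parser on Python 3 and HTMLParser on Python 2; the class LinkCollector and the function collect_links are hypothetical names chosen for illustration, not something from the documentation page.

```python
# Try the Python 3 module name first, then fall back to the Python 2 name.
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class LinkCollector(HTMLParser):
    """Hypothetical subclass that records the href of every <a> tag."""

    def __init__(self):
        # Explicit base-class call so this also works with Python 2,
        # where HTMLParser is an old-style class.
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def collect_links(markup):
    """Feed markup to the parser and return the collected href values."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links
```

Calling collect_links on a string of HTML returns the href values in document order; anchors without an href attribute are skipped.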
I don't recommend BeautifulSoup if you want speed. lxml is much faster, and if its default parser can't handle a document, you can fall back on lxml's BeautifulSoup-based parser, lxml.html.soupparser.