What's the best way to validate that a document conforms to some version of HTML (preferably one that I can specify)? I'd like to know where the failures occur, as a web-based validator would report them, but from a native Python app.
I think the most elegant way is to invoke the W3C Validation Service programmatically. Few people know that you do not have to screen-scrape the output to get the results, because the service returns non-standard HTTP headers that indicate the validity and the number of errors and warnings.
For instance, the command line
returns
Thus, you can elegantly invoke the W3C Validation Service and extract the results from the HTTP header:
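As a rough sketch of that idea, the snippet below reads the result from the response headers instead of parsing the HTML report. The header names (`X-W3C-Validator-Status`, `X-W3C-Validator-Errors`, `X-W3C-Validator-Warnings`), the `uri` query parameter, and the helper names `summarize_headers` and `check_page` are assumptions for illustration, so check them against a real response from the service before relying on them. The service address is passed in by the caller.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed header names; confirm them against a real response from the service.
STATUS_HEADER = "X-W3C-Validator-Status"
ERRORS_HEADER = "X-W3C-Validator-Errors"
WARNINGS_HEADER = "X-W3C-Validator-Warnings"


def summarize_headers(headers):
    """Turn validator response headers into a small result dict."""
    status = headers.get(STATUS_HEADER)
    return {
        "status": status,
        "valid": status == "Valid",
        # A missing count header is treated as zero.
        "errors": int(headers.get(ERRORS_HEADER) or 0),
        "warnings": int(headers.get(WARNINGS_HEADER) or 0),
    }


def check_page(validator_url, page_url, timeout=30):
    """Ask the validator at validator_url to check page_url (needs network)."""
    # The "uri" query parameter is an assumption about the service's interface.
    query = urlencode({"uri": page_url})
    # A HEAD request is enough because only the headers are inspected.
    request = Request(validator_url + "?" + query, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return summarize_headers(response.headers)
```

The headers only give counts, so finding where each failure occurs would still require the full report.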
This is a very basic HTML validator based on lxml's HTMLParser. It does not require an internet connection.
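A minimal sketch of such a validator might look like the following. It assumes the third-party `lxml` package is installed; the function name `find_html_errors` is made up for illustration. It parses with `recover=False` and then reports whatever the parser recorded in its error log, including line and column numbers. Which problems get reported depends on the underlying libxml2 version, so treat the result as a rough check, not as full conformance validation.

```python
from io import StringIO

try:
    from lxml import etree  # third-party: pip install lxml
except ImportError:
    etree = None


def find_html_errors(html_text):
    """Return a list of (line, column, message) tuples logged by the parser."""
    if etree is None:
        raise RuntimeError("lxml is required for this check")
    # recover=False asks the parser to fail instead of silently repairing input.
    parser = etree.HTMLParser(recover=False)
    try:
        etree.parse(StringIO(html_text), parser)
    except etree.XMLSyntaxError:
        # Details of the failure are still available in parser.error_log.
        pass
    return [(entry.line, entry.column, entry.message)
            for entry in parser.error_log]
```

An empty list means the parser logged nothing; otherwise each tuple points at the position of a problem in the input.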
Note that this will not check for closing tags, so for example, the following will pass:
However, the following won't: