is there any library that can give me XPATH for all the nodes in HTML page ?
相关问题
- Views base64 encoded blob in HTML with PHP
- Correctly parse PDF paragraphs with Python
- Is there a way to play audio on a mobile browser w
- HTML form is not sending $_POST values
- implementing html5 drag and drop photos with knock
In case this is helpful for someone else, if you're using python/lxml, you'll first need to have a tree, and then query that tree with the XPATH paths that Dimitre lists above.
To get the tree:
On the last line, replace '//*' with whatever xpath you want.
all_xpaths
is now a list which looks like this:Yes, if this HTML page is a well-formed XML document.
Depending on what you understand by "node"...
selects all the elements in the document.
selects all elements, text nodes, processing instructions, comment nodes, and the root node
/
.selects all text nodes in the document.
selects all comment nodes in the document.
selects all processing instructions in the document.
selects all attribute nodes in the document.
selects all namespace nodes in the document.
Finally, you can combine any of the above expressions using the union (
|
) operator.Thus, I believe that the following expression really selects "all the nodes" of any XML document: