How to parse HTML from JavaScript in Firefox?

Posted 2019-01-09 01:28

What is the best way to parse the HTML result of an XMLHttpRequest (that is, to get a DOM tree from it) in Firefox?

EDIT:

I do not have the DOM tree; I want to acquire it.

XMLHttpRequest's "responseXML" works only when the result is actual XML, so I have only responseText to work with.

The innerHTML hack didn't seem to work with a complete HTML document (one wrapped in <html></html>), but it turns out it works fine.

5 answers
仙女界的扛把子
#2 · 2019-01-09 01:55

At least for newer Firefox versions, an easier way is, or soon will be, available.

https://developer.mozilla.org/en/HTML_in_XMLHttpRequest indicates that, starting with Firefox 11, it will be possible to ask for a DOM directly from the XHR by setting its responseType property to "document". The HTML is then parsed and the resulting DOM is placed in responseXML, just as for an XML document.
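
As a rough sketch of what that page describes, the request could look like the following. The helper name fetchDocument, its callbacks, and the optional XHRClass parameter (there only so the function can be exercised without a browser) are illustrative assumptions, not part of any API.

```javascript
// Hypothetical helper: requests `url` and hands the parsed Document to onLoad.
// XHRClass is an assumption added for testability; it defaults to the
// browser's XMLHttpRequest.
function fetchDocument(url, onLoad, onError, XHRClass) {
  var xhr = new (XHRClass || XMLHttpRequest)();
  xhr.open('GET', url, true);      // asynchronous; pages may not set responseType on sync requests
  xhr.responseType = 'document';   // ask the browser to parse the body as HTML
  xhr.onload = function () {
    // With responseType "document", responseXML holds the parsed DOM,
    // or null if the response could not be parsed.
    var doc = xhr.responseXML;
    if (doc) {
      onLoad(doc);
    } else if (onError) {
      onError(new Error('Response could not be parsed as a document'));
    }
  };
  xhr.onerror = function () {
    if (onError) {
      onError(new Error('Network error while requesting ' + url));
    }
  };
  xhr.send();
  return xhr;
}
```

In a page, a call might look like fetchDocument('/page.html', function (doc) { console.log(doc.title); }), where '/page.html' is a placeholder URL.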

Emotional °昔
#3 · 2019-01-09 01:56

Look up the responseXML property of the XMLHttpRequest object. Furthermore, if you use innerHTML to insert the responseText of an HTML-formatted response into an element, the browser will parse the text and build the corresponding DOM nodes, even before that element is appended to the document flow.

放荡不羁爱自由
#4 · 2019-01-09 02:01

You can use DOMParser to parse HTML, even tag soup:

// Parse an HTML string into a standalone Document object.
var parser = new DOMParser();
var doc = parser.parseFromString('<!DOCTYPE html><html><head><title>hi</title></head><body><p>hello<b>world</b></p>', 'text/html');

I don't know if it handles partial table markup well, but it should create the same DOM the browser itself does for pretty much any markup.
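
Not every engine accepts 'text/html' in parseFromString; some older ones throw or return null instead. A defensive wrapper might fall back to an inert document created with document.implementation.createHTMLDocument. This is only a sketch: the name parseHTML and the win parameter (a window-like object, passed in just to keep the function testable outside a browser) are assumptions for illustration.

```javascript
// Hypothetical helper: returns a Document parsed from an HTML string.
// `win` is an assumed window-like object; it defaults to the global
// window when running in a browser.
function parseHTML(markup, win) {
  win = win || window;
  try {
    var parsed = new win.DOMParser().parseFromString(markup, 'text/html');
    if (parsed) {
      return parsed;
    }
  } catch (e) {
    // Unsupported type: fall through to the innerHTML-based fallback.
  }
  // Fallback: create an empty HTML document and let innerHTML do the parsing.
  var doc = win.document.implementation.createHTMLDocument('');
  doc.documentElement.innerHTML = markup;
  return doc;
}
```

Either way the returned document is detached from the page, so it can be queried with getElementsByTagName and similar methods without being displayed.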

萌系小妹纸
#5 · 2019-01-09 02:16

innerHTML should work just fine, e.g.

// This would run after the Ajax request completes:
var myHTML = XHR.responseText;
var tempDiv = document.createElement('div');
// Strip <script> elements first. [\s\S] matches across newlines,
// and the i flag also catches <SCRIPT>.
tempDiv.innerHTML = myHTML.replace(/<script[\s\S]*?<\/script\s*>/gi, '');

// tempDiv now has a DOM structure:
tempDiv.childNodes;
tempDiv.getElementsByTagName('a'); // etc.
闹够了就滚
#6 · 2019-01-09 02:17

If your data is XHTML, and therefore valid XML, then DOMParser (Mozilla) or loadXML (IE) may help. If not, I can't think of anything better than stripping the document-level wrapper tags and then passing the rest to a container element's innerHTML.

See section 21.1.3 in Flanagan's JavaScript guide (5th edition).

Colin
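
The stripping step could be sketched as follows, under the assumption that the goal is to keep only what sits between the <body> tags. The function name extractBodyMarkup is hypothetical, and the regular expression is a simplification: it would be confused by a > inside a body attribute value, or by a literal </body> inside a script or comment.

```javascript
// Hypothetical helper: returns the markup between <body ...> and the last
// </body>, or the input unchanged when no such pair is found.
function extractBodyMarkup(html) {
  var match = /<body\b[^>]*>([\s\S]*)<\/body\s*>/i.exec(html);
  return match ? match[1] : html;
}
```

In a browser, the result can then be assigned to a detached element, for example with var div = document.createElement('div'); div.innerHTML = extractBodyMarkup(xhr.responseText); where xhr is the completed request.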
