html node parsing with ASP classic

2019-08-02 21:16发布

I stucked a day's trying to find a answer: is there a possibility with classic ASP, using MSXML2.ServerXMLHTTP.6.0 - to parse html code and extract a content of a HTML node by gived ID? For example:

remote html file:

<html>
.....
<div id="description">
some important notes here
</div>
.....
</html>

asp code

<%    
    ...
    Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    objHTTP.Open "GET", url_of_remote_html, False
    objHTTP.Send
    ...
%>

Now - i read a lot of docs, that there is a possibility to access HTML as source (objHTTP.responseText) and as structure (objHTTP.responseXML). But how in a world i can use that XML response to access content of that div? I read and try so many examples, but can not find anything clear that I can solve that.

1条回答
Juvenile、少年°
2楼-- · 2019-08-02 21:57

First up, perform the GET request as in your original code snippet:

Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
http.Open "GET", url_of_remote_html, False
http.Send

Next, create a regular expression object and set the pattern to match the inner html of an element with the desired id:

Set regEx = New RegExp
regEx.Pattern = "<div id=""description"">(.*?)</div>"
regEx.Global = True

Lastly, pull out the content from the first submatch within the first match:

On Error Resume Next
contents = regEx.Execute(http.responseText)(0).Submatches(0)
On Error Goto 0

If anything goes wrong and for example the matching element isn't found in the document, contents will be Null. If all went to plan contents should hold the data you're looking for.

查看更多
登录 后发表回答