HTML node parsing with classic ASP

Posted 2019-08-02 21:16

I've spent a day trying to find an answer: is it possible in classic ASP, using MSXML2.ServerXMLHTTP.6.0, to parse HTML and extract the content of an HTML node with a given ID? For example:

Remote HTML file:

<html>
.....
<div id="description">
some important notes here
</div>
.....
</html>

ASP code:

<%    
    ...
    Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP.6.0")
    objHTTP.Open "GET", url_of_remote_html, False
    objHTTP.Send
    ...
%>

I've read in a lot of docs that the HTML can be accessed as source text (objHTTP.responseText) and as a structure (objHTTP.responseXML). But how can I use that XML response to access the content of that div? I've read and tried many examples, but can't find anything clear enough to solve this.

Tags: parsing html asp-classic

1 answer

Juvenile、少年°

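#2 · 2019-08-02 21:57

First up, perform the GET request as in your original code snippet. Note that responseXML is only usable when the response is well-formed XML, which real-world HTML usually is not, so work with responseText instead: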

Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
http.Open "GET", url_of_remote_html, False
http.Send

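Next, create a regular expression object and set the pattern to match the inner HTML of the element with the desired id:

Set regEx = New RegExp
' [\s\S] is used instead of . because . does not match line breaks in VBScript,
' and the div in the example spans several lines
regEx.Pattern = "<div id=""description"">([\s\S]*?)</div>"
regEx.IgnoreCase = True
' Only the first match is needed
regEx.Global = False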

Lastly, pull out the content from the first submatch within the first match:

On Error Resume Next
contents = regEx.Execute(http.responseText)(0).Submatches(0)
On Error Goto 0

If anything goes wrong, for example if the matching element isn't found in the document, the assignment is skipped and contents stays Empty (check with IsEmpty(contents)). If all went to plan, contents holds the data you're looking for.
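The steps above can be collected into one helper that checks the match count instead of relying on On Error Resume Next. This is only a sketch: GetDivContentById is a hypothetical name, not part of any library, and it assumes the id contains only letters, digits, hyphens, and underscores (it is not regex-escaped) and that the target div contains no nested div, since the lazy match stops at the first closing tag. Place it inside a <% %> block.

```vbscript
' Hypothetical helper: returns the inner HTML of the first <div> whose id
' attribute equals the given id, or an empty string if there is no match.
Function GetDivContentById(html, id)
    Dim regEx, matches
    GetDivContentById = ""

    Set regEx = New RegExp
    regEx.IgnoreCase = True
    regEx.Global = False
    ' Allow other attributes and either quote style around the id.
    ' [\s\S] matches any character, including line breaks.
    regEx.Pattern = "<div\b[^>]*\bid\s*=\s*[""']" & id & "[""'][^>]*>([\s\S]*?)</div>"

    Set matches = regEx.Execute(html)
    If matches.Count > 0 Then
        GetDivContentById = matches(0).SubMatches(0)
    End If
End Function

' Usage, with url_of_remote_html as in the question
Dim http, contents
Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
http.Open "GET", url_of_remote_html, False
http.Send

If http.Status = 200 Then
    contents = GetDivContentById(http.responseText, "description")
End If
```

The returned string keeps any leading and trailing line breaks from the source, and VBScript's Trim only strips spaces, so strip vbCr and vbLf separately if you need clean text.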
