How to get a particular link from a specific class

Posted 2019-07-10 05:00

Question:

I'd like to extract the href of the link inside the row that has this particular class:

<tr class="even">
    <td>
        <a href="/italy/serie-a-2015-2016/">Serie A 2015/2016</a>
    </td>

This is what I wrote:

Sub ExtractHrefClass()

    Dim ie As Object
    Dim doc As HTMLDocument
    Dim class As Object
    Dim href As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.navigate Range("D8")
    Do
        DoEvents
    Loop Until ie.readyState = READYSTATE_COMPLETE
    Set doc = ie.document
    Set class = doc.getElementsByClassName("even")
    Set href = class.getElementsByTagName("a")
    Range("E8").Value = href
    ie.Quit

End Sub

Unfortunately, this raises "Object doesn't support this property or method" (Error 438) on the line:

    Set href = class.getElementsByTagName("a")

UPDATE 1

I modified the code as per @RyszardJędraszyk's answer, but no output comes out. What am I doing wrong?

Sub ExtractHrefClass()

    Dim ie As Object
    Dim doc As HTMLDocument
    Dim href As Object
    Dim htmlEle As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.navigate Range("D8")
    Do
        DoEvents
    Loop Until ie.readyState = READYSTATE_COMPLETE And ie.Busy = False
    Set doc = ie.document
    Set href = doc.getElementsByTagName("a")
    For Each htmlEle In href
        If htmlEle.className = "even" Then
            Range("E8").Value = htmlEle
        End If
    Next
    ie.Quit

End Sub

UPDATE 2

As @dee requested in a comment, here is the HTML from the web page http://www.soccer24.com/italy/serie-a/archive/

<tbody>
    <tr>
        <td>
            <a href="/italy/serie-a/">Serie A 2016/2017</a>
        </td>
        <td></td>
    </tr>
    <tr class="even">
        <td>
            <a href="/italy/serie-a-2015-2016/">Serie A 2015/2016</a>
        </td>
        <td>
            <span class="team-logo" style="background-image: url(/res/image/data/UZbZIMhM-bsGsveSt.png)"></span><a href="/team/juventus/C06aJvIB/">Juventus</a>
        </td>
    </tr>
    <tr>
        <td>
            <a href="/italy/serie-a-2014-2015/">Serie A 2014/2015</a>
        </td>
        <td>
            <span class="team-logo" style="background-image: url(/res/image/data/UZbZIMhM-bsGsveSt.png)"></span><a href="/team/juventus/C06aJvIB/">Juventus</a>
        </td>
    </tr>

I only need to extract this value: /italy/serie-a-2015-2016/

Answer 1:

This worked for me:

With CreateObject("MSXML2.XMLHTTP")
    .Open "GET", "http://www.soccer24.com/italy/serie-a/archive/", False
    .Send
    MsgBox Split(Split(Split(.ResponseText, "<tr class=""even"">", 2)(1), "<a href=""", 2)(1), """", 2)(0)
End With

The procedure you need might look like:

Sub ExtractHrefClass()

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", Range("D8").Value, False
        .Send
        Range("E8").Value = Split(Split(Split(.ResponseText, "<tr class=""even"">", 2)(1), "<a href=""", 2)(1), """", 2)(0)
    End With

End Sub


Answer 2:

Try:

Dim href As HTMLObjectElement

Make sure the proper libraries are checked under References (Microsoft HTML Object Library).

Are you sure that doc.getElementsByClassName("even") works? It is not listed as an available method here: https://msdn.microsoft.com/en-us/library/aa926433.aspx

I always use getElementsByTagName first and then add a condition such as If htmlEle.className = "even" Then.

Also make the wait loop check both conditions: ie.readyState = READYSTATE_COMPLETE And ie.Busy = False. Even so, on an AJAX-based website this may not be enough to tell that the page has fully loaded. Judging from the link, it could be flashscore.com, where you need to track elements on the page that report its loading status.
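
Note that in the HTML posted in the question the class "even" sits on the tr element, not on the a element, so a className test on anchors never matches. A minimal sketch of the tag-name approach that loops over rows instead might look like the following. The Sub name is hypothetical, the code assumes cell D8 holds the page URL, and it uses late binding with the literal 4 in place of READYSTATE_COMPLETE so that no references are required:

Sub ExtractHrefByRowClass()

    ' Sketch only: assumes D8 holds the URL of the archive page.
    Dim ie As Object
    Dim doc As Object
    Dim row As Object
    Dim links As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.navigate Range("D8").Value
    ' 4 = READYSTATE_COMPLETE; wait until the browser is idle as well.
    Do
        DoEvents
    Loop Until ie.readyState = 4 And Not ie.Busy

    Set doc = ie.document
    ' The class is on the row, so test each tr and then look inside it.
    For Each row In doc.getElementsByTagName("tr")
        If row.className = "even" Then
            Set links = row.getElementsByTagName("a")
            If links.Length > 0 Then
                ' First link of the first "even" row.
                Range("E8").Value = links.Item(0).getAttribute("href")
                Exit For
            End If
        End If
    Next row
    ie.Quit

End Sub

Depending on the document mode, Internet Explorer may return the resolved absolute URL rather than the relative path written in the source, so the value may need trimming.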



Answer 3:

querySelectorAll or querySelector can be used here to select the anchor elements inside the tr with the specific class, and getAttribute("href") then retrieves the href attribute. HTH.

' Add reference to Microsoft Internet Controls (SHDocVw)
' Add reference to Microsoft HTML Object Library

Dim ie As InternetExplorer
Dim Doc As HTMLDocument

Set ie = New InternetExplorer
ie.Visible = True

ie.navigate "<URL>"
While ie.Busy Or ie.readyState <> 4
    DoEvents
Wend
Set Doc = ie.document

Dim anchors As IHTMLDOMChildrenCollection
Dim anchor As IHTMLAnchorElement
Dim i As Long

Set anchors = Doc.querySelectorAll("tr[class~='even'] a")

If Not anchors Is Nothing Then
    For i = 0 To anchors.Length - 1
        Set anchor = anchors.item(i)
        If anchor.getAttribute("href") = "/italy/serie-a-2015-2016/" Then
            Range("E8").Value = anchor.innerHTML
        End If
    Next
End If
ie.Quit
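
If only the first link of the first "even" row is needed, as in the HTML posted in the question, querySelector alone may be enough and the href can be written directly instead of being compared against a known value. This is a rough sketch under those assumptions: the Sub name is hypothetical, cell D8 is assumed to hold the URL, the two references listed above are assumed to be set, and the page is assumed to render in a document mode that supports querySelector:

Sub ExtractHrefWithQuerySelector()

    ' Sketch only: requires references to Microsoft Internet Controls
    ' and Microsoft HTML Object Library; assumes D8 holds the URL.
    Dim ie As InternetExplorer
    Dim doc As HTMLDocument
    Dim anchor As IHTMLElement

    Set ie = New InternetExplorer
    ie.Visible = True
    ie.navigate Range("D8").Value
    While ie.Busy Or ie.readyState <> 4
        DoEvents
    Wend
    Set doc = ie.document

    ' First a element, in document order, inside a tr with class "even".
    Set anchor = doc.querySelector("tr.even a")
    If Not anchor Is Nothing Then
        Range("E8").Value = anchor.getAttribute("href")
    End If
    ie.Quit

End Sub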