I am trying to parse data from a website to get specific items from its tables. I know that any <tr> tag with the bgcolor attribute set to #ffffff or #f4f4ff is where I want to start, and my actual data sits in the 2nd <td> within that <tr>.
Currently I have:
    Private Sub runForm()
        Dim theElementCollection As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("TR")
        For Each curElement As HtmlElement In theElementCollection
            Dim controlValue As String = curElement.GetAttribute("bgcolor").ToString
            MsgBox(controlValue)
            If controlValue.Equals("#f4f4ff") Or controlValue.Equals("#ffffff") Then
            End If
        Next
    End Sub
This code gets the TR element that I need, but I have no idea how (or whether it is possible) to then inspect its inner elements. If it is not possible, what would be the best route to take? The site does not really label any of its tables. The <td>s I am looking for basically look like:
<td><b><font size="2"><a href="/movie/?id=movieTitle.htm">The Movie</a></font></b></td>
I want to pull out "The Movie" text and add it to a text file.
Use the InnerHtml property of the HtmlElement object (curElement) you have. Read the documentation of the HtmlElement.InnerHtml property for more information.
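As a rough illustration of that suggestion, here is a minimal sketch based on the loop from the question. It assumes the code lives in a Form that hosts WebBrowser1 and that the page has finished loading; the case-insensitive comparison and the Debug output are choices made for this sketch, not part of the original answer.

```vb
Private Sub runForm()
    Dim rows As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("TR")
    For Each curElement As HtmlElement In rows
        ' GetAttribute already returns a String, so no ToString call is needed.
        Dim controlValue As String = curElement.GetAttribute("bgcolor")
        ' Assumption: compare case-insensitively in case the browser changes the casing.
        If String.Equals(controlValue, "#f4f4ff", StringComparison.OrdinalIgnoreCase) OrElse
           String.Equals(controlValue, "#ffffff", StringComparison.OrdinalIgnoreCase) Then
            ' InnerHtml holds the markup inside the <tr>, i.e. its <td> cells.
            System.Diagnostics.Debug.WriteLine(curElement.InnerHtml)
        End If
    Next
End Sub
```

InnerHtml gives back the raw markup of the row, so the cell contents would still have to be picked out of that string.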
UPDATE:
To get the second child of the <tr> HTML element, use a combination of FirstChild and then NextSibling.
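A sketch of how FirstChild and NextSibling might be combined to reach the second cell and save its text. The method name SaveMovieTitles and the output file movies.txt are hypothetical, chosen only for this example, and InnerText is used here (rather than InnerHtml) because the question asks for just the "The Movie" text.

```vb
Private Sub SaveMovieTitles()
    ' Hypothetical output file, used only for illustration.
    Dim outputPath As String = "movies.txt"
    Dim rows As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("TR")

    For Each row As HtmlElement In rows
        Dim color As String = row.GetAttribute("bgcolor")
        Dim isDataRow As Boolean =
            String.Equals(color, "#f4f4ff", StringComparison.OrdinalIgnoreCase) OrElse
            String.Equals(color, "#ffffff", StringComparison.OrdinalIgnoreCase)
        If Not isDataRow Then Continue For

        ' FirstChild is the first <td>; NextSibling moves on to the second <td>.
        Dim firstCell As HtmlElement = row.FirstChild
        If firstCell Is Nothing Then Continue For
        Dim secondCell As HtmlElement = firstCell.NextSibling
        If secondCell Is Nothing Then Continue For

        ' InnerText drops the nested <b>, <font> and <a> tags and keeps only the text.
        Dim title As String = secondCell.InnerText
        If Not String.IsNullOrEmpty(title) Then
            System.IO.File.AppendAllText(outputPath, title.Trim() & Environment.NewLine)
        End If
    Next
End Sub
```

The Nothing checks skip rows that have fewer than two cells. If FirstChild does not return the expected cell on a given page, indexing the row's Children collection (row.Children(1)) may be worth trying instead.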