I am currently using HtmlAgilityPack to parse some HTML. I want to find a form's input tags first, then get the ID or class of each one, and list each input together with its id="something here" or class="something here" in a RichTextBox for review.
Here is my code.
Dim web As HtmlAgilityPack.HtmlWeb = New HtmlWeb()
Dim doc As HtmlAgilityPack.HtmlDocument = web.Load(TextBox1.Text)
Dim threadLinks As IEnumerable(Of HtmlNode) = doc.DocumentNode.SelectNodes("/input")
For Each link In threadLinks
    Dim str As String = link.InnerHtml
    RichTextBox1.Text = str.ToString
Next link
End Sub
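Here is how you can do this. Note that the SelectNodes XPath was changed from "/input" to "//input": a single slash only matches input elements directly under the document root, while a double slash matches them anywhere in the document. Your loop also overwrote RichTextBox1.Text on every pass, so only the last item would ever be shown; the version below collects everything first.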
Dim threadLinks As IEnumerable(Of HtmlNode) = doc.DocumentNode.SelectNodes("//input")
' Use a StringBuilder to hold all of the retrieved information
Dim sbText As New System.Text.StringBuilder(5000)
' SelectNodes returns Nothing when no nodes match, so check before iterating
If threadLinks IsNot Nothing Then
    For Each link In threadLinks
        ' Add information about each found input on a new line
        sbText.Append("Id = ").Append(link.Id)
        ' The class is held in an attribute, so ensure the attribute exists before using it
        If link.Attributes.Contains("Class") Then
            ' Add the value of the class attribute to the output
            sbText.Append(", Class = ").Append(link.Attributes("Class").Value)
        End If
        ' Separate this item from the next by adding a new line
        sbText.AppendLine()
    Next
End If
' Finally, send the retrieved information to the RichTextBox.
RichTextBox1.Text = sbText.ToString()
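If you want to try the same idea outside the form, here is a minimal console sketch. It assumes the HtmlAgilityPack package is referenced; the module name InputLister, the helper DescribeInputs, and the sample HTML are made up for illustration and are not part of your project. It parses an inline string with LoadHtml instead of loading a URL, and reads the attributes with GetAttributeValue, which returns the supplied default when the attribute is missing.

```vbnet
Imports System
Imports System.Text
Imports HtmlAgilityPack

Module InputLister
    ' Hypothetical helper: returns one line per <input>, listing its id and class.
    Function DescribeInputs(html As String) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' "//input" matches input elements at any depth in the document.
        Dim inputs As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//input")
        Dim sb As New StringBuilder()

        ' SelectNodes returns Nothing when nothing matches.
        If inputs Is Nothing Then Return sb.ToString()

        For Each node As HtmlNode In inputs
            ' GetAttributeValue falls back to the default ("") if the attribute is absent.
            sb.Append("Id = ").Append(node.GetAttributeValue("id", ""))
            sb.Append(", Class = ").Append(node.GetAttributeValue("class", ""))
            sb.AppendLine()
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Made-up sample markup with two inputs, one of them without an id.
        Dim sample As String = "<html><body><form><input id=""user"" class=""field""><input class=""btn""></form></body></html>"
        Console.Write(DescribeInputs(sample))
    End Sub
End Module
```

Unlike the snippet above, this always writes both labels, leaving the value empty when an input has no id or class. Which form reads better in the RichTextBox is a matter of taste.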