VB.NET webbrowser: HTML of DocumentText is inaccurate

Posted 2019-09-01 06:17

Question:

I'm trying to read messages sent by strangers on Omegle, a random "chat with strangers" website.

I've displayed the DocumentText of my webbrowser (called Omegle) in a textbox called OmegleHTML:

Private Sub Omegle_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles Omegle.DocumentCompleted
    ' Once the page has loaded, show its HTML in the textbox and its title in the form caption.
    OmegleHTML.Text = Omegle.DocumentText
    Me.Text = Omegle.Document.Title
End Sub

I've also done a bit of coloring to make things a bit clearer:

Using this HTML code, I've been able to do the simple tasks I need, such as simulating clicks. But as I said, what I'm mainly interested in is extracting the string a stranger types from the HTML code. Sadly, I can't find what I need in the HTML I've exported to the textbox. However, when I inspect the message element in Chrome, it is there:

This is the exact code I need to display in my textbox in order to extract the logitem message a stranger types. What am I doing wrong? I noticed that when I press Ctrl + U (page source) in Chrome, it displays exactly the same code my textbox displays, also missing the logitems I need. So if I'm not looking for the page source, what should I look for?

Answer 1:

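The content is written out dynamically using JavaScript, so it isn't part of the page source itself. It is part of the "state" of the page, which is why it shows up in Chrome's element inspector but not in the page source or in DocumentText.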

See this answer for some details: "How to get rendered html (processed by Javascript) in WebBrowser control?"
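
As a rough illustration of reading that page "state" instead of the source, here is a minimal sketch that queries the control's live document. It uses only WinForms members I know exist (Document.Body.OuterHtml, GetElementsByTagName, GetAttribute, InnerText). The module and function names are made up for this example, and it is an assumption that the messages sit in div elements carrying the logitem class mentioned in the question; check that in the inspector before relying on it.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

' Hypothetical helper module; the names are illustrative only.
Public Module RenderedDomSketch

    ' Returns the markup of the live document body, including nodes
    ' that scripts added after the original page source was downloaded.
    Public Function GetRenderedHtml(browser As WebBrowser) As String
        If browser.Document Is Nothing OrElse browser.Document.Body Is Nothing Then
            Return String.Empty
        End If
        Return browser.Document.Body.OuterHtml
    End Function

    ' Collects the visible text of every <div> whose class list contains cssClass.
    ' Assumption: the chat messages are <div> elements (not confirmed by the question).
    Public Function GetTextByClass(browser As WebBrowser, cssClass As String) As List(Of String)
        Dim results As New List(Of String)()
        If browser.Document Is Nothing Then Return results

        For Each element As HtmlElement In browser.Document.GetElementsByTagName("div")
            ' The WebBrowser control exposes the class attribute as "className".
            Dim classes As String = element.GetAttribute("className")
            If String.IsNullOrEmpty(classes) Then Continue For

            If Array.IndexOf(classes.Split(" "c), cssClass) >= 0 Then
                results.Add(If(element.InnerText, String.Empty))
            End If
        Next
        Return results
    End Function

End Module
```

Calling GetTextByClass(Omegle, "logitem") once inside DocumentCompleted will probably still miss messages, because they are added after the page has finished loading. Calling it periodically, for example from a Timer's Tick handler on the form, is one way to pick up new entries as they appear.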