Getting HTML body content in WinForms WebBrowser a

Posted 2020-01-27 08:01

Question:

I have a WebBrowser control in WinForms whose URL property is set to an external webpage. I also have an event handler for the DocumentCompleted event. Inside this handler, I'm trying to get specific elements, but wb.Document.Body seems to capture the HTML before onload is executed.

{System.Windows.Forms.HtmlElement}
    All: {System.Windows.Forms.HtmlElementCollection}
    CanHaveChildren: true
    Children: {System.Windows.Forms.HtmlElementCollection}
    ClientRectangle: {X = 0 Y = 0 Width = 1200 Height = 0}
    Document: {System.Windows.Forms.HtmlDocument}
    DomElement: {mshtml.HTMLBodyClass}
    ElementShim: {System.Windows.Forms.HtmlElement.HtmlElementShim}
    Enabled: true
    FirstChild: null
    htmlElement: {mshtml.HTMLBodyClass}
    Id: null
    InnerHtml: "\n"
    InnerText: null
    Name: ""
    NativeHtmlElement: {mshtml.HTMLBodyClass}
    NextSibling: null
    OffsetParent: null
    OffsetRectangle: {X = 0 Y = 0 Width = 1200 Height = 0}
    OuterHtml: "<body onload=\"evt_Login_onload(event);\" uitheme=\"Web\">\n</body>"
    OuterText: null
    Parent: {System.Windows.Forms.HtmlElement}
    ScrollLeft: 0
    ScrollRectangle: {X = 0 Y = 0 Width = 1200 Height = 0}
    ScrollTop: 0
    shimManager: {System.Windows.Forms.HtmlShimManager}
    ShimManager: {System.Windows.Forms.HtmlShimManager}
    Style: null
    TabIndex: 0
    TagName: "BODY"

"<body onload=\"evt_Login_onload(event);\" uitheme=\"Web\">\n</body>" is the pre-JavaScript content. Is there a way to capture the state of the body tag after evt_Login_onload(event); executes?

I have also tried using wb.Document.GetElementById("id"), but it returns null.

Answer 1:

Here is how it can be done. I've put some comments inline:

private void Form1_Load(object sender, EventArgs e)
{
    bool complete = false;
    this.webBrowser1.DocumentCompleted += delegate
    {
        if (complete)
            return;
        complete = true;
        // DocumentCompleted is fired before window.onload and body.onload
        this.webBrowser1.Document.Window.AttachEventHandler("onload", delegate
        {
            // Defer this to make sure all possible onload event handlers got fired
            System.Threading.SynchronizationContext.Current.Post(delegate 
            {
                // try webBrowser1.Document.GetElementById("id") here
                MessageBox.Show("window.onload was fired, can access DOM!");
            }, null);
        });
    };

    this.webBrowser1.Navigate("http://www.example.com");
}

Update: it's 2019 and this answer is surprisingly still getting attention, so I'd like to note that with modern C# I would recommend doing this with async/await.
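For illustration, here is a minimal sketch of what an async/await version of the code above could look like. It is not necessarily the exact code the update refers to. It assumes the same Form1 and webBrowser1 as in the snippet above; the helper name NavigateAndWaitForOnloadAsync is hypothetical and introduced only for this example.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

public partial class Form1 : Form
{
    // Assumption: webBrowser1 is a WebBrowser control created by the designer,
    // and Form1_Load is wired to the form's Load event, as in the snippet above.
    private async void Form1_Load(object sender, EventArgs e)
    {
        await NavigateAndWaitForOnloadAsync("http://www.example.com");

        // Defer once more so that other onload handlers get a chance to run;
        // on the UI thread, Task.Yield posts the continuation to the message loop.
        await Task.Yield();

        // try webBrowser1.Document.GetElementById("id") here
        MessageBox.Show("window.onload was fired, can access DOM!");
    }

    // Hypothetical helper: completes when window.onload fires for the page.
    private Task NavigateAndWaitForOnloadAsync(string url)
    {
        var onloadFired = new TaskCompletionSource<bool>();
        bool attached = false;

        WebBrowserDocumentCompletedEventHandler onCompleted = null;
        onCompleted = (s, args) =>
        {
            // Only react to the first DocumentCompleted, like the original code
            if (attached)
                return;
            attached = true;
            webBrowser1.DocumentCompleted -= onCompleted;

            // DocumentCompleted is fired before window.onload, so attach here
            webBrowser1.Document.Window.AttachEventHandler(
                "onload", (s2, e2) => onloadFired.TrySetResult(true));
        };

        webBrowser1.DocumentCompleted += onCompleted;
        webBrowser1.Navigate(url);
        return onloadFired.Task;
    }
}
```

The onload handler is still attached inside DocumentCompleted, exactly as in the callback version, so the ordering assumption is unchanged; the TaskCompletionSource only turns the nested callbacks into something that can be awaited. A real implementation would probably also want a timeout or cancellation, which this sketch leaves out.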