Using WebBrowser in a console application

Posted 2019-01-14 13:14

Question:

I want to use WebBrowser to invoke some JS scripts on a webpage. I have this:

    static void Stuff()
    {
        WebBrowser browser = new WebBrowser();
        browser.Navigate("http://www.iana.org/domains/example/");
        HtmlDocument doc = browser.Document;
        //doc.InvokeScript("someScript");
        Console.WriteLine(doc.ToString());
    }

    static void Main(string[] args)
    {
        Console.WriteLine("hi");
        var t = new Thread(Stuff);
        t.SetApartmentState(ApartmentState.STA);
        t.Start();
    }

Question 1: I get an "object reference not set" exception when I try to get doc.ToString(). Why?

Question 2: How do I get some data from the HTML document into the main program? WebBrowser requires a separate thread, which requires a static method which can't return any value. How do I return, say, doc to the Main() so I can do something with it?

Answer 1:

Right idea, wrong execution. WebBrowser.Navigate() only tells the web browser to start navigating to the page you asked for. That takes time, typically hundreds of milliseconds, and Internet Explorer starts threads internally to get the job done. It tells you when it is done by raising the DocumentCompleted event. You don't wait for that event, so browser.Document is still null when you use it, and that's crash city.

The next problem is that the DocumentCompleted event won't be raised in your code as it stands. You have to honor the STA contract, which requires you to pump a message loop. The message loop is the mechanism by which a background thread, like the one IE uses to retrieve a web page, tells your thread that the job is done.

The boilerplate code you need is available in this answer.
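
For reference, here is a minimal sketch of that pattern, not necessarily the same code as the linked answer. It assumes a console project that references System.Windows.Forms; the helper name LoadHtml is made up for illustration. The STA thread creates the browser, pumps messages with Application.Run(), and copies the page's HTML into a captured variable when DocumentCompleted fires. That also covers Question 2: the thread method doesn't have to return anything, because Main() can read the captured value after Join().

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class Program
{
    // Hypothetical helper: loads a page on a dedicated STA thread and
    // returns its HTML as a plain string.
    static string LoadHtml(string url)
    {
        string html = null;       // written by the STA thread, read after Join()
        Exception error = null;

        var t = new Thread(() =>
        {
            try
            {
                using (var browser = new WebBrowser())
                {
                    browser.ScriptErrorsSuppressed = true;
                    browser.DocumentCompleted += (sender, e) =>
                    {
                        // The document exists now. Anything that touches it,
                        // such as browser.Document.InvokeScript("someScript"),
                        // belongs here, on the thread that owns the browser.
                        // Note: pages with frames can raise this event more
                        // than once; this sketch stops at the first one.
                        html = browser.DocumentText;
                        Application.ExitThread();   // ends Application.Run() below
                    };
                    browser.Navigate(url);
                    Application.Run();              // pump the message loop
                }
            }
            catch (Exception ex)
            {
                error = ex;
            }
        });
        t.SetApartmentState(ApartmentState.STA);
        t.Start();
        t.Join();                                   // wait for the page to finish

        if (error != null)
            throw new InvalidOperationException("Loading the page failed.", error);
        return html;
    }

    static void Main(string[] args)
    {
        Console.WriteLine("hi");
        Console.WriteLine(LoadHtml("http://www.iana.org/domains/example/"));
    }
}
```

The sketch hands back a string rather than the HtmlDocument itself. The WebBrowser control wraps a COM object that lives on the STA thread, so it is safer to pull out the data you need inside the event handler than to pass doc across threads.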