How to get the source code after simulating a web page button click

Posted 2019-08-23 11:26

Question:

I am trying to get the source code of the web page that appears after I click a button.

I am able to click the button on the web page:

// load the page and pump messages until it has finished loading
webBrowser1.Navigate(url);
while (webBrowser1.ReadyState != WebBrowserReadyState.Complete)
{
    Application.DoEvents();
}
// simulate a click on the button
webBrowser1.Document.GetElementById("downloadButton").InvokeMember("click");

After this, a new window appears. Is it possible to get the source code of the new window that appears after the click?

Answer 1:

A hacky approach would be to:

  1. Attach an event handler to the 'onclick' event of the button.
  2. Then, once the event is triggered, use the Microsoft Internet Controls (SHDocVw) type library in order to get the last URL opened in IE.
  3. Lastly, navigate to the URL and once the document is loaded, get the source of the document from the webBrowser1.DocumentText property.

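In your project, add a reference to the Microsoft Internet Controls type library (you'll find it in the COM tab). Add the following at the top of your file (System.Linq is needed for the Cast and Last calls used in the code):

using System.Linq;
using SHDocVw;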

The code:

webBrowser1.Navigate(url);
while (webBrowser1.ReadyState != WebBrowserReadyState.Complete)
{
    Application.DoEvents();
}

// assign the button to a variable
var button = webBrowser1.Document.GetElementById("downloadButton");

// attach an event handler for the 'onclick' event of the button
button.AttachEventHandler("onclick", (a, b) =>
{
    // use the Microsoft Internet Controls COM library
    var shellWindows = new SHDocVw.ShellWindows();

    // get the location of the last window in the collection
    // (this assumes the newly opened IE window is the last entry)
    var newLocation = shellWindows.Cast<SHDocVw.InternetExplorer>()
        .Last().LocationURL;

    // navigate to the newLocation
    webBrowser1.Navigate(newLocation);
    while (webBrowser1.ReadyState != WebBrowserReadyState.Complete)
    {
        Application.DoEvents();
    }

    // get the document's source
    var source = webBrowser1.DocumentText;
});

button.InvokeMember("click");
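
The wait loop above spins on Application.DoEvents(), once at the top level and once more inside the click handler. As a rough alternative sketch, the same wait can be expressed with the WebBrowser control's DocumentCompleted event. The names WebBrowserExtensions and NavigateAsync below are hypothetical helpers introduced for illustration and are not part of the original answer. The sketch also assumes it is called on the UI thread of a Windows Forms application.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper class, not part of the original answer.
static class WebBrowserExtensions
{
    // Starts a navigation and returns a task that completes once the
    // control reports that the document has finished loading.
    public static Task NavigateAsync(this WebBrowser browser, string url)
    {
        var tcs = new TaskCompletionSource<bool>();
        WebBrowserDocumentCompletedEventHandler handler = null;

        handler = (sender, e) =>
        {
            // DocumentCompleted can also fire for individual frames, so
            // keep waiting until the control itself reports Complete.
            if (browser.ReadyState != WebBrowserReadyState.Complete)
            {
                return;
            }

            // Unsubscribe so later navigations do not touch this task.
            browser.DocumentCompleted -= handler;
            tcs.TrySetResult(true);
        };

        browser.DocumentCompleted += handler;
        browser.Navigate(url);
        return tcs.Task;
    }
}
```

Inside an async event handler on the form, this could be used as `await webBrowser1.NavigateAsync(newLocation);` followed by `var source = webBrowser1.DocumentText;`. This avoids the nested DoEvents loops, but it does not change the rest of the approach: the URL of the new window still has to be obtained through SHDocVw as described above.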