I'm trying to programmatically load a web page via the WebBrowser control with the intent of testing the page & it's JavaScript functions. Basically, I want to compare the HTML & JavaScript run through this control against a known output to ascertain whether there is a problem.
However, I'm having trouble simply creating and navigating the WebBrowser control. The code below is intended to load the HtmlDocument into the WebBrowser.Document property:
WebBrowser wb = new WebBrowser();
wb.AllowNavigation = true;
wb.Navigate("http://www.google.com/");
When examining the web browser's state via Intellisense after Navigate() runs, the WebBrowser.ReadyState is 'Uninitialized', WebBrowser.Document = null, and it overall appears completely unaffected by my call.
On a contextual note, I'm running this control outside of a Windows form object: I do not need to load a window or actually look at the page. Requirements dictate the need to simply execute the page's JavaScript and examine the resultant HTML.
Any suggestions are greatly appreciated, thanks!
You should handle the WebBrowser.DocumentComplete event, once that event is raised you will have the Document etc.
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb = sender as WebBrowser;
// wb.Document is not null at this point
}
Here is a complete example, that I quickly did in a Windows Forms application and tested.
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
WebBrowser wb = new WebBrowser();
wb.AllowNavigation = true;
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
wb.Navigate("http://www.google.com");
}
private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
WebBrowser wb = sender as WebBrowser;
// wb.Document is not null at this point
}
}
Edit: Here is a simple version of code that runs a window from a console application. You can of course go further and expose the events to the console code etc.
using System;
using System.Windows;
using System.Windows.Forms;
namespace ConsoleApplication1
{
class Program
{
[STAThread]
static void Main(string[] args)
{
Application.Run(new BrowserWindow());
Console.ReadKey();
}
}
class BrowserWindow : Form
{
public BrowserWindow()
{
ShowInTaskbar = false;
WindowState = FormWindowState.Minimized;
Load += new EventHandler(Window_Load);
}
void Window_Load(object sender, EventArgs e)
{
WebBrowser wb = new WebBrowser();
wb.AllowNavigation = true;
wb.DocumentCompleted += wb_DocumentCompleted;
wb.Navigate("http://www.bing.com");
}
void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
Console.WriteLine("We have Bing");
}
}
}
You probably need to host the control in a parent window. You can do this without breaking requirements by simply not showing the window that hosts the browser control by moving it off screen. It might also be useful for development to "see" that it does actually load something for testing, verification etc.
So try:
// in a form's Load handler:
WebBrowser wb = new WebBrowser();
this.Controls.Add(wb);
wb.AllowNavigation = true;
wb.Navigate("http://www.google.com/");
Also check to see what other properties are set on the WebBrowser object when you instantiate it via the IDE. E.g. create a Form, drop a browser control onto it and then check the form's designer file to see what code is generated. You might be missing some key property that needs to be set. I've discovered many-an-omission in my code in this way and also learned how to properly instantiate visual objects programmatically.
P.S. If you do use a host window, it should only be visible during development. You would hide in some manner for production.
Another approach:
You could go "raw" by tryiing something like this:
System.Net.WebClient wc = new System.Net.WebClient();
System.IO.StreamReader webReader = new System.IO.StreamReader(
wc.OpenRead("http://your_website.com"));
string webPageData = webReader.ReadToEnd();
...then RegEx or parse webPageData for what you need. Or do you need the jscript in the page to actually execute? (Which should be possible with .NET 4.0)
I had this problem, and I did not realize that I had uninstalled Internet Explorer. If you have, nothing will ever happen, since the WebBrowser control only instantiates IE.
The Webbrowser control is just a wrapper around Internet Explorer.
You can set in onto an invisible Windows Forms window to completely instantiate it.