I want to get plain text using WebRequest class, just like what we get when we use webbrowser1.Document.Body.InnerText
. I have tried the following code
public string request_Resource()
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(myurl);
Stream stream = request.GetResponse().GetResponseStream();
StreamReader sr = new StreamReader(stream);
WebBrowser wb = new WebBrowser();
wb.DocumentText = sr.ReadToEnd();
return wb.Document.Body.InnerText;
}
when i execute this is get a NullReferenceException
.
Is there a better way to get a plain text.
Note: I cannot use webbrowser control directly to load the webpage, because, i don't want to deal with all those events that fire up multiple times when ever a page is loaded.
UPDATE: I have changed my code to use WebClient Class instead of WebRequest upon suggestion My code looks something like this now
public string request_Resource()
{
WebClient wc = new WebClient();
wc.Proxy = null;
//The user agent header is added to avoid any possible errors
wc.Headers.Add("user-agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10 ( .NET CLR 3.5.30729; .NET4.0C)");
return wc.DownloadString(myurl);
}
I am considering using HTML Utility Pack, can anyone suggest any better alternative.