I am using an HttpWebRequest to read in a web page with the following code:
    var pageurl = new Uri(url, UriKind.Absolute);
    var request = (HttpWebRequest)WebRequest.Create(pageurl);
    request.Method = "GET";
    request.AutomaticDecompression = DecompressionMethods.GZip;
    request.KeepAlive = false;
    request.ConnectionGroupName = Guid.NewGuid().ToString();
    request.ServicePoint.Expect100Continue = false;
    request.Pipelined = false;
    request.MaximumResponseHeadersLength = 4;
    if (ignoreCertificateErrors)
    {
        ServicePointManager.ServerCertificateValidationCallback += AcceptAllCertificatePolicy;
    }
    var response = (HttpWebResponse)request.GetResponse();
    if (response != null)
    {
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
This works perfectly when the page is in English, but when it is in another language, such as Spanish, I get numerous � characters in the returned content.
Is there a problem with the code, or is there something encoding-wise that I am missing?
You have to specify the correct encoding for the page you're downloading to StreamReader. For example, if the page is in the encoding ISO-8859-2, use
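A minimal sketch of passing an explicit encoding to StreamReader. The class name EncodingSketch and the helper ReadText are hypothetical, not part of the original code. It assumes the charset name is known to the runtime: on .NET Framework ISO-8859-2 is available out of the box, while on .NET Core and later it may need the code-pages encoding provider to be registered first.

```csharp
using System.IO;
using System.Text;

public static class EncodingSketch
{
    // Hypothetical helper: decodes a response body using an explicitly
    // named charset instead of StreamReader's default of UTF-8.
    public static string ReadText(Stream body, string charsetName)
    {
        // Throws ArgumentException if the runtime does not know this charset name.
        Encoding encoding = Encoding.GetEncoding(charsetName);

        // The reader now interprets the raw bytes with the chosen encoding.
        using (var reader = new StreamReader(body, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the question's code this would replace the StreamReader block, for example `return EncodingSketch.ReadText(response.GetResponseStream(), "ISO-8859-2");`. If the server declares a charset in its Content-Type header, `response.CharacterSet` exposes that value and may be a better choice than a hard-coded name, though servers do not always report it accurately.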