I am developing a C# console application that tests whether a URL is valid. It works well for most URLs, but in some cases the application always gets a 404 response from the target site even though the URL works in the browser. Those URLs also work when I try them in tools such as DHC (Dev HTTP Client).
At first I thought the cause was missing or incorrect headers. But when I used Fiddler to compose an HTTP request with the same headers, the request worked in Fiddler.
So what's wrong with my code? Is there any bug in .NET HttpClient?
Here is the simplified code of my test application:
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        var urlTester = new UrlTester("http://www.hffa.it/short-master-programs/fashion-photography");
        Console.WriteLine("Test is started");
        // Block until the request completes; Task.WhenAll(...) alone returns a task without waiting for it.
        urlTester.RunTestAsync().GetAwaiter().GetResult();
        Console.WriteLine("Test is stopped");
        Console.ReadKey();
    }

    public class UrlTester
    {
        private HttpClient _httpClient;
        private string _url;

        public UrlTester(string url)
        {
            _httpClient = new HttpClient
            {
                Timeout = TimeSpan.FromMinutes(1)
            };
            // Add headers that mimic a desktop Chrome browser
            _httpClient.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36");
            _httpClient.DefaultRequestHeaders.Add("Accept-Encoding", "gzip,deflate,sdch");
            _httpClient.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            _httpClient.DefaultRequestHeaders.Add("Accept-Language", "sv-SE,sv;q=0.8,en-US;q=0.6,en;q=0.4");
            _url = url;
        }

        public async Task RunTestAsync()
        {
            var httpRequestMsg = new HttpRequestMessage(HttpMethod.Get, _url);
            try
            {
                // Only the status line and headers are needed, so do not buffer the body
                using (var response = await _httpClient.SendAsync(httpRequestMsg, HttpCompletionOption.ResponseHeadersRead))
                {
                    Console.WriteLine("Response: {0}", response.StatusCode);
                }
            }
            catch (HttpRequestException e)
            {
                // InnerException can be null, so fall back to the outer message
                Console.WriteLine(e.InnerException?.Message ?? e.Message);
            }
        }
    }
}
Another possible cause of this problem is a URL longer than approximately 2048 bytes. Beyond that length the URL (almost certainly its query string) can be truncated, which in turn means the server may fail to match it to a server-side route.
Although these URLs were processed correctly in the browser, they also failed when requested with a GET from PowerShell.
The issue was resolved by sending a POST with key/value pairs instead of a GET with a long query string.
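As a rough sketch of that workaround, the helper below picks GET when the full URL fits under a length threshold and falls back to a form-encoded POST otherwise. The names `RequestBuilder`, `Build`, and `MaxUrlLength` are hypothetical, the 2048 figure is simply the approximate limit mentioned above (real limits vary by server and proxy), and the sketch assumes the target endpoint accepts the same parameters via POST.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;

public static class RequestBuilder
{
    // Assumed threshold, based on the "approx 2048" figure above; not a universal limit.
    public const int MaxUrlLength = 2048;

    // Returns a GET when the full URL fits under the threshold, otherwise a POST
    // that carries the same key/value pairs as a form-encoded body.
    public static HttpRequestMessage Build(string baseUrl, IDictionary<string, string> parameters)
    {
        var query = string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
        var fullUrl = query.Length == 0 ? baseUrl : baseUrl + "?" + query;

        if (fullUrl.Length <= MaxUrlLength)
        {
            return new HttpRequestMessage(HttpMethod.Get, fullUrl);
        }

        return new HttpRequestMessage(HttpMethod.Post, baseUrl)
        {
            Content = new FormUrlEncodedContent(parameters)
        };
    }
}
```

The returned message can be passed to `HttpClient.SendAsync` as usual; whether the server routes the POST the same way as the GET depends on the site.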
This appears to be an issue with the accepted languages. I got a 200 response when using a different Accept-Language header value.
P.S. I assume you know that in your example _client should read _httpClient in the UrlTester constructor, or it won't build.
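One way to check whether the Accept-Language header is what triggers the 404 is to send the same GET with several candidate values and compare the status codes. The sketch below is only an illustration: `AcceptLanguageProbe`, `BuildRequest`, and `ProbeAsync` are hypothetical names, and the candidate values are supplied by the caller. It assumes the `HttpClient` passed in has no default Accept-Language header of its own, so each request carries only the value under test.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class AcceptLanguageProbe
{
    // Builds a GET request whose Accept-Language header is set to the given value.
    // Headers.Add validates the value and throws FormatException if it is malformed.
    public static HttpRequestMessage BuildRequest(string url, string acceptLanguage)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Add("Accept-Language", acceptLanguage);
        return request;
    }

    // Sends one request per candidate value and records the status code each one receives.
    public static async Task<Dictionary<string, HttpStatusCode>> ProbeAsync(
        HttpClient client, string url, IEnumerable<string> candidates)
    {
        var results = new Dictionary<string, HttpStatusCode>();
        foreach (var value in candidates)
        {
            using (var request = BuildRequest(url, value))
            using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
            {
                results[value] = response.StatusCode;
            }
        }
        return results;
    }
}
```

For example, calling `ProbeAsync` with the question's original value `"sv-SE,sv;q=0.8,en-US;q=0.6,en;q=0.4"` alongside an illustrative alternative such as `"en-US,en;q=0.8"` would show whether the site responds differently to the two.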