C# HttpClient.SendAsync always returns 404 but URL works in browser

Posted 2020-08-22 03:06

I am developing a C# console application that tests whether a URL is valid. It works well for most URLs, but for some of them the application always gets a 404 response from the target site even though the URLs work in the browser. Those URLs also work when I try them in tools such as DHC (Dev HTTP Client).

At first I thought the cause might be missing or incorrect headers. But when I used Fiddler to compose an HTTP request with the same headers, the request succeeded in Fiddler.

So what's wrong with my code? Is there any bug in .NET HttpClient?

Here is the simplified code of my test application:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        var urlTester = new UrlTester("http://www.hffa.it/short-master-programs/fashion-photography");

        Console.WriteLine("Test is started");

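        // Block until the test finishes; a Task.WhenAll call that is never
        // awaited returns immediately, before the response arrives.
        urlTester.RunTestAsync().GetAwaiter().GetResult();

        Console.WriteLine("Test is stopped");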
        Console.ReadKey();
    }


    public class UrlTester
    {
        private HttpClient _httpClient;
        private string _url;

        public UrlTester(string url)
        {
            _httpClient = new HttpClient 
            { 
                Timeout = TimeSpan.FromMinutes(1)
            };

            // Add headers
            _httpClient.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36");
            _httpClient.DefaultRequestHeaders.Add("Accept-Encoding", "gzip,deflate,sdch");
            _httpClient.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            _httpClient.DefaultRequestHeaders.Add("Accept-Language", "sv-SE,sv;q=0.8,en-US;q=0.6,en;q=0.4");

            _url = url;
        }

        public async Task RunTestAsync()
        {
            var httpRequestMsg = new HttpRequestMessage(HttpMethod.Get, _url);

            try
            {
                using (var response = await _httpClient.SendAsync(httpRequestMsg, HttpCompletionOption.ResponseHeadersRead))
                {
                    Console.WriteLine("Response: {0}", response.StatusCode);
                }
            }
            catch (HttpRequestException e) 
            {
                // InnerException can be null, so fall back to the exception itself.
                Console.WriteLine((e.InnerException ?? e).Message);
            }
        }
    }

}

2 Answers
Aperson
#2 · 2020-08-22 03:45

Another possible cause of this problem is a URL longer than approximately 2048 bytes. Beyond that length the URL (almost certainly the query string) can be truncated, which in turn means the server may fail to match it to a route.

Although these URLs were processed correctly in the browser, they also failed when sent as a GET request from PowerShell.

The issue was resolved by sending a POST with key-value pairs instead of a GET with a long query string.
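
As a rough illustration of that workaround, the sketch below falls back from GET to a form-encoded POST once the URL would exceed a length limit. The names RequestBuilder, Build, and MaxUrlLength are hypothetical, the 2048 threshold simply mirrors the approximate figure above, and it assumes the server accepts a POST on the same route.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;

public static class RequestBuilder
{
    // Assumed threshold, taken from the "approx 2048 bytes" figure above.
    public const int MaxUrlLength = 2048;

    // Returns a GET with a query string while the URL stays short enough,
    // otherwise a POST carrying the same pairs as a form-encoded body.
    public static HttpRequestMessage Build(string baseUrl, IDictionary<string, string> parameters)
    {
        var query = string.Join("&", parameters.Select(
            p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
        var getUrl = baseUrl + "?" + query;

        // The escaped URL is pure ASCII, so its character count equals its byte count.
        if (getUrl.Length <= MaxUrlLength)
        {
            return new HttpRequestMessage(HttpMethod.Get, getUrl);
        }

        return new HttpRequestMessage(HttpMethod.Post, baseUrl)
        {
            Content = new FormUrlEncodedContent(parameters)
        };
    }
}
```

The returned message can be passed to HttpClient.SendAsync as in the question's code. Whether a POST is acceptable depends on the target server, so this only applies to endpoints you control or that document POST support.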

别忘想泡老子
#3 · 2020-08-22 04:02

This appears to be an issue with the accepted languages. I got a 200 response when using the following Accept-Language header value:

_httpClient.DefaultRequestHeaders.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ru;q=0.4");
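
To check whether Accept-Language really is the deciding factor for a given URL, one option is to send the same request with several header values and compare the status codes. The sketch below is only an illustration: AcceptLanguageProbe, CreateRequest, and ProbeAsync are hypothetical names, and it assumes the HttpClient passed in does not already set Accept-Language as a default header.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class AcceptLanguageProbe
{
    // Builds a GET request that carries its own Accept-Language header.
    public static HttpRequestMessage CreateRequest(string url, string acceptLanguage)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Add("Accept-Language", acceptLanguage);
        return request;
    }

    // Sends one request per candidate value and prints the resulting status code.
    public static async Task ProbeAsync(HttpClient client, string url, params string[] candidates)
    {
        foreach (var candidate in candidates)
        {
            using (var request = CreateRequest(url, candidate))
            using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
            {
                Console.WriteLine("{0} -> {1}", candidate, (int)response.StatusCode);
            }
        }
    }
}
```

Calling ProbeAsync with the question's URL and both the sv-SE value from the question and the en-GB value above would show whether the server responds differently to each; the outcome depends on the live site.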


P.S. I assume you know that in your example _client should read _httpClient in the UrlTester constructor, or it won't build.
