Using WebClient in C# is there a way to get the UR

2019-01-06 14:03发布

Using the WebClient class I can get the title of a website easily enough:

WebClient x = new WebClient();    
string source = x.DownloadString(s);
string title = Regex.Match(source, 
    @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
    RegexOptions.IgnoreCase).Groups["Title"].Value;

I want to store the URL and the page title. However when following a link such as:

http://tinyurl.com/dbysxp

I'm clearly going to want to get the Url I'm redirected to.

QUESTIONS

Is there a way to do this using the WebClient class?

How would I do it using HttpResponse and HttpRequest?

标签: c# .net regex http
8条回答
闹够了就滚
2楼-- · 2019-01-06 14:14

I know this is already an answered question, but this works pretty to me:

 HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://tinyurl.com/dbysxp");
 request.AllowAutoRedirect = false;
 HttpWebResponse response = (HttpWebResponse)request.GetResponse();
 string redirUrl = response.Headers["Location"];
 response.Close();

 //Show the redirected url
 MessageBox.Show("You're being redirected to: "+redirUrl);

Cheers.! ;)

查看更多
Deceive 欺骗
3楼-- · 2019-01-06 14:16

With an HttpWebRequest, you would set the AllowAutoRedirect property to false. When this happens, any response with a status code between 300-399 will not be automatically redirected.

You can then get the new url from the response headers and then create a new HttpWebRequest instance to the new url.

With the WebClient class, I doubt you can change it out-of-the-box so that it does not allow redirects. What you could do is derive a class from the WebClient class and then override the GetWebRequest and the GetWebResponse methods to alter the WebRequest/WebResponse instances that the base implementation returns; if it is an HttpWebRequest, then set the AllowAutoRedirect property to false. On the response, if the status code is in the range of 300-399, then issue a new request.

However, I don't know that you can issue a new request from within the GetWebRequest/GetWebResponse methods, so it might be better to just have a loop that executes with HttpWebRequest/HttpWebResponse until all the redirects are followed.

查看更多
够拽才男人
4楼-- · 2019-01-06 14:16

In case you are only interested in the redirect URI you can use this code:

public static string GetRedirectUrl(string url)
{
     HttpWebRequest request = (HttpWebRequest) HttpWebRequest.Create(url);
     request.AllowAutoRedirect = false;

     using (HttpWebResponse response = HttpWebResponse)request.GetResponse())
     {
         return response.Headers["Location"];
     }
}

The method will return

  • null - in case of no redirect
  • a relative url - in case of a redirect

Please note: The using statement (or a final response.close()) is essential. See MSDN Library for details. Otherwise you may run out of connections or get a timeout when executing this code multiple times.

查看更多
The star\"
5楼-- · 2019-01-06 14:18

Ok this is really hackish, but the key is to use the HttpWebRequest and then set the AllowAutoRedirect property to true.

Here's a VERY hacked together example

        HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://tinyurl.com/dbysxp");
        req.Method = "GET";
        req.AllowAutoRedirect = true;
        WebResponse response = req.GetResponse();

        response.GetResponseStream();
        Stream responseStream = response.GetResponseStream();

        // Content-Length header is not trustable, but makes a good hint.
        // Responses longer than int size will throw an exception here!
        int length = (int)response.ContentLength;

        const int bufSizeMax = 65536; // max read buffer size conserves memory
        const int bufSizeMin = 8192;  // min size prevents numerous small reads

        // Use Content-Length if between bufSizeMax and bufSizeMin
        int bufSize = bufSizeMin;
        if (length > bufSize)
            bufSize = length > bufSizeMax ? bufSizeMax : length;

        StringBuilder sb;
        // Allocate buffer and StringBuilder for reading response
        byte[] buf = new byte[bufSize];
        sb = new StringBuilder(bufSize);

        // Read response stream until end
        while ((length = responseStream.Read(buf, 0, buf.Length)) != 0)
            sb.Append(Encoding.UTF8.GetString(buf, 0, length));

        string source = sb.ToString();string title = Regex.Match(source, 
        @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",RegexOptions.IgnoreCase).Groups["Title"].Value;

enter code here

查看更多
Rolldiameter
6楼-- · 2019-01-06 14:30

I got the Uri for the redirected page and the page contents.

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(strUrl);
request.AllowAutoRedirect = true;

HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream dataStream = response.GetResponseStream();

strLastRedirect = response.ResponseUri.ToString();

StreamReader reader = new StreamReader(dataStream);              
string strResponse = reader.ReadToEnd();

response.Close();
查看更多
不美不萌又怎样
7楼-- · 2019-01-06 14:30

The WebClient class has an option to follow redirects. Set that option and you should be fine.

查看更多
登录 后发表回答