I'm trying to fetch the actual (redirected) URL behind a link produced by a URL shortener.
Let's take Twitter's URL shortener as an example. I'm able to get the response object, and I have also parsed it to get the document.
Response response = Jsoup.connect("http://t.co/i5dE1K4vSs")
.followRedirects(true) //to follow redirects
.execute();
Now, assuming a single redirect, where do I get the final URL from? Is there a method or strategy to achieve this?
The Response object has a url() method, which should give you the final URL. So you could do something like this:
String url = "http://t.co/i5dE1K4vSs";
Response response = Jsoup.connect(url).followRedirects(true).execute();
System.out.println(response.url());
If you want to get the intermediate redirects, turn off redirect following and check the "location" header instead. For example:
String url = "http://t.co/i5dE1K4vSs";
Response response = Jsoup.connect(url).followRedirects(false).execute();
System.out.println(response.header("location"));
If there are multiple redirects, you need to repeat this for each returned URL (recursively or in a loop) until the response no longer carries a "location" header.
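The hop-by-hop approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: RedirectTracer, trace, locationOf and maxHops are hypothetical names, not part of jsoup. The header lookup is passed in as a function so that the loop itself needs nothing beyond the JDK; relative "location" values are resolved against the current URL, and a hop limit plus a loop check guard against endless redirect chains.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of jsoup.
final class RedirectTracer {
    private RedirectTracer() {}

    /**
     * Follows redirects one hop at a time and returns every URL visited,
     * starting with startUrl and ending with the final destination.
     *
     * locationOf (an assumption of this sketch) must return the value of the
     * "location" header for the given URL, or null if it is not a redirect.
     */
    static List<String> trace(String startUrl,
                              Function<String, String> locationOf,
                              int maxHops) {
        List<String> chain = new ArrayList<>();
        String current = startUrl;
        chain.add(current);
        while (true) {
            String location = locationOf.apply(current);
            if (location == null || location.isEmpty()) {
                return chain; // no further redirect: current is the final URL
            }
            if (chain.size() > maxHops) {
                throw new IllegalStateException("More than " + maxHops + " redirects");
            }
            // The header may hold a relative URL, so resolve it against the current one.
            String next = URI.create(current).resolve(location).toString();
            if (chain.contains(next)) {
                throw new IllegalStateException("Redirect loop at " + next);
            }
            chain.add(next);
            current = next;
        }
    }
}
```

With jsoup, locationOf would wrap the call shown above, i.e. Jsoup.connect(u).followRedirects(false).execute().header("location"), with the checked IOException caught and rethrown as an unchecked exception inside the lambda. The last element of the returned list is then the final URL.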
Code:
String originalUrl = Jsoup.connect("http://t.co/i5dE1K4vSs")
.followRedirects(true) //to follow redirects
.execute().url().toExternalForm();
System.out.println(originalUrl);
Output:
http://ibnlive.in.com/news/messi-considered-move-to-arsenal/487799-5-21.html
Explanation:
Because Connection.Response has Connection.Base as its superinterface, you can simply call its url() method and then use the returned URL object however you like.
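As a small illustration of "using the URL object", the sketch below pulls a few parts out of a java.net.URL. UrlParts and parse are hypothetical names, and the URL is built from a string here only so the example runs without a network call; in practice you would pass in the value returned by response.url().

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper for illustration; not part of jsoup.
final class UrlParts {
    private UrlParts() {}

    /** Builds a URL from an absolute URL string, rethrowing the checked exception unchecked. */
    static URL parse(String url) {
        try {
            return URI.create(url).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + url, e);
        }
    }

    /** Returns "host path", two of the parts a caller might want from the final URL. */
    static String hostAndPath(URL url) {
        return url.getHost() + " " + url.getPath();
    }
}
```

For the URL in the output above, getHost() gives ibnlive.in.com and toExternalForm() gives back the full string.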