I can't understand why Java's HttpURLConnection doesn't follow the redirect. I use the following code to get this page:
import java.net.URL;
import java.net.HttpURLConnection;
import java.io.InputStream;

public class Tester {
    public static void main(String[] argv) throws Exception {
        InputStream is = null;
        try {
            String bitlyUrl = "http://bit.ly/4hW294";
            URL resourceUrl = new URL(bitlyUrl);
            HttpURLConnection conn = (HttpURLConnection) resourceUrl.openConnection();
            conn.setConnectTimeout(15000);
            conn.setReadTimeout(15000);
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; ru; rv:1.9.0.11) Gecko/2009060215 Firefox/3.0.11 (.NET CLR 3.5.30729)");
            conn.connect();
            is = conn.getInputStream();
            String res = conn.getURL().toString();
            if (res.toLowerCase().contains("bit.ly"))
                System.out.println("bit.ly is after resolving: " + res);
        }
        catch (Exception e) {
            System.out.println("error happened: " + e.toString());
        }
        finally {
            if (is != null) is.close();
        }
    }
}
Moreover, I see the following request and response (they seem absolutely right!):
GET /4hW294 HTTP/1.1
Host: bit.ly
Connection: Keep-Alive
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; ru-RU; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)
HTTP/1.1 301 Moved
Server: nginx/0.7.42
Date: Thu, 10 Dec 2009 20:28:44 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Location: https://www.myganocafe.com/CafeMacy
MIME-Version: 1.0
Content-Length: 297
Unfortunately, the res variable contains the same URL, and the stream contains the following (obviously, Java's HttpURLConnection doesn't follow the redirect!):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
<HEAD>
<TITLE>Moved</TITLE>
</HEAD>
<BODY>
<H2>Moved</H2>
<A HREF="https://www.myganocafe.com/CafeMacy">The requested URL has moved here.</A>
<P ALIGN=RIGHT><SMALL><I>AOLserver/4.5.1 on http://127.0.0.1:7400</I></SMALL></P>
</BODY>
</HTML>
HttpURLConnection is not responsible for interpreting the response. It is performing as expected: it fetches the content of the URL that was requested. It is up to you, the user of the API, to interpret the response. It cannot guess the developer's intentions unless they are specified.
HttpURLConnection by design won't automatically redirect from HTTP to HTTPS (or vice versa). Following the redirect may have serious security consequences. SSL (hence HTTPS) creates a session that is unique to the user. This session can be reused for multiple requests. Thus, the server can track all of the requests made from a single person. This is a weak form of identity and is exploitable. Also, the SSL handshake can ask for the client's certificate. If sent to the server, then the client's identity is given to the server.
As erickson points out, suppose the application is set up to perform client authentication automatically. The user expects to be surfing anonymously because he's using HTTP. But if his client follows HTTPS without asking, his identity is revealed to the server.
With that understood, here's the code which will follow the redirects.
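A minimal sketch of what such code could look like, assuming plain GET requests and using only java.net. The class name RedirectFollower, its method names, and the maxRedirects parameter are illustrative and not taken from any answer here. It turns off the built-in redirect handling and follows each Location header itself, which also covers a hop from HTTP to HTTPS:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URISyntaxException;
import java.net.URL;

public class RedirectFollower {

    /** True for the status codes that normally carry a Location header. */
    public static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Resolves a Location value, which may be relative, against the current URL. */
    public static URL resolveLocation(URL current, String location)
            throws URISyntaxException, IOException {
        return current.toURI().resolve(location).toURL();
    }

    /** Issues GET requests, following at most maxRedirects redirects (HTTP <-> HTTPS included). */
    public static HttpURLConnection openFollowingRedirects(URL url, int maxRedirects)
            throws IOException, URISyntaxException {
        URL current = url;
        for (int hop = 0; hop <= maxRedirects; hop++) {
            HttpURLConnection conn = (HttpURLConnection) current.openConnection();
            conn.setConnectTimeout(15000);
            conn.setReadTimeout(15000);
            conn.setInstanceFollowRedirects(false); // handle every hop ourselves
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                return conn; // final response; the caller reads and closes the stream
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header from " + current);
            }
            current = resolveLocation(current, location);
        }
        throw new IOException("Too many redirects starting at " + url);
    }
}
```

Note that this sketch follows a protocol change unconditionally, so the security concern described above still applies; whether that is acceptable is a decision for the calling application.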
As mentioned by some of you above, setFollowRedirects and setInstanceFollowRedirects only work automatically when the redirect stays on the same protocol, i.e. from HTTP to HTTP or from HTTPS to HTTPS.
setFollowRedirects is static (class level) and sets the behavior for all instances of HttpURLConnection, whereas setInstanceFollowRedirects applies only to a given instance. This way we can have different behavior for different instances.
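To illustrate the difference between the two flags, here is a small sketch. The class name FollowRedirectFlags and the helper openWithoutAutoRedirect are made up for this example; the getters and setters are the standard HttpURLConnection ones. Nothing is sent over the network, because openConnection() only creates the connection object:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

public class FollowRedirectFlags {

    /** Creates (without connecting) a connection whose own redirect flag is off. */
    public static HttpURLConnection openWithoutAutoRedirect(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setInstanceFollowRedirects(false); // affects only this object
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URL url = URI.create("http://bit.ly/4hW294").toURL();
        HttpURLConnection manual = openWithoutAutoRedirect(url);
        HttpURLConnection untouched = (HttpURLConnection) url.openConnection();

        // The static flag is the default that every new instance starts with.
        System.out.println("class-wide default: " + HttpURLConnection.getFollowRedirects());
        System.out.println("manual instance:    " + manual.getInstanceFollowRedirects());
        System.out.println("untouched instance: " + untouched.getInstanceFollowRedirects());
    }
}
```

Changing the flag on one instance leaves other connections at the class-wide default, which is why the per-instance setter is usually the safer choice inside a larger application.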
I found a very good example here: http://www.mkyong.com/java/java-httpurlconnection-follow-redirect-example/
Another option can be to use Apache HttpComponents Client:
Sample code:
Has something called
HttpURLConnection.setFollowRedirects(false)
by any chance? You could always call
conn.setInstanceFollowRedirects(true)
if you want to make sure you don't affect the rest of the behaviour of the app.
I don't think that it will automatically redirect from HTTP to HTTPS (or vice-versa).
Even though we know it mirrors HTTP, from the HTTP protocol point of view, HTTPS is just some other, completely different, unknown protocol. It would be unsafe to follow the redirect without user approval.
For example, suppose the application is set up to perform client authentication automatically. The user expects to be surfing anonymously because he's using HTTP. But if his client follows HTTPS without asking, his identity is revealed to the server.
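One way to express "don't follow a protocol change without approval" in code is sketched below. The class RedirectPolicy, its methods, and the approval callback are hypothetical names introduced for illustration; in a real application the callback might prompt the user or consult a whitelist:

```java
import java.net.URI;
import java.net.URL;
import java.util.function.Predicate;

public class RedirectPolicy {

    /** True when the redirect stays on the same protocol (http to http, https to https). */
    public static boolean sameProtocol(URL from, URL to) {
        return from.getProtocol().equalsIgnoreCase(to.getProtocol());
    }

    /**
     * Same-protocol redirects are allowed. A protocol change is followed only
     * if the supplied approval callback (an assumption of this sketch) accepts the target.
     */
    public static boolean mayFollow(URL from, URL to, Predicate<URL> approval) {
        return sameProtocol(from, to) || approval.test(to);
    }

    public static void main(String[] args) throws Exception {
        URL from = URI.create("http://bit.ly/4hW294").toURL();
        URL to = URI.create("https://www.myganocafe.com/CafeMacy").toURL();
        // Refusing every protocol change mirrors the built-in behaviour described above.
        System.out.println(mayFollow(from, to, target -> false));
    }
}
```

A check like this could be called before each hop when redirects are followed manually, so that the decision to cross from HTTP to HTTPS stays explicit.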