JDOM2 - Follow Redirects (HTTP Error 301)

Posted 2019-08-13 14:17

Question:

I'm currently working on a third-party program for a website, using the site's public XML API. I won't go into what the program actually does, because the problem shows up right at the start. The website's API expects a client to follow redirects and to send a proper user agent that identifies the application, but the JDOM2 library, which I use for this project, doesn't seem to do either of these things. Neither the SAXBuilder (org.jdom2.input) that ships with the package nor the native HttpURLConnection (java.net) class seems to do the job.

I'm very confused and don't know where to start at all. Is there any way to make the JDOM2 library follow redirects or am I just missing a simple method call?

Answer 1:

JDOM uses the URL given to the SAXBuilder to create a URLConnection, and from that connection it opens an input stream to read the XML content.

While I understand that the HTTP protocol has redirect functionality, following a redirect is something the client has to do. Consider this:

# curl -i 'http://stackoverflow.com/questions/24913206'

HTTP/1.1 301 Moved Permanently
Cache-Control: public, no-cache="Set-Cookie", max-age=60
Content-Type: text/html; charset=utf-8
Expires: Wed, 23 Jul 2014 18:44:06 GMT
Last-Modified: Wed, 23 Jul 2014 18:43:06 GMT
Location: /questions/24913206/jdom2-follow-redirects-http-error-301
Vary: *
X-Frame-Options: SAMEORIGIN
Set-Cookie: prov=xxxx.yyyy.zzzz; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
Date: Wed, 23 Jul 2014 18:43:05 GMT
Content-Length: 174

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="/questions/24913206/jdom2-follow-redirects-http-error-301">here</a>.</h2>
</body></html>

When JDOM builds from the URL http://stackoverflow.com/questions/24913206, the data it is given is the HTTP 301 redirect to http://stackoverflow.com/questions/24913206/jdom2-follow-redirects-http-error-301, together with the HTML content that makes that redirect human-readable.
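To look at the same response from Java rather than curl, here is a minimal sketch that switches automatic redirects off and prints the status code and the Location header. The class name RedirectProbe and the helper resolveLocation are made up for this example; the helper resolves a relative Location (like the one in the curl output) against the URL that was requested.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

public class RedirectProbe {

    // Resolve a (possibly relative) Location header against the requested URL.
    static URL resolveLocation(URL requested, String location) throws Exception {
        return requested.toURI().resolve(location).toURL();
    }

    public static void main(String[] args) throws Exception {
        URL url = URI.create("http://stackoverflow.com/questions/24913206").toURL();
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Inspect the redirect instead of following it.
        conn.setInstanceFollowRedirects(false);
        int status = conn.getResponseCode();
        String location = conn.getHeaderField("Location");
        System.out.println("Status:   " + status);
        System.out.println("Location: " + location);
        if (location != null) {
            System.out.println("Resolved: " + resolveLocation(url, location));
        }
        conn.disconnect();
    }
}
```

What the server answers today may differ from the 2014 curl output above, so no expected output is shown here.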

Now, the URL handling API for Java just returns the input stream for JDOM. What you are suggesting is that JDOM should interpret that stream, and automatically redirect.

There are a few problems with this.

  • JDOM does not even know it is an HTTP URL. The source is often a file name, an FTP URL, etc.
  • What if you did not want to follow the redirect?
  • etc.
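The first point can be illustrated with a small sketch: URL.openConnection() returns a generic URLConnection, and only http/https URLs produce an HttpURLConnection that has any notion of redirects. The class ConnectionKind, the method isHttp and the example URLs are hypothetical.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLConnection;

public class ConnectionKind {

    // True only when the URL's protocol handler returns an HTTP(S) connection.
    static boolean isHttp(String url) throws Exception {
        // openConnection() only creates the connection object; nothing is sent yet.
        URLConnection conn = URI.create(url).toURL().openConnection();
        return conn instanceof HttpURLConnection;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isHttp("http://example.com/data.xml")); // true
        System.out.println(isHttp("file:///tmp/data.xml"));        // false
    }
}
```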

The other issue is that this should be handled either natively by Java or explicitly by the application.

What are the real solutions:

  1. Tell all HTTP requests in your application to follow redirects using: HttpURLConnection.setFollowRedirects(true)
  2. Don't give JDOM a raw URL to build from, but process it yourself:

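    // imports: java.net.URL, java.net.HttpURLConnection, org.jdom2.Document
    URL httpurl = new URL(.....);
    HttpURLConnection conn = (HttpURLConnection) httpurl.openConnection();
    conn.setInstanceFollowRedirects(true);   // applies to this connection only
    conn.connect();
    // saxBuilder is an existing org.jdom2.input.SAXBuilder
    Document doc = saxBuilder.build(conn.getInputStream());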
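Two caveats, as far as I know: HttpURLConnection already follows redirects by default, but it will not follow a redirect that switches protocol (for example http to https), and it does not send an application-specific user agent unless you set one. A rough sketch that handles both by hand is below. It uses only java.net, assumes every hop is an http or https URL, and the names RedirectingFetcher, isRedirect and open are invented for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RedirectingFetcher {

    // Status codes treated as redirects in this sketch.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    // Sends the given User-Agent and follows up to maxRedirects hops manually.
    static InputStream open(URL url, String userAgent, int maxRedirects) throws Exception {
        URL current = url;
        for (int hop = 0; hop <= maxRedirects; hop++) {
            // Assumption: every hop is http or https, otherwise this cast fails.
            HttpURLConnection conn = (HttpURLConnection) current.openConnection();
            conn.setInstanceFollowRedirects(false); // redirects are handled below
            conn.setRequestProperty("User-Agent", userAgent);
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                return conn.getInputStream(); // throws IOException on 4xx/5xx
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header from " + current);
            }
            // Location may be relative, so resolve it against the current URL.
            current = current.toURI().resolve(location).toURL();
        }
        throw new IOException("More than " + maxRedirects + " redirects for " + url);
    }
}
```

The returned stream can then be handed to JDOM the same way as in the snippet above, e.g. saxBuilder.build(RedirectingFetcher.open(httpurl, "MyApp/1.0", 5)), where "MyApp/1.0" is only a placeholder user agent.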