ImageIO.read() returns 403 error

Posted 2019-02-24 06:37

Question:

I have the following code:

public BufferedImage urlToImage(String imageUrl) throws MalformedURLException, IOException {
    URL url = new URL(imageUrl);
    BufferedImage image = ImageIO.read(url);
    return image;
}

That is supposed to return an image from a given URL.

I tested with these two randomly chosen URLs:

  • https://www.google.co.ma/images/srpr/logo4w.png
  • http://www.earthtimes.org/newsimage/osteoderms-storing-minerals-helped-huge-dinosaurs-survive_3011.jpg

The first one works fine, but the second gives a 403 error:

Caused by: java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.earthtimes.org/newsimage/osteoderms-storing-minerals-helped-huge-dinosaurs-survive_3011.jpg
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
at java.net.URL.openStream(URL.java:1010)
at javax.imageio.ImageIO.read(ImageIO.java:1367)

What could be causing this error? Thanks.

Answer 1:

The ImageIO.read(URL) method opens a URL connection with pretty much all default settings, including the User-Agent header, which defaults to a string of the form Java/<version> for the JVM you are running on. Apparently, the site you listed expects a more 'standard' UA. Testing with a plain telnet connection:

Request sent by ImageIO.read(url):

GET /newsimage/osteoderms-storing-minerals-helped-huge-dinosaurs-survive_3011.jpg HTTP/1.1
User-Agent: Java/1.7.0_17
Host: www.earthtimes.org
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

Response code is 404 (for me at least), with a default text/html page being returned.

Request sent by 'standard' browser:

GET /newsimage/osteoderms-storing-minerals-helped-huge-dinosaurs-survive_3011.jpg HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.65 Safari/537.31
Host: www.earthtimes.org
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

Response code is 200, with the image data.
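The same comparison can be reproduced from Java instead of telnet. The following is only a minimal sketch and not part of the original test: the class name UserAgentProbe and the method statusFor are made up for illustration. It sends a GET request with an optional User-Agent override and returns the HTTP status code, so the default and browser-like requests can be compared side by side.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class UserAgentProbe {
    /**
     * Sends a GET request to the given URL and returns the HTTP status code.
     * If userAgent is null, the JVM's default User-Agent header is sent.
     * (Hypothetical helper, named for illustration only.)
     */
    public static int statusFor(URL url, String userAgent) throws IOException {
        // Assumes an http:// or https:// URL; other schemes would fail the cast.
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        try {
            if (userAgent != null) {
                connection.setRequestProperty("User-Agent", userAgent);
            }
            // getResponseCode() performs the request and returns the status;
            // unlike getInputStream(), it does not throw for a 4xx/5xx response.
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
```

Calling statusFor(url, null) and then statusFor(url, "<a browser UA string>") against the same URL should show whether the server filters on User-Agent; the codes you actually see depend on the server.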

The following fix makes your code a little longer, but gets around the problem by setting a more 'standard' UA:

final String urlStr = "http://www.earthtimes.org/newsimage/osteoderms-storing-minerals-helped-huge-dinosaurs-survive_3011.jpg";
final URL url = new URL(urlStr);
final HttpURLConnection connection = (HttpURLConnection) url.openConnection();
// Replace the default Java/<version> User-Agent with a browser-like one.
connection.setRequestProperty(
    "User-Agent",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.65 Safari/537.31");
final BufferedImage image;
// ImageIO.read(InputStream) does not close the stream, so close it here.
try (InputStream in = connection.getInputStream()) {
    image = ImageIO.read(in);
}
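For completeness, here is one way the fix might be folded back into the urlToImage method from the question. This is a sketch under assumptions rather than a definitive implementation: the class name ImageFetcher, the extra userAgent parameter, and the BROWSER_USER_AGENT constant are all introduced here for illustration. It also checks for the null that ImageIO.read returns when no registered reader can decode the data.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.imageio.ImageIO;

public class ImageFetcher {
    // Illustrative default only; any UA string the target server accepts would do.
    public static final String BROWSER_USER_AGENT =
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.31 "
        + "(KHTML, like Gecko) Chrome/26.0.1410.65 Safari/537.31";

    /**
     * Downloads and decodes an image, sending the given User-Agent header
     * instead of the JVM default. Assumes an http:// or https:// URL.
     */
    public static BufferedImage urlToImage(String imageUrl, String userAgent)
            throws IOException {
        HttpURLConnection connection =
            (HttpURLConnection) new URL(imageUrl).openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        // getInputStream() still throws IOException on a 4xx/5xx response.
        try (InputStream in = connection.getInputStream()) {
            BufferedImage image = ImageIO.read(in);
            if (image == null) {
                // ImageIO.read returns null when no ImageReader can decode the data.
                throw new IOException("Could not decode an image from " + imageUrl);
            }
            return image;
        } finally {
            connection.disconnect();
        }
    }
}
```

A call such as ImageFetcher.urlToImage(imageUrl, ImageFetcher.BROWSER_USER_AGENT) would then replace the original urlToImage(imageUrl). Whether a given server accepts this particular UA string is something to verify case by case.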