Download an image from a URL?

2020-02-09 01:37发布

问题:

I am trying to use HTTP::get to download an image of a Google chart from a URL I created.

This was my first attempt:

failures_url  = [title, type, data, size, colors, labels].join("&")

require 'net/http'

Net::HTTP.start("http://chart.googleapis.com") { |http|
  resp = http.get("/chart?#{failures_url")
  open("pie.png" ,"wb") { |file|
    file.write(resp.body)
  }
}

Which produced only an empty PNG file.

For my second attempt I used the value stored inside failure_url inside the http.get() call.

require 'net/http'

Net::HTTP.start("http://chart.googleapis.com") { |http|
  resp = http.get("/chart?chtt=Builds+in+the+last+12+months&cht=bvg&chd=t:296,1058,1217,1615,1200,611,2055,1663,1746,1950,2044,2781,1553&chs=800x375&chco=4466AA&chxl=0:|Jul-2010|Aug-2010|Sep-2010|Oct-2010|Nov-2010|Dec-2010|Jan-2011|Feb-2011|Mar-2011|Apr-2011|May-2011|Jun-2011|Jul-2011|2:|Months|3:|Builds&chxt=x,y,x,y&chg=0,6.6666666666666666666666666666667,5,5,0,0&chxp=3,50|2,50&chbh=23,5,30&chxr=1,0,3000&chds=0,3000")
  open("pie.png" ,"wb") { |file|
    file.write(resp.body)
  }
}

And, for some reason, this version works even though the first attempt had the same data inside the http.get() call. Does anyone know why this is?

SOLUTION:

After trying to figure why this is happening I found "How do I download a binary file over HTTP?".

One of the comments mentions removing http:// in the Net::HTTP.start(...) call otherwise it won't succeed. Sure enough after I did this:

failures_url  = [title, type, data, size, colors, labels].join("&")

require 'net/http'

Net::HTTP.start("chart.googleapis.com") { |http|
  resp = http.get("/chart?#{failures_url")
  open("pie.png" ,"wb") { |file|
    file.write(resp.body)
  }
}

it worked.

回答1:

I'd go after the file using Ruby's Open::URI:

require "open-uri"

File.open('pie.png', 'wb') do |fo|
  fo.write open("http://chart.googleapis.com/chart?#{failures_url}").read 
end

The reason I prefer Open::URI is it handles redirects automatically, so WHEN Google makes a change to their back-end and tries to redirect the URL, the code will handle it magically. It also handles timeouts and retries more gracefully if I remember right.

If you must have lower level control then I'd look at one of the many other HTTP clients for Ruby; Net::HTTP is fine for creating new services or when a client doesn't exist, but I'd use Open::URI or something besides Net::HTTP until the need presents itself.


The URL:

http://chart.googleapis.com/chart?chtt=Builds+in+the+last+12+months&cht=bvg&chd=t:296,1058,1217,1615,1200,611,2055,1663,1746,1950,2044,2781,1553&chs=800x375&chco=4466AA&chxl=0:|Jul-2010|Aug-2010|Sep-2010|Oct-2010|Nov-2010|Dec-2010|Jan-2011|Feb-2011|Mar-2011|Apr-2011|May-2011|Jun-2011|Jul-2011|2:|Months|3:|Builds&chxt=x,y,x,y&chg=0,6.6666666666666666666666666666667,5,5,0,0&chxp=3,50|2,50&chbh=23,5,30&chxr=1,0,3000&chds=0,3000

makes URI upset. I suspect it is seeing characters that should be encoded in URLs.

For documentation purposes, here is what URI says when trying to parse that URL as-is:

URI::InvalidURIError: bad URI(is not URI?)

If I encode the URI first, I get a successful parse. Testing further using Open::URI shows it is able to retrieve the document at that point and returns 23701 bytes.

I think that is the appropriate fix for the problem if some of those characters are truly not acceptable to URI AND they are out of the RFC.

Just for information, the Addressable::URI gem is a great replacement for the built-in URI.



回答2:

    resp = http.get("/chart?#{failures_url")

If you copied your original code then you're missing a closing curly bracket in your path string.



回答3:

Your original version did not have the parameter name for each parameter, just the data. For example, on the title, you cannot just submit "Builds+in+the+last+12+months", but instead it must be "chtt=Builds+in+the+last+12+months".

Try this:

failures_url  = ["title="+title, "type="+type, "data="+data, "size="+size, "colors="+colors, "labels="+labels].join("&")