Verifying a remote image is actually an image file

Posted 2019-04-05 15:31

Question:

I'm trying to figure out how to verify that what I'm feeding into CarrierWave is actually an image. The source I get my image URLs from doesn't return only live URLs; some of the images no longer exist. Unfortunately, it doesn't return the right status codes either: I was using some code to check whether the remote file exists, and dead URLs were passing that check. So, to be on the safe side, I'd like a way to verify that I'm getting back a valid image file before I go ahead and download it.

Here is the remote file check I was using, for reference, but I'd prefer something that can actually identify the files as images.

require 'open-uri'
require 'net/http'

def remote_file_exists?(url)
    url = URI.parse(url)
    Net::HTTP.start(url.host, url.port) do |http|
      return http.head(url.request_uri).code == "200"
    end
end

Answer 1:

I would check whether the service returns the proper MIME type in the Content-Type HTTP header.

For example, the Content-Type of the Stack Overflow homepage is text/html; charset=utf-8, and the Content-Type of your Gravatar image is image/png.

To check the Content-Type header for an image in Ruby using Net::HTTP, you could use the following:

def remote_file_exists?(url)
  url = URI.parse(url)
  Net::HTTP.start(url.host, url.port) do |http|
    # to_s guards against a missing Content-Type header, which would be nil
    return http.head(url.request_uri)['Content-Type'].to_s.start_with?('image/')
  end
end
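
The header test itself can be pulled out into a small predicate so it can be tried without making any request. This is only a sketch: `image_content_type?` is a made-up helper name, not something from Net::HTTP, and it assumes that any media type starting with image/ counts as an image.

```ruby
# Hypothetical helper (not part of any library): decides whether a raw
# Content-Type header value names an image media type.
def image_content_type?(header_value)
  # A missing header arrives as nil; to_s turns it into "" so nothing raises.
  # Parameters such as "; charset=utf-8" are dropped before the comparison.
  media_type = header_value.to_s.split(';').first.to_s.strip.downcase
  media_type.start_with?('image/')
end
```

It could then be called as `image_content_type?(http.head(url.request_uri)['Content-Type'])` inside the Net::HTTP block. Keep in mind that this only tells you what the server claims to be sending.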


Answer 2:

Rick Button's answer worked for me, but I needed to add SSL support:

def self.remote_image_exists?(url)
  url = URI.parse(url)
  http = Net::HTTP.new(url.host, url.port)
  http.use_ssl = (url.scheme == "https")

  http.start do |conn|
    # to_s guards against a missing Content-Type header, which would be nil
    return conn.head(url.request_uri)['Content-Type'].to_s.start_with?('image/')
  end
end


Answer 3:

I ended up using HTTParty for this. The Net::HTTP request from Rick Button's answer kept timing out.

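  require 'httparty'

  def remote_file_exists?(url)
    response = HTTParty.get(url)
    # The parentheses matter: `&& x.start_with? 'image'` without them does not parse
    response.code == 200 && response.headers['Content-Type'].to_s.start_with?('image/')
  end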

https://github.com/jnunemaker/httparty
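
If you would rather stay with Net::HTTP, the timeouts can also be handled by setting explicit limits and rescuing the timeout errors. The following is a rough sketch, not a tested drop-in: `build_http` and `remote_image?` are hypothetical names, and the 5-second default is an arbitrary assumption.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a Net::HTTP client with explicit timeouts.
# The 5-second default is an assumption, not a recommendation.
def build_http(url, timeout: 5)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.open_timeout = timeout # seconds to wait for the connection to open
  http.read_timeout = timeout # seconds to wait for each read from the socket
  http
end

# Hypothetical helper: returns false instead of raising when the server is
# slow or unreachable. TLS errors and malformed URLs are not handled here.
def remote_image?(url, timeout: 5)
  response = build_http(url, timeout: timeout).head(URI.parse(url).request_uri)
  response.is_a?(Net::HTTPSuccess) && response['Content-Type'].to_s.start_with?('image/')
rescue Net::OpenTimeout, Net::ReadTimeout, SocketError, SystemCallError
  false
end
```

Because this sends a HEAD request, the image body is not downloaded just to read the header.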