How to get a remote-file's mtime before downloading

Posted 2020-05-28 18:24

Question:

I have the code below, which simply downloads a file and saves it. I want to run it every 30 seconds, check whether the remote file's mtime has changed, and download the file only if it has. For that purpose I'll create a thread that sleeps 30 seconds after every iteration of an endless loop. But how do I check a remote file's mtime without downloading it?

Net::HTTP.start($xmlServerHostname) { |http|
  resp = http.get($xmlServerPath + "levels.xml")
  open("levels.xml", "w") { |file|
    file.write(resp.body)
  }
}

Answer 1:

Before you do your http.get, do an http.head, which requests just the headers without downloading the body (i.e. the file contents). Then check whether the value of the Last-Modified header has changed.

e.g.

resp = http.head($xmlServerPath + "levels.xml")
last_modified = resp['last-modified']
if last_modified != previous_last_modified
  # file has changed
end
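
To tie this back to the 30-second polling loop from the question, here is a minimal sketch of how the HEAD check and the download could be combined. The helper name download_if_changed and the host example.com in the commented usage are made up for illustration, and the sketch assumes the server actually sends a Last-Modified header (if it does not, the file is simply downloaded every time).

```ruby
require 'net/http'

# Hypothetical helper: issue a HEAD request first and download the body
# only when Last-Modified differs from the value seen last time.
# Returns the Last-Modified value to remember for the next call.
def download_if_changed(http, remote_path, local_path, previous_last_modified)
  current = http.head(remote_path)['last-modified']

  # Treat the file as unchanged only when the server sent a value
  # and it equals the one we stored earlier.
  return previous_last_modified if current && current == previous_last_modified

  File.binwrite(local_path, http.get(remote_path).body)
  current
end

# Possible usage inside the polling thread (example.com is a placeholder):
#
#   previous = nil
#   loop do
#     Net::HTTP.start('example.com') do |http|
#       previous = download_if_changed(http, '/levels.xml', 'levels.xml', previous)
#     end
#     sleep 30
#   end
```

Comparing the header as an opaque string keeps the check simple; parsing it into a Time would only be needed if you wanted to compare it against a local file's mtime.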


Answer 2:

You can try sending an If-Modified-Since header with a correctly formatted HTTP date (for example, Thu, 28 May 2020 18:24:00 GMT).

If the server supports it, it answers either with just a 304 Not Modified status (without any content) if the file is unchanged, or with the full content if the file has been modified.
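
As a rough sketch of what that could look like with Net::HTTP: the helper names conditional_get and classify are invented here, and the URL in the commented usage is a placeholder. Time#httpdate (from the time library) is used to produce the GMT date format shown in the example above.

```ruby
require 'net/http'
require 'time' # provides Time#httpdate

# Hypothetical helper: build a GET request that asks the server to send
# the body only if the resource changed after `since` (a Time).
def conditional_get(uri, since)
  req = Net::HTTP::Get.new(uri)
  req['If-Modified-Since'] = since.httpdate
  req
end

# Hypothetical helper: classify the server's answer to a conditional GET.
def classify(res)
  case res
  when Net::HTTPNotModified then :unchanged # 304, no body
  when Net::HTTPSuccess     then :modified  # 2xx, body holds the content
  else                           :error
  end
end

# Possible usage (placeholder URL):
#
#   uri = URI('http://example.com/levels.xml')
#   req = conditional_get(uri, File.mtime('levels.xml'))
#   res = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
#   File.binwrite('levels.xml', res.body) if classify(res) == :modified
```

A server that ignores If-Modified-Since will just return 200 with the full body, so the :modified branch also covers that case.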



Answer 3:

The official Net::HTTP 2.6.5 docs have a concrete example of If-Modified-Since, which was mentioned in https://stackoverflow.com/a/1509202/895245:

uri = URI('http://example.com/cached_response')
file = File.stat 'cached_response'

req = Net::HTTP::Get.new(uri)
req['If-Modified-Since'] = file.mtime.rfc2822

res = Net::HTTP.start(uri.hostname, uri.port) {|http|
  http.request(req)
}

open 'cached_response', 'w' do |io|
  io.write res.body
end if res.is_a?(Net::HTTPSuccess)

Here is a full script that actually runs:

#!/usr/bin/env ruby

require 'net/http'
require 'time'

uri = URI('https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Illumina_iSeq_100_flow_cell_top.jpg/451px-Illumina_iSeq_100_flow_cell_top.jpg')
file_path = 'cached_response'
req = Net::HTTP::Get.new(uri)
if File.file?(file_path)
  req['If-Modified-Since'] = File.stat(file_path).mtime.rfc2822
end
res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) {|http|
  http.request(req)
}
if res.is_a? Net::HTTPSuccess
  File.open(file_path, 'wb') {|io|
    io.write res.body
  }
end

However, TODO: it updates the file every time, even though Wikimedia seems to honor If-Modified-Since: https://wikitech.wikimedia.org/wiki/MediaWiki_caching
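
One thing that may be worth checking here is the date format. Time#rfc2822 keeps the local clock time and appends a numeric UTC offset, whereas HTTP dates are expressed in GMT, which is what Time#httpdate produces. Whether this explains the behavior seen with Wikimedia is an untested assumption; the sketch below only shows the difference between the two formats, using a made-up helper name and a made-up timestamp.

```ruby
require 'time' # adds Time#rfc2822 and Time#httpdate

# Hypothetical helper: render the same instant in both formats
# so they can be compared side by side.
def header_formats(time)
  {
    rfc2822:  time.rfc2822,  # local clock time plus a numeric offset
    httpdate: time.httpdate  # converted to UTC, ends in "GMT"
  }
end

# A made-up mtime in a zone two hours ahead of UTC.
formats = header_formats(Time.new(2020, 5, 28, 20, 24, 0, '+02:00'))
puts formats[:rfc2822]
puts formats[:httpdate]
```

If the format does turn out to matter, replacing mtime.rfc2822 with mtime.httpdate in the script above would be the corresponding change to try.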