Question:

>>> import urllib
>>> a=urllib.urlopen('http://www.domain.com/bigvideo.avi')
>>> a.getcode()
404
>>> a=urllib.urlopen('http://www.google.com/')
>>> a.getcode()
200

My question: bigvideo.avi is 500 MB. Does my script download the whole file first and then check the status code, or can it check the code immediately without saving the file?

Answer 1:

You want to actually tell the server not to send the full content of the file. HTTP has a mechanism for this called "HEAD" that is an alternative to "GET". It works the same way, but the server only sends you the headers, none of the actual content.

That saves bandwidth, because the body is never sent. Simply not calling read() after a GET only means you don't bother fetching the full file on your end; the server has still been asked for it.

Try this:

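import httplib  # Python 2; this module is named http.client in Python 3
hostname = "www.example.com"  # bare host name, no "http://"
url = "/foo"                  # path segment only
c = httplib.HTTPConnection(hostname)
c.request("HEAD", url)        # ask for the headers only, no body
print c.getresponse().status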

This prints the status code. The URL should be only the path segment, like "/foo", and the hostname should be the bare host, like "www.example.com".
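
For Python 3, where httplib has been renamed http.client, the same HEAD check might look roughly like the sketch below. The function name head_status and its port and timeout parameters are my own illustrative choices, not part of the answer above.

```python
import http.client


def head_status(host, path, port=80, timeout=10):
    """Return the HTTP status code for `path` on `host` via a HEAD request.

    `host` is a bare host name such as "www.example.com" and `path` is a
    path segment such as "/foo". `port` and `timeout` are assumed defaults.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # HEAD asks the server for the headers only, so no body is transferred.
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


# Example call (needs network access):
# print(head_status("www.example.com", "/foo"))
```

For an https:// URL, http.client.HTTPSConnection (default port 443) would be the class to use instead.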

Answer 2:

Yes, it will fetch the file.

I think what you really want to do is send an HTTP HEAD request (which basically asks the server not for the data itself, but for the headers only). You can look here.
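
If you would rather keep passing a full URL, as in the question, Python 3's urllib.request can send a HEAD request too. This is only a sketch under that assumption (Python 3.3 or later); the helper name url_status and its timeout parameter are hypothetical.

```python
import urllib.error
import urllib.request


def url_status(url, timeout=10):
    """Return the HTTP status code for a full URL via a HEAD request."""
    # method="HEAD" overrides the default GET, so only headers come back.
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the status code
        # is still available on the exception.
        return err.code


# Example call (needs network access):
# print(url_status("http://www.domain.com/bigvideo.avi"))
```

Note that urlopen follows redirects, so the value returned is the status of the final response rather than of the first hop.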