python: check if url to jpg exists

Posted 2019-01-21 19:41

In Python, how would I check whether a URL ending in .jpg exists?

Example: http://www.fakedomain.com/fakeImage.jpg

thanks

10 answers
我命由我不由天
Reply #2 · 2019-01-21 20:07

It looks like http://www.fakedomain.com/fakeImage.jpg is automatically redirected to http://www.fakedomain.com/index.html without any error.

Redirects for 301 and 302 responses are followed automatically, so the intermediate response is never handed back to the caller.

Please take a look at HTTPRedirectHandler; you might need to subclass it to handle that.

Here is a sample from Dive Into Python:

http://diveintopython3.ep.io/http-web-services.html#redirects
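
As a rough illustration of that subclassing idea, here is a minimal Python 3 sketch. The names NoRedirectHandler and exists_without_redirect are made up for this example, and it assumes that a redirected image URL should count as "does not exist":

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler that declines to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "do not build a follow-up request";
        # urllib then reports the 301/302 as an HTTPError instead.
        return None


def exists_without_redirect(url, timeout=10):
    """Return True only if a HEAD request to url answers 200 directly."""
    opener = urllib.request.build_opener(NoRedirectHandler)
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener.open(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # HTTPError and URLError are OSError subclasses;
        # ValueError covers strings that are not valid URLs.
        return False
```

With this opener, a server that answers fakeImage.jpg with a 302 to index.html makes the function return False rather than silently reporting success for the page it was redirected to.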

甜甜的少女心
Reply #3 · 2019-01-21 20:08

I don't know why you are doing this, but in any case: it should be noted that just because a request to an "image" succeeds, doesn't mean it is what you think it is (it could redirect to anything, or return any data of any type, and potentially cause problems depending on what you do with the response).

Sorry, I went on a binge reading about online exploits and how to defend against them today :P
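
One partial safeguard against that is to look at the Content-Type header before trusting the response. The sketch below is only an illustration in Python 3: is_image_content_type and url_serves_image are hypothetical names, and the header is whatever the server chooses to report, so it is a hint rather than proof.

```python
import urllib.request


def is_image_content_type(content_type):
    """Return True if a Content-Type header value names an image type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("image/")


def url_serves_image(url, timeout=10):
    """HEAD the URL and check that the reported type is image/*."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_image_content_type(response.headers.get("Content-Type"))
    except (OSError, ValueError):
        # Network errors, HTTP error statuses, or a malformed URL.
        return False
```

A redirect to an HTML page would typically show up here as text/html and be rejected, although a hostile server can still lie about the type.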

再贱就再见
Reply #4 · 2019-01-21 20:08

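In Python 3.6.5:

import http.client

def exists(site, path):
    # site is a bare host name such as "www.fakedomain.com", with no scheme
    connection = http.client.HTTPConnection(site)
    # HEAD asks for the headers only, so the image body is not downloaded
    connection.request('HEAD', path)
    response = connection.getresponse()
    connection.close()
    return response.status == 200

exists("www.fakedomain.com", "/fakeImage.jpg")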

In Python 3, the httplib module has been renamed to http.client.

You also need to remove the http:// or https:// prefix from the host, because http.client treats whatever follows the : as a port number, and the port number must be numeric.
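
If you start from a full URL, one way to avoid stripping the prefix by hand is to split it with urllib.parse first. This is only a sketch: split_url and exists_url are names invented here, and it assumes an absolute http or https URL without embedded credentials.

```python
import http.client
from urllib.parse import urlsplit


def split_url(url):
    """Split an absolute URL into (scheme, host[:port], path[?query])."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.netloc, path


def exists_url(url, timeout=10):
    """HEAD the URL with http.client and report whether it answers 200."""
    scheme, host, path = split_url(url)
    # Pick the connection class that matches the scheme.
    if scheme == "https":
        connection = http.client.HTTPSConnection(host, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(host, timeout=timeout)
    try:
        connection.request("HEAD", path)
        return connection.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        connection.close()
```

Calling exists_url("http://www.fakedomain.com/fakeImage.jpg") then passes "www.fakedomain.com" as the host and "/fakeImage.jpg" as the path, which is the form the function above expects.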

唯我独甜
Reply #5 · 2019-01-21 20:12
>>> import httplib
>>>
>>> def exists(site, path):
...     conn = httplib.HTTPConnection(site)
...     conn.request('HEAD', path)
...     response = conn.getresponse()
...     conn.close()
...     return response.status == 200
...
>>> exists('www.fakedomain.com', '/fakeImage.jpg')
False

If the status is anything other than a 200, the resource doesn't exist at the URL. This doesn't mean that it's gone altogether. If the server returns a 301 or 302, this means that the resource still exists, but at a different URL. To alter the function to handle this case, the status check line just needs to be changed to return response.status in (200, 301, 302).
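
A Python 3 sketch of that variation, using http.client in place of httplib, could also report where the resource moved to. The names interpret and locate are hypothetical, and treating only 301 and 302 as "moved" is the same assumption the paragraph above makes:

```python
import http.client

REDIRECT_CODES = (301, 302)


def interpret(status, location=None):
    """Map a status code to (exists, moved_to)."""
    if status == 200:
        return True, None
    if status in REDIRECT_CODES:
        # The resource still exists, but at the URL in the Location header.
        return True, location
    return False, None


def locate(site, path, timeout=10):
    """HEAD site + path and interpret the status; site is a bare host name."""
    conn = http.client.HTTPConnection(site, timeout=timeout)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return interpret(response.status, response.getheader("Location"))
    finally:
        conn.close()
```

Returning the Location value lets the caller decide whether the new address is still the image it wanted or just a generic landing page.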

可以哭但决不认输i
Reply #6 · 2019-01-21 20:13

The previous answers have problems when the file is on an FTP server (ftp://url.com/file). The following code works whether the file is served over ftp, http or https:

import urllib2

def file_exists(url):
    request = urllib2.Request(url)
    # Override get_method so the request is sent as HEAD instead of GET
    request.get_method = lambda: 'HEAD'
    try:
        urllib2.urlopen(request)
        return True
    except Exception:
        # Any failure to open the URL is treated as "does not exist"
        return False
smile是对你的礼貌
Reply #7 · 2019-01-21 20:16

Thanks for all the responses, everyone. I ended up using the following:

import urllib2

# url is the address to check
try:
    f = urllib2.urlopen(urllib2.Request(url))
    deadLinkFound = False
except Exception:
    deadLinkFound = True