If I open a file using urllib2, like so:
remotefile = urllib2.urlopen('http://example.com/somefile.zip')
Is there an easy way to get the file name other than parsing the original URL?
EDIT: changed openfile to urlopen... not sure how that happened.
EDIT2: I ended up using:
filename = url.split('/')[-1].split('#')[0].split('?')[0]
Unless I'm mistaken, this should strip out all potential queries as well.
Using `urlsplit` is the safest option:

I guess it depends what you mean by parsing. There is no way to get the filename without parsing the URL, i.e. the remote server doesn't give you a filename. However, you don't have to do much yourself; there's the `urlparse` module:

Do you mean `urllib2.urlopen`? There is no function called `openfile` in the `urllib2` module.

Anyway, use the `urllib2.urlparse` functions:

Voila.
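
The `urlsplit` / `urlparse` suggestions above boil down to roughly the following. This is only a minimal sketch, written for Python 3, where `urlsplit` lives in `urllib.parse` (in Python 2 it is in the `urlparse` module). The helper name `filename_from_url` is made up for illustration and is not from any of the answers.

```python
import posixpath
from urllib.parse import urlsplit  # Python 2: from urlparse import urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query and fragment."""
    path = urlsplit(url).path        # the ?query and #fragment are split off
    return posixpath.basename(path)  # everything after the final '/'


print(filename_from_url('http://example.com/somefile.zip'))
print(filename_from_url('http://example.com/dir/somefile.zip?download=1#top'))
```

Both calls print `somefile.zip`. A URL whose path ends in `/` gives an empty string, so the caller probably needs a fallback name for that case.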
Using PurePosixPath, which is not operating-system dependent and handles URLs gracefully, is the Pythonic solution:
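
Something along these lines, as a rough sketch. It assumes Python 3 (`pathlib` and `urllib.parse`); the helper name `url_basename` and the sample URLs are placeholders chosen for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def url_basename(url):
    """Last component of the URL's path, using POSIX path rules."""
    # urlsplit() separates the path from any query string or fragment;
    # PurePosixPath only manipulates the string and never touches a filesystem.
    return PurePosixPath(urlsplit(url).path).name


for url in ('http://example.com/somefile.zip',
            'http://example.com/a/b/somefile.zip?x=1#frag',
            'http://example.com/'):
    print(repr(url_basename(url)))
```

The first two URLs give `'somefile.zip'` and the bare `/` path gives `''`. One difference from a plain `split('/')`: `PurePosixPath` drops a trailing slash, so a path like `/dir/` yields `'dir'` rather than an empty string.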
Notice how there is no network traffic here (i.e. those URLs don't go anywhere); it just uses standard parsing rules.
This is not openfile, but maybe still helps :)
I think that "the file name" isn't a very well-defined concept when it comes to HTTP transfers. The server might (but is not required to) provide one in a "Content-Disposition" header; you can try to get that with `remotefile.headers['Content-Disposition']`. If this fails, you probably have to parse the URI yourself.
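
A sketch of that "try the header, then fall back to the URL" idea. It assumes Python 3, where the `headers` object returned by `urllib.request.urlopen` is an `email.message.Message` subclass and so has `get_filename()`; I am not assuming the same for Python 2's `urllib2`. The function name `pick_filename` is hypothetical, and a hand-built `Message` stands in for a real response so the example runs without network access.

```python
import posixpath
from email.message import Message
from urllib.parse import urlsplit


def pick_filename(headers, url):
    """Prefer the Content-Disposition filename; fall back to the URL path."""
    name = headers.get_filename()        # None when the header is absent
    if name:
        return posixpath.basename(name)  # discard any directory the server sent
    return posixpath.basename(urlsplit(url).path)


# Stand-in for response.headers, built by hand for the demonstration.
headers = Message()
headers['Content-Disposition'] = 'attachment; filename="report.zip"'

print(pick_filename(headers, 'http://example.com/download?id=42'))
print(pick_filename(Message(), 'http://example.com/somefile.zip'))
```

The first call prints `report.zip` (taken from the header), the second prints `somefile.zip` (taken from the URL). Since the header value comes from the server, it is worth treating the name as untrusted input before writing anything to disk.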