What is the best way to open a URL and get up to X bytes?

Posted 2019-07-07 00:48

Question:

I want to have a robot fetch a URL every hour, but if the site's operator is malicious he could have his server send me a 1 GB file. Is there a good way to limit downloading to, say, 100 KB and stop after that limit?

I can imagine writing my own connection handler from scratch, but I'd like to use urllib2 if at all possible, just specifying the limit somehow.
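
Something like this minimal sketch is what I have in mind (the URL is just a placeholder; urllib2 responses are file-like, so read(n) returns at most n bytes):

import urllib2

response = urllib2.urlopen('http://example.com/feed')  # placeholder URL
data = response.read(100 * 1024)  # stop after 100 KB, even if the body is larger
response.close()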

Thanks!

Answer 1:

This is probably what you're looking for:

import urllib

def download(url, max_bytes=1024):
    """Copy up to max_bytes of the resource at the given URL
    to a local file named after the last path segment.
    """
    webFile = urllib.urlopen(url)
    localFile = open(url.split('/')[-1], 'wb')  # binary mode: write the bytes verbatim
    try:
        # read(n) returns at most n bytes, so the download stops at the limit
        localFile.write(webFile.read(max_bytes))
    finally:
        webFile.close()
        localFile.close()
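
For example, to cap the fetch at 100 KB as described in the question (the URL is a placeholder):

download('http://example.com/feed', max_bytes=100 * 1024)

Because read(n) stops after n bytes, the rest of the response body is simply never pulled off the socket, so a malicious 1 GB payload costs you at most the limit you set.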