What does urllib.request.urlopen() do?

Posted 2019-07-29 03:00

Question:

In Python 3, does the urlopen function from the urllib.request module retrieve the target of the URL, or does it just open a connection to the URL as a file handle, or have I completely lost it? I would like to understand how it works.

Basically, I want to find the time taken to download a file from a URL. How do I go about it?

Here is my code:

VERSION 1

import urllib.request
import time

start = time.time()
with urllib.request.urlopen('http://mirror.hactar.bz/lastsync') as f:
    lastsync = f.read()  # Do I need this line if I don't care about the data?
    end = time.time()
duration = end - start

VERSION 2

import urllib.request
import time

with urllib.request.urlopen('http://mirror.hactar.bz/lastsync') as f:
    start = time.time()
    lastsync = f.read()  # Does this line do the actual data retrieval?
    end = time.time()
duration = end - start

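Answer 1:

From the docs (this wording comes from the Python 2 urllib2 documentation, hence the reference to mimetools.Message; in Python 3, info() returns an http.client.HTTPMessage):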

Open the URL url, which can be either a string or a Request object.

...

This function returns a file-like object with three additional methods:

  • geturl() — return the URL of the resource retrieved, commonly used to determine if a redirect was followed
  • info() — return the meta-information of the page, such as headers, in the form of an mimetools.Message instance (see Quick Reference to HTTP Headers)
  • getcode() — return the HTTP status code of the response.
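
As a rough illustration of those three methods, here is a minimal sketch that starts a throwaway HTTP server on localhost so it runs without network access. The HelloHandler class, the OS-assigned port, and the "hello" body are assumptions made up for this example and are not part of the question. On recent Python 3 versions the same information should also be available through the url, status, and headers attributes of the response.

```python
import http.server
import threading
import urllib.request


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Port 0 asks the OS for any free port (an assumption for this demo).
server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

with urllib.request.urlopen(url) as response:
    final_url = response.geturl()                        # URL actually retrieved
    status = response.getcode()                          # HTTP status code
    content_type = response.info().get("Content-Type")   # a response header
    body = response.read()                               # bytes of the body

print(final_url, status, content_type, body)

server.shutdown()
server.server_close()
```

If a redirect had been followed, geturl() would report the final URL rather than the one passed to urlopen().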

Also note that in Python 3, urllib.urlopen() no longer exists; urllib.request.urlopen() corresponds to the old urllib2.urlopen() from Python 2.

EDIT: So, to time it:

# Python 3: urlopen() lives in urllib.request
import urllib.request
import time

start = time.time()

# In Python 2 this call would be urllib2.urlopen()
response = urllib.request.urlopen('http://example.com/')
data = response.read() # a `bytes` object
end = time.time()

duration = end - start
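
To see which call does what, the two phases can be timed separately. For an HTTP URL, urlopen() sends the request and returns once the status line and headers have arrived, while read() pulls the body. The sketch below is only an illustration under stated assumptions: SlowBodyHandler, BODY, BODY_DELAY, and time_download are hypothetical names, and the local server deliberately pauses before sending the body so the difference is visible. It uses time.perf_counter(), which is generally a better fit than time.time() for measuring durations.

```python
import http.server
import threading
import time
import urllib.request

BODY = b"x" * 1024   # hypothetical payload
BODY_DELAY = 0.3     # seconds the fake server waits before sending the body


class SlowBodyHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler: sends headers at once, the body after a pause."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()        # headers are written to the socket here
        time.sleep(BODY_DELAY)    # simulate a slow download
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def time_download(url):
    """Return (seconds in urlopen, seconds in read, bytes received)."""
    t0 = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        t1 = time.perf_counter()   # request sent, headers received
        data = response.read()     # body is transferred here
        t2 = time.perf_counter()
    return t1 - t0, t2 - t1, len(data)


server = http.server.HTTPServer(("127.0.0.1", 0), SlowBodyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

open_seconds, read_seconds, size = time_download(url)
print("urlopen(): %.3fs, read(): %.3fs, %d bytes"
      % (open_seconds, read_seconds, size))

server.shutdown()
server.server_close()
```

The sum of the two numbers is the total download time, which is what VERSION 1 in the question measures. VERSION 2 measures only the body transfer and leaves out connection setup and the wait for headers. Either way, read() is needed if the body is to be downloaded at all.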


Tags: python urllib