In Python, what are the differences between the urllib, urllib2, and requests modules? Why are there three? They seem to do the same thing...
urllib and urllib2 are both Python modules that do URL-request-related work, but they offer different functionality.
1) urllib2 can accept a Request object to set the headers for a URL request; urllib accepts only a URL.
2) urllib provides the urlencode() method, which is used to generate GET query strings; urllib2 has no such function. This is one of the reasons why urllib is often used along with urllib2 (see the sketch below).
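For illustration, a minimal Python 2 sketch of that combination; the URL http://example.com/search and the query parameters are placeholders:

```python
# Python 2 sketch: urllib builds the query string, urllib2 performs the request.
import urllib
import urllib2

params = urllib.urlencode({'q': 'python', 'page': 1})   # e.g. 'q=python&page=1'
response = urllib2.urlopen('http://example.com/search?' + params)
print response.read()
```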
Requests - Requests is a simple, easy-to-use HTTP library written in Python.
1) Requests encodes the parameters automatically, so you just pass them as simple arguments, unlike urllib, where you need to use urllib.urlencode() to encode the parameters before passing them.
2) It automatically decodes the response into Unicode.
3) Requests also has far more convenient error handling. If your authentication fails, urllib2 raises a urllib2.URLError, while Requests returns a normal response object, as expected; all you have to do to see whether the request was successful is check the boolean response.ok (see the sketch below).
For examples, see https://dancallahan.info/journal/python-requests/
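By contrast, a rough Requests equivalent of the same placeholder call, showing the automatic parameter encoding, the Unicode body, and the response.ok check:

```python
# Sketch with Requests: parameters are encoded for you, and failures are
# reported on the response object rather than via an exception.
import requests

response = requests.get('http://example.com/search',
                        params={'q': 'python', 'page': 1})
print(response.ok)     # True if the status code is below 400
print(response.text)   # body decoded to Unicode
```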
To get the content of a URL: it's hard to write code that supports Python 2, Python 3, and the requests dependency at the same time, because the urlopen() functions and the requests.get() function return different types (see the sketch after this list):
- Python 3's urllib.request.urlopen() returns an http.client.HTTPResponse
- Python 2's urllib.urlopen(url) returns an instance (an urllib.addinfourl object)
- requests.get(url) returns a requests.models.Response
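A small sketch of the Python 3 side of that comparison, assuming Requests is installed and using a placeholder URL:

```python
# Python 3: compare the types returned by the standard library and Requests.
import urllib.request
import requests

r1 = urllib.request.urlopen('http://example.com')
r2 = requests.get('http://example.com')

print(type(r1))   # <class 'http.client.HTTPResponse'>
print(type(r2))   # <class 'requests.models.Response'>

# Under Python 2, urllib.urlopen('http://example.com') would return yet another
# type, which is what makes writing portable response-handling code awkward.
```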
urllib2 provides some extra functionality, namely the urlopen() function can allow you to specify headers (normally you'd have had to use httplib in the past, which is far more verbose). More importantly, though, urllib2 provides the Request class, which allows for a more declarative approach to doing a request.
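A minimal Python 2 sketch of that style; the URL, form fields, and User-Agent string are placeholders:

```python
# Python 2 sketch: declare the URL, POST data, and headers up front on a
# Request object, then hand it to urlopen().
import urllib
import urllib2

data = urllib.urlencode({'name': 'somebody', 'lang': 'python'})
req = urllib2.Request(url='http://example.com/post',
                      data=data,
                      headers={'User-Agent': 'my-script/0.1'})
response = urllib2.urlopen(req)
print response.read()
```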
Note that urlencode() is only in urllib, not urllib2. There are also handlers for implementing more advanced URL support in urllib2. The short answer is: unless you're working with legacy code, you probably want to use the URL opener from urllib2, but you still need to import urllib for some of the utility functions.
Bonus answer: With Google App Engine, you can use any of httplib, urllib, or urllib2, but all of them are just wrappers for Google's URL Fetch API. That is, you are still subject to the same limitations, such as ports, protocols, and the length of response allowed. You can use the core of the libraries as you would expect for retrieving HTTP URLs, though.
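If you want to call the URL Fetch service directly rather than through those wrappers, a minimal sketch for the legacy Python 2.7 App Engine runtime might look like this (the URL is a placeholder):

```python
# Sketch: calling App Engine's URL Fetch service directly (Python 2.7 runtime).
from google.appengine.api import urlfetch

result = urlfetch.fetch('http://example.com')
if result.status_code == 200:
    print result.content
```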
One considerable difference concerns porting from Python 2 to Python 3. urllib2 does not exist for Python 3; its methods were ported into urllib (mostly urllib.request and urllib.error). So if you use urllib2 heavily and want to migrate to Python 3 in the future, consider using urllib. However, the 2to3 tool will automatically do most of the work for you.
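For example, the same basic call under both versions (placeholder URL):

```python
# Python 2:
#   import urllib2
#   response = urllib2.urlopen('http://example.com')

# Python 3: the equivalent call lives in urllib.request.
import urllib.request

response = urllib.request.urlopen('http://example.com')
print(response.read())
```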
urllib2.urlopen accepts an instance of the Request class or a url, whereas urllib.urlopen only accepts a url.
A similar discussion took place here: http://www.velocityreviews.com/forums/t326690-urllib-urllib2-what-is-the-difference.html
You should generally use urllib2, since this makes things a bit easier at times by accepting Request objects, and it will also raise a urllib2.URLError on protocol errors. With Google App Engine, though, you can't use either. You have to use the URL Fetch API that Google provides in its sandboxed Python environment.
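A short sketch of that error behaviour with urllib2 (Python 2, placeholder URL):

```python
# Python 2 sketch: urllib2 signals failures with exceptions instead of
# returning a response object you can inspect.
import urllib2

try:
    response = urllib2.urlopen('http://example.com/protected')
except urllib2.HTTPError as e:    # e.g. 401 or 404; a subclass of URLError
    print 'HTTP error:', e.code
except urllib2.URLError as e:     # e.g. DNS failure or refused connection
    print 'failed to reach the server:', e.reason
else:
    print response.read()
```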