Python urllib2 with keep-alive

Published 2019-01-03 13:10

How can I make a "keep alive" HTTP request using Python's urllib2?

7 Answers
成全新的幸福
#2 · 2019-01-03 13:14

Please avoid collective pain and use Requests instead. It will do the right thing by default and use keep-alive if applicable.
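A minimal sketch of what "the right thing by default" means here (assuming a recent Requests version is installed; no request is actually sent):

```python
import requests

# A Session pools connections (via urllib3 under the hood) and
# reuses the underlying socket for repeated requests to the
# same host.
session = requests.Session()

# Requests advertises keep-alive on every request by default.
print(session.headers["Connection"])  # keep-alive

# Repeated calls like the ones below would travel over the same
# TCP connection; "http://example.com/" is just a placeholder.
# r1 = session.get("http://example.com/")
# r2 = session.get("http://example.com/")
```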

孤傲高冷的网名
#3 · 2019-01-03 13:17

Try urllib3 which has the following features:

  • Re-use the same socket connection for multiple requests (HTTPConnectionPool and HTTPSConnectionPool) (with optional client-side certificate verification).
  • File posting (encode_multipart_formdata).
  • Built-in redirection and retries (optional).
  • Supports gzip and deflate decoding.
  • Thread-safe and sanity-safe.
  • Small and easy to understand codebase perfect for extending and building upon. For a more comprehensive solution, have a look at Requests.

or a much more comprehensive solution, Requests, which supports keep-alive from version 0.8.0 (by using urllib3 internally) and has the following features:

  • Extremely simple HEAD, GET, POST, PUT, PATCH, DELETE requests.
  • Gevent support for asynchronous requests.
  • Sessions with cookie persistence.
  • Basic, Digest, and Custom Authentication support.
  • Automatic form-encoding of dictionaries.
  • A simple dictionary interface for request/response cookies.
  • Multipart file uploads.
  • Automatic decoding of Unicode, gzip, and deflate responses.
  • Full support for Unicode URLs and domain names.
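The urllib3 connection pooling described above can be sketched as follows (assuming urllib3 is installed; nothing is sent over the network, and `example.com` is a placeholder host):

```python
import urllib3

# A PoolManager keeps one connection pool per host; each pool
# holds sockets open between requests (keep-alive) and hands
# them back out for subsequent requests to the same host.
http = urllib3.PoolManager(num_pools=10, maxsize=2)

# Pools are created lazily; connection_from_host returns the
# pool that requests to this host would reuse.
pool = http.connection_from_host("example.com", port=80, scheme="http")
print(type(pool).__name__)  # HTTPConnectionPool

# An actual request would go through the pool, e.g.:
# r = http.request("GET", "http://example.com/")
```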
仙女界的扛把子
#4 · 2019-01-03 13:21

Unfortunately keepalive.py was removed from urlgrabber on 25 Sep 2009 by the following change after urlgrabber was changed to depend on pycurl (which supports keep-alive):

http://yum.baseurl.org/gitweb?p=urlgrabber.git;a=commit;h=f964aa8bdc52b29a2c137a917c72eecd4c4dda94

However, you can still get the last revision of keepalive.py here:

http://yum.baseurl.org/gitweb?p=urlgrabber.git;a=blob_plain;f=urlgrabber/keepalive.py;hb=a531cb19eb162ad7e0b62039d19259341f37f3a6

Emotional °昔
#5 · 2019-01-03 13:29

Or check out httplib's HTTPConnection.
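For reference, a self-contained sketch using only the standard library (httplib was renamed http.client in Python 3): a single HTTPConnection keeps its socket open across requests, which this example checks against a throwaway local server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # HTTP/1.1 defaults to keep-alive

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # silence per-request logging
        pass

# Start a local server on an ephemeral port in a background thread.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
first = conn.getresponse()
first.read()                        # drain the body before reusing
sock_id = id(conn.sock)             # remember the underlying socket

conn.request("GET", "/")            # second request, same connection
second = conn.getresponse()
assert second.read() == b"hello"
assert id(conn.sock) == sock_id     # the socket was reused

conn.close()
server.shutdown()
```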

看我几分像从前
#6 · 2019-01-03 13:30

Note that urlgrabber does not entirely work with Python 2.6. I fixed the issues (I think) by making the following modifications in keepalive.py.

In keepalive.HTTPHandler.do_open() remove this

     if r.status == 200 or not HANDLE_ERRORS:
         return r

And insert this

     if r.status == 200 or not HANDLE_ERRORS:
         # [speedplane] Must return an addinfourl object
         resp = urllib2.addinfourl(r, r.msg, req.get_full_url())
         resp.code = r.status
         resp.msg = r.reason
         return resp
疯言疯语
#7 · 2019-01-03 13:34

Use the urlgrabber library. This includes an HTTP handler for urllib2 that supports HTTP 1.1 and keepalive:

>>> import urllib2
>>> from urlgrabber.keepalive import HTTPHandler
>>> keepalive_handler = HTTPHandler()
>>> opener = urllib2.build_opener(keepalive_handler)
>>> urllib2.install_opener(opener)
>>> 
>>> fo = urllib2.urlopen('http://www.python.org')

Note: you should use urlgrabber version 3.9.0 or earlier, as the keepalive module was removed in version 3.9.1.

There is a port of the keepalive module to Python 3.
