How to set TCP_NODELAY flag when loading URL with

Posted 2020-04-02 08:05

Question:

I am using urllib2 to load a web page. My code is:

import urllib2

httpRequest = urllib2.Request("http://www....com")
pageContent = urllib2.urlopen(httpRequest)
pageContent.readline()

How can I get hold of the socket properties to set TCP_NODELAY?

With a plain socket I would call:

socket.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
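
For reference, a minimal sketch of the plain-socket case, with a getsockopt read-back to confirm that the flag took effect. The variable names are illustrative only, and the socket is deliberately left unconnected:

```python
import socket

# Create an ordinary TCP socket; it does not need to be connected yet.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable Nagle's algorithm so small writes are sent immediately.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back; a non-zero value means the flag is set.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0

sock.close()
```

The difficulty with urllib2 is that this socket object is created internally, so there is no obvious place to make the setsockopt call.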

Answer 1:

If you need access to such a low-level property of the underlying socket, you'll have to override a few classes.

First, you'll need to subclass HTTPHandler, which the standard library defines as:

class HTTPHandler(AbstractHTTPHandler):

    def http_open(self, req):
        return self.do_open(httplib.HTTPConnection, req)

    http_request = AbstractHTTPHandler.do_request_

As you can see, it uses an HTTPConnection to open the connection, so you'll have to subclass that too ;) and extend its connect() method.

Something like this should be a good start:

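import httplib
import socket
import urllib2


class LowLevelHTTPConnection(httplib.HTTPConnection):

    def connect(self):
        # Open the connection as usual, then disable Nagle's algorithm on the new socket.
        httplib.HTTPConnection.connect(self)
        self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


class LowLevelHTTPHandler(urllib2.HTTPHandler):

    def http_open(self, req):
        # Same as the stock handler, but with the custom connection class.
        return self.do_open(LowLevelHTTPConnection, req)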

urllib2 lets you subclass a handler and then use it in place of the default one; urllib2.build_opener is made for this:

# Tell urllib2 to use LowLevelHTTPHandler in place of the standard HTTPHandler.
urllib2.install_opener(urllib2.build_opener(LowLevelHTTPHandler))
httpRequest = urllib2.Request("http://www....com")
pageContent = urllib2.urlopen(httpRequest)
pageContent.readline()
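
urllib2 and httplib exist only on Python 2. On Python 3, where they became urllib.request and http.client, the same idea might look like the sketch below. The names NoDelayHTTPConnection, NoDelayHTTPHandler and build_nodelay_opener are made up for illustration. Recent Python 3 versions of http.client appear to set TCP_NODELAY in connect() already, so there the override mostly serves as a template for setting other socket options.

```python
import http.client
import socket
import urllib.request


class NoDelayHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that sets TCP_NODELAY as soon as the socket exists."""

    def connect(self):
        # Let the base class create and connect self.sock first.
        super().connect()
        self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


class NoDelayHTTPHandler(urllib.request.HTTPHandler):
    """Handler that opens http:// URLs with the connection class above."""

    def http_open(self, req):
        return self.do_open(NoDelayHTTPConnection, req)


def build_nodelay_opener():
    """Return an opener that uses NoDelayHTTPHandler instead of the default HTTPHandler."""
    # build_opener skips a default handler when a subclass of it is passed in.
    return urllib.request.build_opener(NoDelayHTTPHandler)
```

Calling build_nodelay_opener().open(url) uses the custom handler for that one opener, while urllib.request.install_opener(build_nodelay_opener()) makes it the default for urllib.request.urlopen. Note that this sketch covers http:// URLs only; https:// would need an analogous HTTPSConnection and HTTPSHandler pair.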


Answer 2:

For requests, the relevant classes seem to live in requests.packages.urllib3. There are two of them, HTTPConnection and HTTPSConnection, and they should be monkeypatchable in place at the module top level:

import socket

from requests.packages.urllib3 import connectionpool

_HTTPConnection = connectionpool.HTTPConnection
_HTTPSConnection = connectionpool.HTTPSConnection

class HTTPConnection(_HTTPConnection):
    def connect(self):
        _HTTPConnection.connect(self)
        self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

class HTTPSConnection(_HTTPSConnection):
    def connect(self):
        _HTTPSConnection.connect(self)
        self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

connectionpool.HTTPConnection = HTTPConnection
connectionpool.HTTPSConnection = HTTPSConnection


Answer 3:

Do you have to use urllib2?

Alternatively, you can use httplib2, which has the TCP_NODELAY option set.

https://code.google.com/p/httplib2/

It adds a dependency to your project, but seems less brittle than monkey patching.