Persistent HTTPS Connections in Python

Posted 2019-09-04 12:08

Question:

I want to make an HTTPS request to a real-time stream and keep the connection open so that I can keep reading content from it and processing it.

I want to write the script in Python, but I am unsure how to keep the connection open in my script. I have tested the endpoint with curl, which keeps the connection open successfully. How do I do the same in Python? Currently, I have the following code:

import httplib

# req is the request object built elsewhere in the script (not shown)
c = httplib.HTTPSConnection('userstream.twitter.com')
c.request("GET", "/2/user.json?" + req.to_postdata())
response = c.getresponse()

Where do I go from here?

Thanks!

Answer 1:

It looks like your real-time stream is delivered as one endless HTTP GET response, yes? If so, you could just use Python's built-in urllib2.urlopen(). It returns a file-like object, from which you can read as much as you want until the server hangs up on you.

import urllib2

f = urllib2.urlopen('https://encrypted.google.com/')
while True:
    data = f.read(100)
    if not data:  # empty string: the server closed the connection
        break
    print(data)

Keep in mind that although urllib2 speaks https, it doesn't validate server certificates, so you might want to try an add-on package like pycurl or urlgrabber for better security. (I'm not sure if urlgrabber supports https.)
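
For readers on Python 3, the same read loop can be sketched with urllib.request (the successor to urllib2) and an explicit SSL context that does check certificates. This is only a rough sketch: the helper names iter_chunks and open_verified and the 90-second timeout are made up for illustration, and the URL is the same placeholder used above.

```python
import io
import ssl
import urllib.request


def iter_chunks(stream, chunk_size=100):
    """Yield chunks from a file-like object until it reports end of stream."""
    while True:
        data = stream.read(chunk_size)
        if not data:  # empty bytes: the server closed the connection
            return
        yield data


def open_verified(url, timeout=90):
    """Open url with certificate and hostname verification enabled.

    The timeout value is an arbitrary choice for this sketch.
    """
    context = ssl.create_default_context()  # verifies against the system CA store
    return urllib.request.urlopen(url, timeout=timeout, context=context)


# Offline demonstration: an in-memory buffer stands in for the network response.
chunks = list(iter_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
```

Against a live endpoint you would loop over iter_chunks(open_verified('https://encrypted.google.com/')) and process each chunk as it arrives.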



Answer 2:

Connection keep-alive features are not available in any of the Python standard libraries for https. The most mature option is probably urllib3.
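
A rough idea of what streaming through urllib3 might look like, assuming a reasonably recent urllib3 is installed. PoolManager, preload_content=False, stream() and release_conn() are urllib3 calls; the function names stream_body and main, the chunk size, and the placeholder URL are assumptions made for this sketch.

```python
def stream_body(pool, url, chunk_size=1024):
    """Yield the body of a GET response piece by piece.

    pool is expected to be a urllib3.PoolManager. Passing
    preload_content=False stops urllib3 from reading the whole
    body up front, so an endless response can be consumed as it arrives.
    """
    response = pool.request("GET", url, preload_content=False)
    try:
        for chunk in response.stream(chunk_size):
            yield chunk
    finally:
        response.release_conn()  # hand the connection back to the pool


def main():
    # Imported here so the helper above can be exercised without urllib3.
    import urllib3  # third-party: pip install urllib3

    pool = urllib3.PoolManager()
    # Placeholder URL; substitute the real stream endpoint.
    for chunk in stream_body(pool, "https://encrypted.google.com/"):
        print(chunk)
```

Calling main() runs it against the placeholder URL. The PoolManager keeps connections alive between requests to the same host, which is the keep-alive behaviour this answer refers to.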



Answer 3:

httplib2 supports this. (I'd have thought this the most mature option; I didn't know about urllib3 yet, so TokenMacGuy may still be right.)

EDIT: while httplib2 does support persistent connections, I don't think you can really consume streams with it (i.e. one long response vs. multiple requests over the same connection), which I now realise you may need.
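
To make the "one long response" case concrete, here is a minimal sketch that picks up where the question's code leaves off, assuming Python 3 (where httplib is http.client) and assuming the stream sends newline-delimited messages with blank lines as keep-alives. Neither the message framing nor the helper names iter_lines and open_stream come from the thread; they are illustrative guesses.

```python
import http.client
import io


def iter_lines(response):
    """Yield each non-empty line of a long-lived response as text."""
    for raw in response:  # file-like objects iterate line by line
        line = raw.strip()
        if line:  # skip blank keep-alive lines
            yield line.decode("utf-8")


def open_stream(host, path, timeout=90):
    """Send a single GET and return (connection, response), body unread.

    The timeout is an arbitrary choice; a stream that stays silent
    for longer than this raises a socket timeout.
    """
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    conn.request("GET", path)
    return conn, conn.getresponse()


# Offline demonstration with an in-memory stand-in for the response.
fake_response = io.BytesIO(b'{"id": 1}\r\n\r\n{"id": 2}\r\n')
messages = list(iter_lines(fake_response))
```

With a real endpoint, the response returned by open_stream() can be passed straight to iter_lines(), and the loop only ends when the server closes the connection. That is a single request with a body that never finishes, as opposed to several requests sharing one connection.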