I'm trying to program a simple web crawler using the Requests module, and I would like to know how to disable its default keep-alive feature.
I tried using:
import requests

s = requests.session()
s.config['keep_alive'] = False
However, I get an error stating that the Session object has no attribute 'config'. I think this was changed in a newer version, but I cannot seem to find how to do it in the official documentation.
The real issue is that when I run the crawler on a particular website, it fetches five pages at most and then keeps looping indefinitely, so I thought it might have something to do with the keep-alive feature.
PS: Is Requests a good module for a web crawler, or is there something better suited?
Thank you!
As @praveen suggested, the expected approach is to send the Connection: close header on an HTTP/1.1 request to notify the server that the connection should be closed after completion of the response, as described in RFC 2616.
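A minimal sketch of that approach (the URL is only a placeholder, not from the original question): setting the header once on a requests.Session applies it to every request the crawler sends.

import requests

s = requests.Session()
# Ask the server to close the connection after each response
# instead of keeping it open for reuse.
s.headers.update({"Connection": "close"})

# Hypothetical page URL for illustration only.
resp = s.get("http://example.com/page")
print(resp.status_code)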
I am not sure, but can you try passing {"Connection": "close"} as an HTTP header when sending the GET request with requests? This should close the connection as soon as the server returns a response.
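For a single request, a hedged sketch along those lines (again with a placeholder URL) would be:

import requests

# Hypothetical URL for illustration only.
resp = requests.get("http://example.com/page",
                    headers={"Connection": "close"})
# The server should close the connection once the response is fully sent.
print(resp.status_code)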
This works.
Answered in the comments of a similar question.