I'm uploading hundreds of millions of items via a REST API from a cloud server on Heroku to a database on AWS EC2. I'm using Python, and I constantly see the following INFO message in the logs:
[requests.packages.urllib3.connectionpool] [INFO] Resetting dropped connection: <hostname>
This "resetting of the dropped connection" seems to take many seconds (sometimes 30+ sec) before my code continues to execute again.
- Firstly what exactly is happening here and why?
- Secondly is there a way to stop the connection from dropping so that I am able to upload data faster?
Thanks for your help.
Andrew.
Requests uses Keep-Alive by default. "Resetting dropped connection", from my understanding, means a connection that should be alive was dropped somehow. Possible reasons are:
- Server doesn't support Keep-Alive.
- There's no data transfer in established connections for a while, so server drops connections.
See https://stackoverflow.com/a/25239947/2142577 for more details.
The problem is really that the server has closed the connection even though the client has requested it be kept alive.
This is not necessarily because the server doesn't support keepalives. The server may instead be configured to allow only a certain number of requests on a connection. This could be done to help spread requests across different servers, but I think the practice is (or was) common mainly as a practical defence against poorly written server-side code (e.g. PHP) that doesn't clean up after itself after serving a request, perhaps because of an error condition.
If you think this is the case for you and you'd like to not see these logs (which are logged at INFO level), then you can add the following to quieten that part of the logging:
import logging
# No need to hear about connections being brought up again after the server has closed them
logging.getLogger("requests.packages.urllib3.connectionpool").setLevel(logging.WARNING)
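One caveat: the logger name above matches the log line in the question, where urllib3 is bundled inside Requests. As far as I know, newer Requests releases depend on urllib3 as a separate package, in which case the logger would be named `urllib3.connectionpool`. A minimal sketch that quietens both names, assuming you aren't sure which applies (the helper name `quiet_connection_logs` is just for illustration):

```python
import logging

# Logger names that may emit "Resetting dropped connection".
# The first is taken from the log line in the question (urllib3 bundled in Requests).
# The second is an assumption for installs where urllib3 is a separate package.
NOISY_LOGGERS = (
    "requests.packages.urllib3.connectionpool",
    "urllib3.connectionpool",
)


def quiet_connection_logs(level=logging.WARNING):
    """Raise the threshold on the connection pool loggers so INFO messages are dropped."""
    for name in NOISY_LOGGERS:
        logging.getLogger(name).setLevel(level)


quiet_connection_logs()
```

Note that this only hides the message. The reconnect, and any delay that comes with it, still happens.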
Dropping connections like this is common practice for services that expose RESTful APIs, as a way to avoid abuse (or DoS).
If you're stressing their API, they'll drop your connection.
Try getting your script to sleep a bit every once in a while to avoid the drop.
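A rough sketch of what "sleep a bit every once in a while" could look like. Here `send` is a hypothetical callable that performs one API request, and the batch size and pause length are placeholder values, not recommendations from the API provider:

```python
import time


def upload_with_pauses(items, send, batch_size=100, pause_seconds=1.0, sleep=time.sleep):
    """Call send(item) for each item, pausing after every batch_size calls.

    send is a hypothetical callable that performs a single upload request.
    sleep is injectable so the pause can be replaced in tests.
    """
    sent = 0
    for item in items:
        send(item)
        sent += 1
        # Back off briefly after each full batch to ease pressure on the server.
        if sent % batch_size == 0:
            sleep(pause_seconds)
    return sent
```

You would need to tune `batch_size` and `pause_seconds` by experiment, or against whatever rate limits the service documents.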