I have some documents which I have to fetch from mongodb and set it to memcache. Here is the code
import memcache
from pymongo import MongoClient
db = mongo_client.job_db.JobParsedData
jobs = db.find().sort("JobId", 1)
def set_to_memcache_raw(jobs):
print("Setting raw message to memcache")
count = 0
for item in jobs:
job_id = item.get('JobId')
job_details = item.get('JobDetails')
if job_id.strip():
count += 1
memcache_obj.set(job_id, job_details, time=72000)
if count % 1000 == 0:
print("Inserted {} keys in memcache".format(count))
if count >= 1000000:
break
But, after some odd number of iterations the code throw this error -
Traceback (most recent call last):
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 450, in receive_message
self.sock, operation, request_id, self.max_message_size)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/network.py", line 137, in receive_message
header = _receive_data_on_socket(sock, 16)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/network.py", line 164, in _receive_data_on_socket
chunk = sock.recv(length)
ConnectionResetError: [Errno 104] Connection reset by peer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "memcache-poc.py", line 56, in <module>
elapsed = time.time() - t0
File "memcache-poc.py", line 52, in main
jobs = db.find(query)
File "memcache-poc.py", line 17, in set_to_memcache_raw
print("Setting raw message to memcache")
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/cursor.py", line 1114, in next
if len(self.__data) or self._refresh():
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/cursor.py", line 1056, in _refresh
self.__max_await_time_ms))
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/cursor.py", line 873, in __send_message
**kwargs)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/mongo_client.py", line 905, in _send_message_with_response
exhaust)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/mongo_client.py", line 916, in _reset_on_error
return func(*args, **kwargs)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/server.py", line 136, in send_message_with_response
response_data = sock_info.receive_message(1, request_id)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 452, in receive_message
self._raise_connection_failure(error)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 550, in _raise_connection_failure
_raise_connection_failure(self.address, error)
File "/home/dimension/.virtualenvs/docparser/lib/python3.5/site-packages/pymongo/pool.py", line 211, in _raise_connection_failure
raise AutoReconnect(msg)
pymongo.errors.AutoReconnect: xxx.xxx.xxx.xxx:27017: [Errno 104] Connection reset by peer
I have gone through links such as
pymongo-errors
mongodb-TCP keep-alive
why-does-pymongo-throw-autoreconnect
Their is no question of socket inactivity in the above code as my jobs object is an iterator and every time next() is called on this object it will fetch the next document (from mongo itself)
I have mongodb installed on Azure cloud and my TCP keep alive for the instance is 7200 seconds. I get this figure by firing this command
sysctl net.ipv4.tcp_keepalive_time
7200
Do having a try cacth block over the for loop helps in this case