Updated for clarity: I need advice on performance when inserting/appending to a capped collection. I have two Python scripts running:
(1) Tailing the cursor.
    while WSHandler.cursor.alive:
        try:
            doc = WSHandler.cursor.next()
            self.render(doc)
        except StopIteration:
            # no new document yet; keep polling
            pass
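For reference, here is a minimal sketch of the same tailing pattern with a short pause whenever no document is available, so the loop does not spin while the collection is idle. It assumes the cursor behaves like a pymongo tailable cursor (an `alive` attribute, and `next()` raising `StopIteration` when nothing new has arrived). The names `tail`, `handle`, `idle_sleep`, and `max_idle_polls` are hypothetical and not part of my scripts.

```python
import time


def tail(cursor, handle, idle_sleep=0.1, max_idle_polls=None):
    """Pass each document from a tailable-style cursor to handle().

    Assumes `cursor` has an `alive` attribute and a `next()` method that
    raises StopIteration when no document is currently available.
    Returns the number of documents handled.
    """
    idle_polls = 0
    handled = 0
    while cursor.alive:
        try:
            doc = cursor.next()
        except StopIteration:
            # Nothing new yet: optionally give up, otherwise back off briefly.
            idle_polls += 1
            if max_idle_polls is not None and idle_polls >= max_idle_polls:
                break
            time.sleep(idle_sleep)
            continue
        idle_polls = 0
        handle(doc)
        handled += 1
    return handled
```

In script (1) this would be called as something like `tail(WSHandler.cursor, self.render)`; whether the pause has any effect on the CPU problem described below is untested.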
(2) Inserting like so:
    def on_data(self, data):  # Tweepy stream callback
        if len(data) > 5:
            data = json.loads(data)
            coll.insert(data)  # insert into MongoDB
            # print(coll.count())
            # print(data)
It runs fine for a while (at 50 inserts/second). Then, after 20-60 seconds, it stumbles, hits the CPU ceiling (though it was running at 20% before), and never recovers. My mongostat numbers take a dive (the dive is shown below).
Mongostat output:
The CPU is now choked by the process doing the insertion (at least according to htop).
When I run the Tweepy lines above with print(data) instead of adding it to the db (coll.insert(data)), everything runs along fine at 15% CPU use.
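To make that comparison more precise, a small timing wrapper could record how long each call takes, which would show whether individual inserts slow down before the stall. This is only a sketch; `CallTimer` and its `slowest` method are hypothetical helpers, not part of pymongo or Tweepy.

```python
import time


class CallTimer:
    """Wrap a callable and record how long each call takes, in seconds."""

    def __init__(self, fn, clock=time.perf_counter):
        self.fn = fn
        self.clock = clock
        self.durations = []  # one entry per call, in call order

    def __call__(self, *args, **kwargs):
        start = self.clock()
        try:
            return self.fn(*args, **kwargs)
        finally:
            # Record the duration even if the wrapped call raises.
            self.durations.append(self.clock() - start)

    def slowest(self):
        """Return the longest recorded duration, or 0.0 if nothing was called."""
        return max(self.durations) if self.durations else 0.0
```

Assuming `coll` is the collection from script (2), it could be used as `timed_insert = CallTimer(coll.insert)`, calling `timed_insert(data)` inside `on_data` and inspecting `timed_insert.durations` afterwards.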
What I see in mongostat:
- res keeps climbing. (Though clogs may happen at 40m, and it may also run fine at 100m.)
- flushes do not seem to interfere.
- locked % is stable at 0.1%. Would this lead to clogging eventually?
(I'm running on an AWS micro instance, using pymongo.)