I have several script running on a server which pickle and unpickle various dictionaries. They all use the same basic code for pickling as shown below:
SellerDict=open('/home/hostadl/SellerDictkm','rb')
SellerDictionarykm=pickle.load(SellerDict)
SellerDict.close()
SellerDict=open('/home/hostadl/SellerDictkm','wb')
pickle.dump(SellerDictionarykm,SellerDict)
SellerDict.close()
All the scripts run fine except for one of them. The one that has issues goes to various websites and scrapes data and stores it in dictionary. This code runs all day long pickling and unpickling dictionaries and stops at midnight. A cronjob then starts it again the next morning. This script can run for weeks without having a problem but about once a month the script dies due to a EOFError when it tries to open a dictionary. The size of the dictionaries are usually about 80 MB. I even tried adding SellerDict.flush() before SellerDict.close() when pickling the data to make sure evening was being flushed.
Any idea's what could be causing this? Python is pretty solid so I don't think it is due to the size of the file. Where the code runs fine for a long time before dying it leads me to believe that maybe something is being saved in the dictionary that is causing this issue but I have no idea.
Also, if you know of a better way to be saving dictionaries other than pickle I am open to options. Like I said earlier, the dictionaries are constantly being opened and closed. Just for clarification, only one program will use the same dictionary so the issue is not being caused by several programs trying to access the same dictionary.
UPDATE:
Here is the traceback that I have from a log file.
Traceback (most recent call last):
File "/home/hostadl/CompileRecentPosts.py", line 782, in <module>
main()
File "/home/hostadl/CompileRecentPosts.py", line 585, in main
SellerDictionarykm=pickle.load(SellerDict)
EOFError
So this actually turned out to be a memory issue. When the computer would run out of RAM and try to unpickle or load the data, the process would fail claiming this EOFError. I increase the RAM on the computer and this never was an issue again.
Thanks for all the comments and help.
Here's what happens when you don't use locking:
Here's the output:
Eventually, some read process is going to try to read the pickled file while a write process already has it open to write. You need to make sure that you have some way to tell whether another process already has a file open for writing before you try to read from it.
For a very simple solution, have a look at this thread that discusses using Filelock.