I wrote a Python script that processes a large number of big text files and may run for a long time. Sometimes I need to stop the running script and resume it later. Possible reasons for stopping include a program crash, the disk running out of space, or any other situation that forces me to interrupt it. I want to implement a kind of "stop/resume" mechanism for the script.
- On stop: the script saves its current state and quits.
- On resume: the script starts, but continues from the latest saved state.
I'm going to implement it using the pickle and signal modules.
I'd be glad to hear how to do it in a Pythonic way.
Thank you!
The execution could sleep its life away, or (security exceptions aside) the state of the script can be pickled, zipped, and stored.
http://docs.python.org/library/pickle.html
http://docs.python.org/library/marshal.html
http://docs.python.org/library/stdtypes.html (5.9)
http://docs.python.org/library/archiving.html
http://www.henrysmac.org/?p=531
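As a rough illustration of the pickle-plus-signal idea from the question, here is a minimal sketch. It assumes the state that matters is just "which file, and how far into it"; the names `STATE_FILE`, `process_files`, `handle_line`, and `should_stop` are hypothetical and not taken from the original script. Files are opened in binary mode so that `tell()` and `seek()` give plain byte offsets.

```python
import os
import pickle
import signal

STATE_FILE = "progress.pickle"  # hypothetical checkpoint file name

stop_requested = False


def request_stop(signum, frame):
    """Signal handler: only set a flag; the main loop does the saving."""
    global stop_requested
    stop_requested = True


def install_signal_handlers():
    """Call once from the main thread before processing starts."""
    signal.signal(signal.SIGINT, request_stop)   # Ctrl+C
    signal.signal(signal.SIGTERM, request_stop)  # e.g. plain `kill`


def load_state(path=STATE_FILE):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    try:
        with open(path, "rb") as fh:
            return pickle.load(fh)
    except FileNotFoundError:
        return {"file_index": 0, "offset": 0}


def save_state(state, path=STATE_FILE):
    """Write to a temp file, then rename, so a crash never leaves half a checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as fh:
        pickle.dump(state, fh)
    os.replace(tmp_path, path)


def process_files(paths, handle_line, state_path=STATE_FILE,
                  should_stop=lambda: stop_requested):
    """Process every line of every file, resuming from the saved position.

    Returns True when all files are finished, False when stopped early.
    """
    state = load_state(state_path)
    while state["file_index"] < len(paths):
        with open(paths[state["file_index"]], "rb") as fh:
            fh.seek(state["offset"])
            while True:
                line = fh.readline()
                if not line:
                    break
                handle_line(line)
                state["offset"] = fh.tell()
                if should_stop():
                    save_state(state, state_path)
                    return False
        # Finished this file: checkpoint at the file boundary.
        state["file_index"] += 1
        state["offset"] = 0
        save_state(state, state_path)
    return True
```

To use it, call `install_signal_handlers()` and then `process_files(list_of_paths, your_line_function)`; running the same command again picks up where the last run stopped. In this sketch the state is written only on a requested stop and at file boundaries, so after a hard crash the run resumes from the start of the file that was in progress. Delete the checkpoint file to start from scratch.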
Here is something simple that hopefully can help you:
Example of running:
If you are looking to read big files, just use a file handle and read the lines one at a time, processing each line as you need to. If you'd like to save the Python session, just use `dill.dump_session`, and it will save all existing objects. Other answers will fail, as `pickle` cannot pickle a file handle. `dill`, however, can serialize almost every Python object, including a file handle. Then quit the Python session and restart. When you call `load_session`, you load all the objects that existed at the time of the `dump_session` call. Simple as that.
Get `dill` here: https://github.com/uqfoundation