I have quite a bit of data that I will be uploading into Google App Engine, and I want to use the bulkloader to get it in there. However, there is so much data that I generally use up my CPU quota before the upload finishes. Any other problem, such as a bad internet connection or a random computer issue, can also stop the process.
Is there any way to continue a bulkload from where you left off? Or to only bulkload data that has not been written to the datastore?
I couldn't find anything in the docs, so I assume any answer will include digging into the code.
Well, it is in the docs:
If the transfer is interrupted, you can resume the transfer from where it left off using the --db_filename=... argument. The value is the name of the progress file created by the tool, which is either a name you provided with the --db_filename argument when you started the transfer, or a default name that includes a timestamp. This assumes you have sqlite3 installed, and did not disable the progress file with --db_filename=skip.
http://code.google.com/appengine/docs/python/tools/uploadingdata.html
(I used it some time ago, so I had a feeling it would be there.)
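In practice it looks something like the sketch below, assuming an appcfg.py upload_data invocation. The --config_file, --filename, --kind, --url values and the trailing app directory are placeholders for whatever your upload already uses (exact options vary with your loader setup and SDK version); the part that matters for resuming is passing the same --db_filename on the re-run.

    # Start the upload, naming the progress file explicitly so it is easy to find later.
    # (Config file, CSV, kind, URL and app directory below are placeholders -- use your own.)
    appcfg.py upload_data \
        --config_file=bulkloader.yaml \
        --filename=data.csv \
        --kind=MyKind \
        --url=http://myapp.appspot.com/_ah/remote_api \
        --db_filename=bulkloader-progress.sql3 \
        myapp/

    # If the transfer is interrupted, re-run the same command with the same --db_filename;
    # the tool reads the progress file and only uploads entities it has not sent yet.
    appcfg.py upload_data \
        --config_file=bulkloader.yaml \
        --filename=data.csv \
        --kind=MyKind \
        --url=http://myapp.appspot.com/_ah/remote_api \
        --db_filename=bulkloader-progress.sql3 \
        myapp/

If you didn't pass --db_filename the first time, look for the timestamped progress file the tool wrote in the directory you ran it from and pass that file name instead.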