I'm writing a script to do a copy of some data between two machines on the same network using psycopg2. I'm replacing some old, ugly bash that does the copy with
psql -c -h remote.host "COPY table TO STDOUT" | psql -c "COPY table FROM STDIN"
This seems like both the simplest and most efficient way to do the copy. It's easy to replicate in python with a stringIO or a temp-file, like so:
buf = StringIO()
from_curs = from_conn.cursor()
to_curs = to_conn.cursor()
from_curs.copy_expert("COPY table TO STDOUT", buf)
buf.seek(0, os.SEEK_SET)
to_curs.copy_expert("COPY table FROM STDIN", buf)
...but that involves saving all the data to disk/in memory.
Has anyone figured out a way to mimic the behavior of a Unix pipe in a copy like this? I can't seem to find a unix-pipe object that doesn't involve POpen - Maybe the best solution is to just use POpen and subprocess, after all.
You could use a deque that you've subclassed to support reading and writing:
If the reader is much faster than the writer, and the table is large, the
deque
will still get big, but it will be smaller than storing the whole thing.Also, I don't know for sure
return ''
when thedeque
is empty is safe, rather than retrying until it's not empty, but I'd guess it is. Let me know if it works.Remember to
del buf
when you're sure the copy is done, especially if the script isn't just exiting at that point.You will have to put one of your calls in a separate thread. I just realized you can use os.pipe(), which makes the rest quite straightforward: