Importing zipped CSV file into PostgreSQL

Posted 2020-02-05 05:29

Question:

I have a large compressed CSV file (25 GB) that I want to import into PostgreSQL 9.5. Is there a fast way to import a zip or gzip file into Postgres without extracting it first?

Answer 1:

There is an old trick that uses a named pipe (it works on Unix; I don't know about Windows):

  • create a named pipe: mkfifo /tmp/omyfifo
  • write the file contents to it: zcat mycsv.csv.z > /tmp/omyfifo &
  • [from psql] copy mytable(col1,...) from '/tmp/omyfifo'
  • [when finished] : rm /tmp/omyfifo

The zcat in the background blocks until a reader (here, the COPY command) starts reading, and it finishes at EOF (or when the reader closes the pipe).
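
The four steps above can be wrapped into a single shell function. This is only a sketch under some assumptions: the function name copy_gz_via_fifo and the PSQL and FIFO override variables are made up for illustration, the input is gzip-compressed CSV (gzip -dc is used as a portable spelling of zcat), and the script runs on the database server host so that the server process can open the pipe.

```shell
#!/bin/sh
# Sketch: load a gzip-compressed CSV through a named pipe (FIFO).
# Usage: copy_gz_via_fifo ARCHIVE TABLE [psql options...]
# PSQL and FIFO are illustrative overrides, not PostgreSQL settings.
copy_gz_via_fifo() {
    archive=$1
    table=$2
    shift 2
    fifo=${FIFO:-/tmp/omyfifo}

    # Step 1: create the named pipe.
    mkfifo "$fifo" || return 1

    # Step 2: start the writer; it blocks until a reader opens the pipe.
    gzip -dc "$archive" > "$fifo" &
    writer=$!

    # Step 3: the server reads the pipe as if it were a plain file.
    "${PSQL:-psql}" "$@" -c "COPY $table FROM '$fifo' WITH (FORMAT csv)"
    rc=$?

    # If COPY failed before opening the pipe, the writer is still blocked.
    [ "$rc" -eq 0 ] || kill "$writer" 2>/dev/null
    wait "$writer" 2>/dev/null

    # Step 4: remove the pipe when finished.
    rm -f "$fifo"
    return "$rc"
}
```

For example, copy_gz_via_fifo mycsv.csv.gz mytable -d mydb would load into a hypothetical database mydb. The table name is pasted into the SQL unquoted, so pass only trusted values.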

You could even start multiple pipe + zcat pairs, which will be picked up by multiple COPY statements in your SQL script.


This will work from pgAdmin, but the FIFO (and the zcat process) must be present on the machine where the DBMS server runs.


BTW: a similar trick using netcat can be used to read a file from a remote machine (which, of course, must write the file to the network socket).



Answer 2:

An example of how to do it with zcat and a pipe:

-bash-4.2$ psql -p 5555 t -c "copy tp to '/tmp/tp.csv';"
COPY 1
-bash-4.2$ gzip /tmp/tp.csv
-bash-4.2$ zcat /tmp/tp.csv.gz | psql -p 5555 t -c "copy tp from stdin;"
COPY 1
-bash-4.2$ psql -p 5555 t -c "select count(*) from tp"
 count
-------
     2
(1 row)
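
Since the question mentions zip as well as gzip, the same stdin approach can be sketched with a small helper that picks the decompressor from the file extension. The function name stream_csv and the example file name are hypothetical, and the zip branch assumes the archive holds a single CSV file (unzip -p writes every member to stdout, one after another). Note that the transcript above uses COPY's default text format; for a real CSV file, add with (format csv).

```shell
#!/bin/sh
# Sketch: write the decompressed contents of ARCHIVE to stdout.
# Usage: stream_csv ARCHIVE
stream_csv() {
    case $1 in
        *.gz)  gzip -dc "$1" ;;   # same effect as zcat
        *.zip) unzip -p "$1" ;;   # -p extracts to stdout; assumes one CSV inside
        *)     cat "$1" ;;        # already uncompressed
    esac
}

# Example use (port, database and table follow the transcript above;
# /tmp/big.csv.zip is a made-up path):
# stream_csv /tmp/big.csv.zip | psql -p 5555 t -c "copy tp from stdin with (format csv)"
```

Because psql reads the data on the client and sends it over the connection, this variant does not need the file to be on the database server.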

Also, from release 9.3 onward you can do:

psql -p 5555 t -c "copy tp from program 'zcat /tmp/tp.csv.gz';"

This needs no pipe at all.
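
One caveat: COPY ... FROM PROGRAM runs the command on the database server as the server's OS user, and in 9.x it is restricted to superusers. If the file lives on the client, psql's \copy meta-command accepts the same program clause (also since 9.3) and runs the command locally. A minimal sketch follows; the function name client_copy_gz and the PSQL override variable are made up for illustration, and the archive path is assumed to contain no spaces or quotes.

```shell
#!/bin/sh
# Sketch: client-side load using psql's \copy ... from program.
# Usage: client_copy_gz ARCHIVE TABLE [psql options...]
client_copy_gz() {
    archive=$1
    table=$2
    shift 2
    # psql runs 'gzip -dc ARCHIVE' on the client and streams its output
    # to the server; -c accepts a single backslash command.
    "${PSQL:-psql}" "$@" -c "\\copy $table from program 'gzip -dc $archive' with (format csv)"
}
```

With the names from the transcript, the call would be client_copy_gz /tmp/tp.csv.gz tp -p 5555 t.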