- I have a .csv table (t1) with columns c1, c2, c3 in Amazon S3 storage
- I want to copy it into Amazon Redshift
- I create the target table (t1a) with columns c1, c2, c3, where all columns are nullable
I copy with the command:
copy t1a (c1,c3)
from t1
I expected it to copy c1 and c3 over from t1 and place the default null value in c2, so a row in t1a might look like (c1_rowX, null, c3_rowX).
Instead I get a type error because it's copying c2 (string type) data from t1 into c3 (int type) of t1a.
The copy command works fine when I don't specify the columns:
copy t1a
from t1
I've included a link to the Redshift COPY command documentation:
http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
- Main question: is there an issue with how I'm specifying the columns? Thanks.
If you want to skip the preprocessing part, you can define the column to be skipped as CHAR(1) and then use the TRUNCATECOLUMNS parameter on the COPY command:
CREATE TABLE t1a (
  c1 VARCHAR(256),  -- placeholder type; use whatever type c1 actually needs
  c2 CHAR(1),       -- the column to skip: anything loaded here is cut to one character
  c3 INT            -- int, per the question
);
COPY t1a FROM t1 TRUNCATECOLUMNS;
The TRUNCATECOLUMNS option ignores any data longer than the column width defined in the table schema during the import, so everything loaded into that column will be truncated to a single character.
That's just a hack; preprocessing the input file is recommended, but sometimes a hack is all that's needed.
Detailed column mappings are not supported in COPY directly, unless Amazon/ParAccel have added something extra to their PostgreSQL fork that isn't in mainline.
What you typically do in this case is have a script massage/pad the data: read the CSV, transform it as desired, then send the lines through the PostgreSQL connection to a COPY ... FROM STDIN command.
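Here is a minimal sketch of that massage-and-stream approach, assuming Python with psycopg2 and a PostgreSQL-compatible endpoint that accepts COPY ... FROM STDIN; the file name and connection string below are placeholders, not values from the question. (With Redshift itself, the usual variant is to write the massaged CSV back to S3 and COPY from there, since Redshift's COPY loads from S3 rather than from a client stream.)

import csv
import io

import psycopg2  # assumed driver; the DSN and file name below are hypothetical

# Hypothetical connection parameters -- replace with your own.
conn = psycopg2.connect("dbname=mydb user=myuser host=myhost")

# Read the original three-column CSV and keep only the c1 and c3 fields,
# so the skipped c2 column falls back to NULL in the target table.
buf = io.StringIO()
writer = csv.writer(buf)
with open("t1.csv", newline="") as src:  # "t1.csv" is a placeholder file name
    for c1, _c2, c3 in csv.reader(src):
        writer.writerow([c1, c3])
buf.seek(0)

# Stream the massaged rows through the connection with COPY ... FROM STDIN,
# listing only the columns we actually provide.
with conn, conn.cursor() as cur:
    cur.copy_expert("COPY t1a (c1, c3) FROM STDIN WITH (FORMAT csv)", buf)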
Alternately, you might COPY the whole original CSV into a staging table, then transform that data with an INSERT INTO ... SELECT into the real target table.
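A hedged sketch of that staging-table route, again via psycopg2; the staging table name, bucket, IAM role, and VARCHAR widths are placeholders, and the c3::INT cast follows the question's statement that c3 is an int in t1a.

import psycopg2  # assumed driver; every name below (bucket, role, staging table) is a placeholder

conn = psycopg2.connect("dbname=mydb user=myuser host=myhost port=5439")

with conn, conn.cursor() as cur:
    # 1. Stage the raw CSV with every field as text so nothing fails on load.
    cur.execute("""
        CREATE TEMP TABLE t1_staging (
            c1 VARCHAR(256),
            c2 VARCHAR(256),
            c3 VARCHAR(256)
        )
    """)

    # 2. Bulk-load the whole file into the staging table (Redshift COPY from S3).
    cur.execute("""
        COPY t1_staging
        FROM 's3://<bucket>/t1.csv'
        IAM_ROLE 'arn:aws:iam::<account>:role/<role>'
        FORMAT AS CSV
    """)

    # 3. Transform into the real target: take only c1 and c3, cast as needed,
    #    and let c2 default to NULL.
    cur.execute("""
        INSERT INTO t1a (c1, c3)
        SELECT c1, c3::INT
        FROM t1_staging
    """)

The staging-table detour costs an extra write, but it keeps the whole load inside the database and avoids re-uploading a transformed file.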