I'm pulling data from Amazon S3 into a table in Amazon Redshift. The table contains various columns, where some column data might contain special characters.
The copy command has an option called Delimiter
where we can specify the delimiter while pulling the data into the table.
The issue is 2 fold -
When I export (unload command)
to S3 using a delimiter - say ,
- it works fine, but when I try to import into Redshift from S3, the issue creeps in because certain columns contain the ',' operator which the copy command misinterprets as delimiter and throws error.
I tried various delimiters, but the data in my table seems to contain some or other kind of special character which causes the above issue.
I even tried unloading using multiple delimiter - like #%
or ~,
but when loading from s3 using copy command - the dual delimiter is not supported.
Any solutions?
I think the delimiter can be escaped using \
but for some reason that isn't working either, or maybe I'm not using the right syntax for escaping in copy command.
If the delimiter is appearing within fields, then use the
ADDQUOTES
parameter with theUNLOAD
command:Then:
A popular delimiter is the pipe character (
|
) that is rare in text files.The following example shows the contents of a text file with the field values separated by commas.
If you load the file using the DELIMITER parameter to specify comma-delimited input, the COPY command will fail because some input fields contain commas. You can avoid that problem by using the CSV parameter and enclosing the fields that contain commas in quote characters. If the quote character appears within a quoted string, you need to escape it by doubling the quote character. The default quote character is a double quotation mark, so you will need to escape each double quotation mark with an additional double quotation mark. Your new input file will look something like this.
Source :- Load Quote from a CSV File
What I use -
If I’ve made a bad assumption please comment and I’ll refocus my answer.