Redshift COPY command delimiter not found

Posted 2019-03-17 09:37

Question:

I'm trying to load some text files into Redshift. They are tab-delimited, except that there is no tab after the final value in each row. That's causing a "delimiter not found" error. I only see a way to set the field delimiter in the COPY statement, not a way to set a row delimiter. Any ideas that don't involve processing all my files to add a tab to the end of each row?

Thanks

Answer 1:

I don't think the problem is the missing <tab> at the end of each line. Are you sure that ALL lines have the correct number of fields?

Run the query:

select le.starttime, le.filename, d.query, d.line_number, d.colname, d.value,
le.raw_line, le.err_reason
from stl_loaderror_detail d, stl_load_errors le
where d.query = le.query
order by le.starttime desc
limit 100;

to get the full error report. It will show the name of the file with errors, the number of the offending line, and the error details.

This will help to find where the problem lies.
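If the joined report is too noisy, a narrower query against stl_load_errors alone may be easier to read. The following is only a sketch: it assumes the error text contains the phrase "delimiter not found", and the limit of 20 rows is arbitrary.

```sql
-- Most recent load errors whose reason mentions a missing delimiter.
-- filename and raw_line are fixed-width columns, so trim the padding.
select starttime,
       trim(filename)   as filename,
       line_number,
       trim(colname)    as colname,
       err_code,
       trim(err_reason) as err_reason,
       trim(raw_line)   as raw_line
from stl_load_errors
where err_reason ilike '%delimiter not found%'
order by starttime desc
limit 20;
```

The raw_line column shows the row as COPY read it, which makes it easy to count the fields by eye.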



Answer 2:

You can get the "delimiter not found" error if a row has fewer columns than expected. Some CSV generators may just output a single quote at the end if the last columns are null.

To solve this, you can add the FILLRECORD option to the Redshift COPY command.
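As a rough sketch, a COPY statement with FILLRECORD for tab-delimited files might look like the following. The table name, bucket path, and IAM role ARN are hypothetical placeholders, not values from the question.

```sql
-- Hypothetical table, bucket, and role; replace with your own.
copy my_table
from 's3://my-bucket/path/to/files/'
iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole'
delimiter '\t'
fillrecord;  -- missing trailing columns are loaded as NULL
```

FILLRECORD only covers contiguous columns missing at the end of a record; a row that is short in the middle will still fail.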



Answer 3:

I know this was answered, but I just dealt with the same error and found a simple solution, so I'll share it.

This error can also be solved by listing the specific table columns that are copied from the S3 files (if you know which columns are in the data on S3). In my case, the data had fewer columns than the table. Madahava's answer with the 'FILLRECORD' option DID solve the issue for me, but then I noticed that a column that was supposed to be filled with default values remained null.

COPY <table> (col1, col2, col3) from 's3://somebucket/file' ...
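To illustrate the difference, here is a minimal sketch with a hypothetical table in which one column has a default. The table, column names, bucket, and role ARN are all assumptions made for the example. When a column is left out of the COPY column list, Redshift loads its DEFAULT expression instead of NULL.

```sql
-- Hypothetical table: the files on S3 contain only event_id and event_name.
create table events (
    event_id   integer,
    event_name varchar(100),
    source     varchar(20) default 'unknown'
);

-- source is omitted from the column list, so it receives 'unknown'
-- rather than the NULL that FILLRECORD would load.
copy events (event_id, event_name)
from 's3://my-bucket/events/'
iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole'
delimiter '\t';
```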


Answer 4:

This may not be directly related to the OP's question, but I received the same "Delimiter not found" error, which was caused by newline characters within one of the fields.

For any field that you think may have newline characters you can remove them with:

replace(my_field, chr(10), '')
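If the files come from Windows systems, the field may also contain carriage returns (chr(13)). A small sketch that strips both is shown below; it assumes the cleanup happens in the SQL query that produces the export, and the inline sample row merely stands in for a real source table.

```sql
-- Hypothetical sample row containing an embedded CR/LF pair.
with source_rows as (
    select 1 as id,
           'line one' || chr(13) || chr(10) || 'line two' as my_field
)
select id,
       -- remove carriage returns first, then line feeds
       replace(replace(my_field, chr(13), ''), chr(10), '') as my_field_clean
from source_rows;
```

Replacing with a space instead of an empty string keeps adjacent words from running together, if that matters for the data.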