I'm working with raw data that uses a comma as the decimal separator rather than a full stop (3,99 instead of 3.99). Is there a way to convert this directly in the Redshift COPY command, rather than having to load the data and then change it afterwards?
Answer 1:
There are two issues to consider:
- Field delimiters
- Replacing Characters
The default delimiter in the Amazon Redshift COPY command is a pipe character ( | ), unless the CSV option is used, in which case the default delimiter is a comma ( , ).
Thus, if your file is delimited by a character other than a comma (e.g. a pipe "|" symbol), the comma inside a number will not be treated as a field separator.
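To illustrate the delimiter point outside Redshift, here is a minimal Python sketch. The sample line and variable names are made up for illustration, and Python's csv module only stands in for COPY's field splitting; it is not how COPY itself is implemented.

```python
import csv
import io

# Hypothetical sample row: an item name and a price with a decimal comma.
line = "apple|3,99\n"

# Split on the pipe: the decimal comma stays inside the second field.
pipe_fields = next(csv.reader(io.StringIO(line), delimiter="|"))

# Split on the comma: the price is wrongly cut into two fields.
comma_fields = next(csv.reader(io.StringIO(line), delimiter=","))

print(pipe_fields)
print(comma_fields)
```

With a pipe delimiter the value 3,99 arrives as one field, which is what lets it be loaded as a string in the first place.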
As for converting the decimal comma into a decimal point, this is not possible within COPY itself. You will need to load the field as a string, and then run an UPDATE command to copy the string into a numeric field (replacing the comma with a full stop along the way).
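The load-as-string-then-UPDATE pattern can be sketched as follows. To keep the example runnable without a cluster, it uses Python's built-in sqlite3 module as a stand-in for Redshift, and the table and column names (sales, price_raw, price) are hypothetical. Redshift also provides REPLACE and CAST, so the UPDATE there should look similar, but the exact types (e.g. DECIMAL precision) are something you would need to choose for your own data.

```python
import sqlite3

# In-memory SQLite database standing in for Redshift (assumption for the demo).
conn = sqlite3.connect(":memory:")

# price_raw holds the value exactly as loaded; price is the numeric target.
conn.execute("CREATE TABLE sales (item TEXT, price_raw TEXT, price NUMERIC)")

# Stand-in for the COPY step: the decimal-comma values arrive as strings.
conn.executemany(
    "INSERT INTO sales (item, price_raw) VALUES (?, ?)",
    [("apple", "3,99"), ("pear", "12,50")],
)

# Swap the comma for a full stop, then cast the string to a number.
conn.execute(
    "UPDATE sales SET price = CAST(REPLACE(price_raw, ',', '.') AS REAL)"
)

rows = conn.execute("SELECT item, price FROM sales ORDER BY item").fetchall()
print(rows)
```

Once the numeric column is populated and checked, the string column can be dropped.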
Alternatively, pre-process the file before loading it into Redshift (e.g. with sed), so that it is already clean when it is loaded.
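As one possible way to do that pre-processing, here is a small Python sketch (a swapped-in alternative to sed, not the tool named above). It assumes a pipe-delimited file and only rewrites fields that consist entirely of digits, one comma, and more digits, so commas inside ordinary text are left alone. The function name normalize_decimals and the sample data are hypothetical.

```python
import csv
import io
import re

# A field counts as a decimal-comma number only if it is digits,comma,digits
# (optionally with a leading minus sign).
DECIMAL_COMMA = re.compile(r"-?\d+,\d+")


def normalize_decimals(src, dst, delimiter="|"):
    """Copy delimited rows from src to dst, turning 3,99 into 3.99."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow(
            [
                field.replace(",", ".") if DECIMAL_COMMA.fullmatch(field) else field
                for field in row
            ]
        )


# Usage with in-memory text; real files opened with open(..., newline="")
# can be passed in the same way.
source = io.StringIO("apple|3,99\npear, green|12,50\n")
cleaned = io.StringIO()
normalize_decimals(source, cleaned)
print(cleaned.getvalue())
```

Values that use a thousands separator as well (such as 1.234,56) do not match this pattern and would need extra handling.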
Tags:
amazon-redshift