invalid byte sequence for encoding “UTF8”

2019-01-03 22:40发布

I'm trying to import some data into my database. So I've created a temporary table,

create temporary table tmp(pc varchar(10), lat decimal(18,12), lon decimal(18,12), city varchar(100), prov varchar(2));

And now I'm trying to import the data,

 copy tmp from '/home/mark/Desktop/Canada.csv' delimiter ',' csv

But then I get the error,

ERROR:  invalid byte sequence for encoding "UTF8": 0xc92c

How do I fix that? Do I need to change the encoding of my entire database (if so, how?) or can I change just the encoding of my tmp table? Or should I attempt to change the encoding of the file?

16条回答
疯言疯语
2楼-- · 2019-01-03 23:35

You can replace the backslash character with, for example a pipe character, with sed.

sed -i -- 's/\\/|/g' filename.txt
查看更多
别忘想泡老子
3楼-- · 2019-01-03 23:35

follow the below steps to solve this issue in pgadmin:

  1. SET client_encoding = 'ISO_8859_5';

  2. COPY tablename(column names) FROM 'D:/DB_BAK/csvfilename.csv' WITH DELIMITER ',' CSV ;

查看更多
冷血范
4楼-- · 2019-01-03 23:37

For python, you need to use

Class pg8000.types.Bytea (str) Bytea is a str-derived class that is mapped to a PostgreSQL byte array.

or

Pg8000.Binary (value) Construct an object holding binary data.

查看更多
一纸荒年 Trace。
5楼-- · 2019-01-03 23:38

I got the same error when I was trying to copy a csv generated by Excel to a Postgres table (all on a Mac). This is how I resolved it:

1) Open the File in Atom (the IDE that I use)

2) Make an insignificant change in the file. Save the file. Undo the change. Save again.

Presto! Copy command worked now.

(I think Atom saved it in a format which worked)

查看更多
登录 后发表回答