Question:
I have a large CSV file that I want to load into a MySQL table. The data is encoded as UTF-8 because it includes some non-English characters. I have already set the character set of the corresponding column in the table to utf8. But when I load the file, the non-English characters turn into weird characters (when I do a SELECT on the table rows). Do I need to encode my data before I load it into the table? If so, how can I do this? I am using Python to load the data, with the LOAD DATA LOCAL INFILE command. Thanks.
Answer 1:
As stated in http://dev.mysql.com/doc/refman/5.1/en/load-data.html, you can specify the character set used by your CSV file with the optional CHARACTER SET clause of LOAD DATA LOCAL INFILE.
Answer 2:
Try
LOAD DATA INFILE 'file'
IGNORE INTO TABLE table
CHARACTER SET UTF8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
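Since the question mentions Python, here is a minimal sketch of how a statement like the one above could be built from Python. The function name build_load_statement and the default of utf8mb4 are my own assumptions, not something from the question; in MySQL, utf8 is an alias for the 3-byte utf8mb3, so utf8mb4 is the safer choice if the data may contain 4-byte characters (the column would need to be utf8mb4 as well). The sketch also assumes a driver such as MySQLdb that uses %s placeholders.

```python
def build_load_statement(csv_path, table, charset="utf8mb4"):
    """Return (sql, params) for a LOAD DATA LOCAL INFILE with an explicit charset.

    Hypothetical helper for illustration; the delimiters mirror the answer above.
    """
    # The charset name is spliced into the SQL text, so only accept a plain token.
    if not charset.isalnum():
        raise ValueError("unexpected character set name: %r" % charset)

    # Identifiers cannot be sent as parameters; quote with backticks instead.
    quoted_table = "`" + table.replace("`", "``") + "`"

    sql = (
        "LOAD DATA LOCAL INFILE %s "
        f"IGNORE INTO TABLE {quoted_table} "
        f"CHARACTER SET {charset} "
        "FIELDS TERMINATED BY %s "
        "OPTIONALLY ENCLOSED BY %s "
        "LINES TERMINATED BY %s"
    )
    # File name, field separator, quote character, line terminator.
    params = (csv_path, ";", '"', "\n")
    return sql, params
```

With an open cursor this would be run as cursor.execute(sql, params), followed by a commit on the connection.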
Answer 3:
You do not need to encode the characters in the file yourself, but you do need to make sure the file really is encoded as UTF-8 before you load it into the database.
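One way to check this from Python is sketched below, using only the standard library. The helper names is_valid_utf8 and transcode_to_utf8 are made up for this example, and the source encoding passed to the second function is something you would have to know or guess for your own file.

```python
import codecs


def is_valid_utf8(path, chunk_size=1024 * 1024):
    """Return True if the whole file decodes as UTF-8, reading it in chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            try:
                decoder.decode(chunk)
            except UnicodeDecodeError:
                return False
    try:
        # Flush the decoder so a multi-byte sequence cut off at EOF is caught.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True


def transcode_to_utf8(src_path, dst_path, source_encoding):
    """Rewrite src_path as UTF-8, assuming it is really in source_encoding."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

If is_valid_utf8 returns False, the file is probably in some other encoding (latin-1 and cp1252 are common culprits), and it can be converted first with transcode_to_utf8 or loaded with a matching CHARACTER SET.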
Answer 4:
You should pass
init_command = 'SET NAMES UTF8'
use_unicode = True
charset = 'utf8'
when calling MySQLdb.connect(), e.g.:
import MySQLdb

dbconfig = {}
dbconfig['host'] = 'localhost'
dbconfig['user'] = ''
dbconfig['passwd'] = ''
dbconfig['db'] = ''
dbconfig['init_command'] = 'SET NAMES UTF8'
dbconfig['use_unicode'] = True
dbconfig['charset'] = 'utf8'
conn = MySQLdb.connect(**dbconfig)
edit: ah, sorry, I see you've added that you're using "LOAD DATA LOCAL INFILE" -- this wasn't clear from your initial question :)
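For the LOAD DATA LOCAL INFILE case, a rough sketch of the connection side might look like the following. Note that, per the MySQL manual, SET NAMES and character_set_client do not affect how the server interprets the input file (that is what the CHARACTER SET clause is for), but the connection charset still matters for what you get back from a SELECT. The helper names build_connect_kwargs and run_load are hypothetical, utf8mb4 is my assumption, and the server itself must also allow local_infile.

```python
def build_connect_kwargs(host, user, passwd, db, charset="utf8mb4"):
    """Collect MySQLdb.connect() keyword arguments for a LOCAL INFILE load."""
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": charset,    # connection character set (affects SELECT output)
        "use_unicode": True,   # return text columns as Python str
        "local_infile": 1,     # let the client send local files to the server
    }


def run_load(connect_kwargs, sql, params=None):
    """Execute one LOAD DATA statement and return the affected row count."""
    import MySQLdb  # imported lazily; provided by the mysqlclient package

    conn = MySQLdb.connect(**connect_kwargs)
    try:
        cursor = conn.cursor()
        cursor.execute(sql, params)
        conn.commit()
        return cursor.rowcount
    finally:
        conn.close()
```

Here sql would be a LOAD DATA LOCAL INFILE statement that includes its own CHARACTER SET clause, like the ones in the other answers.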
Answer 5:
Try something like:
LOAD DATA LOCAL INFILE "file" INTO TABLE message_history CHARACTER SET UTF8 COLUMNS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"';
For the full statement syntax, see:
https://dev.mysql.com/doc/refman/8.0/en/load-data.html