How to display Chinese characters properly in sqlite

Posted 2019-04-13 02:15

Question:

Here is a sample CSV file in UTF-8 format. It opens in Windows 7's Notepad with the Chinese characters displayed properly. Please download it:
http://pan.baidu.com/s/1sj0ia4H

Open cmd and run chcp 65001.

C:\Users\pengsir>sqlite3  e:\\test.db   
SQLite version 3.8.4.3 2014-04-03 16:53:12  
Enter ".help" for usage hints.
sqlite> create table ipo(name TEXT,method TEXT);
sqlite> .separator ","
sqlite> .import  "e:\\tmp.csv"  ipo
sqlite> select * from ipo;
000001,公开招募
000002,申请表抽签é™é¢è®¤è´­
000004,定å‘å‘è¡Œ
000005,银行储蓄存å•æ–¹å¼
000006,申请表抽签é™é¢è®¤è´­
000007,自办å‘è¡Œ
000008,自办å‘è¡Œ
000009,定å‘å‘è¡Œ
000010,定å‘å‘è¡Œ
000011,申请表抽签等é¢è®¤è´­
sqlite>

Why does the same SQLite command display properly in sqlitemanager?
And how can I get the sqlite console to display Chinese characters?

With pysqlite3, the output displays correctly in the Python console.

>>> import sqlite3  
>>> con=sqlite3.connect("e:\\test.db")   
>>> cur=con.cursor()   
>>> cur.execute("select * from ipo;")  
<sqlite3.Cursor object at 0x01751720>  
>>> print(cur.fetchall())   
[('000001', '公开招募'), ('000002', '申请表抽签限额认购'), ('000004', '定向发行'   
), ('000005', '银行储蓄存单方式'), ('000006', '申请表抽签限额认购'), ('000007',   
'自办发行'), ('000008', '自办发行'), ('000009', '定向发行'), ('000010', '定向发   
行'), ('000011', '申请表抽签等额认购')]   
>>>   

Answer 1:

This issue concerns how the Command Prompt window shows characters, not how sqlite3 prints its output.

As a simple demonstration, we can leave sqlite3 out entirely and look at the files with the type command:

Let's see what happens on another operating system, for example OS X:
ISO-8859-1 corresponds to (Windows Latin 1); the equivalent Windows code page setting is chcp 819.
UTF-8 corresponds to Unicode (UTF-8); the equivalent Windows code page setting is chcp 65001.
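A minimal sketch of why the same file looks different under the two encodings, assuming Python 3. The sample string is taken from the question's table and the variable names are only illustrative. Decoding UTF-8 bytes as ISO-8859-1 turns every byte into one character, which is the kind of garbling shown in the question:

```python
# Sample value from the question's table.
text = "定向发行"

# The bytes as they are stored in the UTF-8 CSV file.
raw = text.encode("utf-8")

# A viewer that assumes ISO-8859-1 maps every single byte to one character,
# so each Chinese character turns into three unrelated symbols.
garbled = raw.decode("latin-1")

# No information is lost: reversing the wrong decoding restores the text.
restored = garbled.encode("latin-1").decode("utf-8")

print(len(text), len(raw), len(garbled))
print(restored == text)
```

The bytes never change; only the interpretation applied by the viewer does.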

Much the same behavior occurs on Windows: use the chcp command to inspect or set your current code page.

NOTICE: this is a screenshot from an Italian Windows XP and, as you can see, there is still no luck! :-( In this case the cause is a lack of suitable fonts selectable in the Command Prompt properties on my "Windows XP" box:

I hope this is not the case on your "Windows Seven" box (but if it is, please leave me a comment so I can be more specific in this part of the answer). When the problem comes down to the available fonts, additional language support has to be installed, and you still need to force UTF-8 with chcp 65001:

How to get proper fonts

Here are the steps I followed to get the result on ITA WinXP SP2 shown in the above screenshot:

Step 1 Install East Asian language files on your computer

Further reading: to install East Asian language files on your computer

In summary, both of these options were checked, and in the "Advanced" tab I selected Chinese:

Step 2 Switch from the raster font to a Chinese font in the terminal ("Command Prompt" window)

Extra Step 3 (Optional) Check font in notepad

Notepad can be useful for inspecting fonts: for example, open temp.csv and try different fonts, but be aware of the necessary criteria for fonts to be available in a command window.



Answer 2:

Well, the obvious problem is that Windows (pretty much in general) has trouble dealing with UTF-8. In particular, the command-line tool defaults to a country-specific code page rather than Unicode.

Usually you can (temporarily) fix it by setting the code page for the command-line session to UTF-8, for example by typing:

chcp 65001
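To confirm which code page a console session is actually using, the value can also be read programmatically. A minimal sketch, assuming Python 3 with ctypes; the helper name console_output_code_page is hypothetical, and GetConsoleOutputCP is the Win32 call it relies on. On other systems the helper simply returns None:

```python
import sys


def console_output_code_page():
    """Return the active console output code page on Windows, else None."""
    if sys.platform != "win32":
        return None
    import ctypes
    # Win32 API call; it returns 0 when no console is attached.
    return ctypes.windll.kernel32.GetConsoleOutputCP()


cp = console_output_code_page()
if cp is None:
    print("Not on Windows; chcp does not apply here.")
elif cp == 65001:
    print("Console output code page is UTF-8 (65001).")
else:
    print(f"Console output code page is {cp}, not UTF-8.")
```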

But in your case this does not really fix it, since sqlite seems to still run with the default charset, and there does not seem to be any option to set the current sqlite3 session to Unicode.

Still, the good news is that your data is correct, and you can work with it using sqlitemanager or similar tools that handle Unicode appropriately.
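One way to check that the data itself is fine, without relying on any console, is to look at the stored bytes. A minimal sketch, assuming Python 3's built-in sqlite3 module and an in-memory database as a stand-in for e:\\test.db; the helper name undecodable_rows is hypothetical, and the second row is deliberately broken only to show what a failure looks like:

```python
import sqlite3


def undecodable_rows(con):
    """Return the names of ipo rows whose method column is not valid UTF-8."""
    bad = []
    # CAST(... AS BLOB) yields the raw stored bytes instead of decoded text.
    query = "SELECT name, CAST(method AS BLOB) FROM ipo"
    for name, raw in con.execute(query):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(name)
    return bad


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ipo(name TEXT, method TEXT)")
# A correctly encoded row, as imported from a UTF-8 CSV.
con.execute("INSERT INTO ipo VALUES (?, ?)", ("000001", "公开招募"))
# A deliberately invalid byte sequence stored as TEXT, for contrast.
con.execute(
    "INSERT INTO ipo VALUES (?, CAST(? AS TEXT))",
    ("000002", b"\xb9\xab"),
)

print(undecodable_rows(con))
```

Pointing the same helper at the real database file should return an empty list if the import stored valid UTF-8, which would mean only the console display is at fault.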

To further substantiate this: if you open your original CSV with Excel, it will probably also show messed-up characters (since it usually does not default to Unicode). LibreOffice, by contrast, will typically ask which encoding to use; given Unicode it will show the correct text, but given a different encoding (e.g. Western Europe) it will give you the same result as Excel (you can preview it there quite nicely, give it a shot).

Hope this helps!



Tags: sqlite3