Converting Chinese to pinyin

Posted 2019-03-09 14:11

I've found places on the web such as http://www.chinesetopinyin.com/ that convert Chinese characters to pinyin (romanization). Does anyone know how to do this, or have a database that can be parsed?

EDIT: I'm using C# but would actually prefer a database/flatfile.

Tags: parsing cjk
3 answers
SAY GOODBYE
#2 · 2019-03-09 14:47

Okay, first I used my other question to get each character's Unicode code point:

Converting chinese character to Unicode

Then I used a file like this one to convert the code point to pinyin: http://www.ic.unicamp.br/~stolfi/voynich/Notes/061/uc-to-py.tbl
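
A minimal sketch of that two-step lookup in Python, in case it helps. It assumes a hypothetical table layout in which each line holds a hexadecimal code point followed by one or more whitespace-separated readings, with `#` starting a comment; the real uc-to-py.tbl may be laid out differently, so the parsing would need adjusting. The names `load_table` and `to_pinyin` and the sample rows are made up for illustration.

```python
def load_table(lines):
    """Build {code point: [readings]} from lines like '597D hao3' (assumed layout)."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue  # no reading on this line
        code_point = int(parts[0], 16)
        table.setdefault(code_point, []).extend(parts[1:])
    return table


def to_pinyin(text, table):
    """Replace each character found in the table with its first listed reading."""
    out = []
    for ch in text:
        readings = table.get(ord(ch))  # ord() gives the Unicode code point
        out.append(readings[0] if readings else ch)
    return " ".join(out)


# Hypothetical sample rows standing in for the downloaded file.
SAMPLE = ["# code point, readings", "4F60 ni3", "597D hao3 hao4"]
table = load_table(SAMPLE)
print(to_pinyin("你好", table))
```

Taking the first reading is a simplification: characters with several readings need context (or a word-level dictionary) to pick the right one.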

Evening l夕情丶
#3 · 2019-03-09 15:10

A possible solution using Python:

I think the Unicode database (Unihan) contains pinyin romanizations for Chinese characters, but these are not included in the data of Python's unicodedata module.
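
If a flat file is preferred, the readings can be pulled out of the Unihan data directly. The sketch below assumes the tab-separated layout of Unihan_Readings.txt (code point, field name, value) and keeps only the kMandarin field; the function name `parse_kmandarin` and the embedded sample rows are illustrative, and in practice the lines would come from the downloaded file.

```python
def parse_kmandarin(lines):
    """Map each character to its kMandarin readings from Unihan-style lines."""
    readings = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3 or fields[1] != "kMandarin":
            continue  # ignore other fields such as kDefinition
        code_point = int(fields[0][2:], 16)  # "U+597D" -> 0x597D
        readings[chr(code_point)] = fields[2].split()
    return readings


# Sample rows in the assumed layout; replace with open(path, encoding="utf-8").
SAMPLE = [
    "# sample rows",
    "U+4F60\tkMandarin\tnǐ",
    "U+597D\tkMandarin\thǎo",
    "U+597D\tkDefinition\tgood, excellent, fine; well",
]
readings = parse_kmandarin(SAMPLE)
print(readings["好"])
```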

However, you can use an external library such as cjklib. For example:

# coding: UTF-8
from cjklib.characterlookup import CharacterLookup

c = u'好'

# 'T' selects the traditional Chinese locale
cjk = CharacterLookup('T')
# look up every pinyin reading known for the character
readings = cjk.getReadingForCharacter(c, 'Pinyin')
for r in readings:
    print(r)

Output:

hāo
hǎo
hào

UPDATE

cjklib comes with a standalone cjknife utility, which might help. Some usage is described here

我命由我不由天
#4 · 2019-03-09 15:10

If you use Java, you can use pinyin4j.

http://pinyin4j.sourceforge.net/
