Hiragana to Kanji converter

Posted 2019-03-27 15:00

Question:

Do you know of a C# library or a dictionary that could help me convert Hiragana to Kanji? I know about the Windows IME, but I would like to fully customize the design of the Kanji candidate list for a given Hiragana input, and that is not possible with this IME.

Example: the user types "toru", which is first converted to Hiragana as "とる". I would then like to get this list of choices:

撮る 取る 盗る

Thanks!

Answer 1:

Unfortunately, I do not know of a C# library. Everything I found involves importing native libraries, as in this SO thread: Japanese to Romaji with Kakasi

If you are willing to do so, perhaps JWPce might help.

Although it is implemented as a Japanese text editor, it also contains a dictionary function (in fact, a multitude of character lookup systems) that does what you want.

Possibly you can compile the project and then import that lookup functionality. JWPce is licensed under the GPL, and both a binary executable and the source code can be downloaded directly from the homepage.

[Edit]

Researching some more, I stumbled upon mozc at Google Code:

Mozc is a Japanese Input Method Editor (IME) designed for multi-platform such as Chromium OS, Windows, Mac and Linux. This open-source project originates from Google Japanese Input.

(BSD license)

I have not looked into it myself yet, but it might be closer to what you are looking for: it does not have a full application "around it" and is instead intended to be used as a library, just like you wanted.

They also link to a short video showing what the input looks like: http://www.google.co.jp/ime/

Unfortunately, this is still C++, not .NET, but it might be a starting point.



Answer 2:

Microsoft publishes this as a separate product called the Visual Studio International Pack:

http://visualstudiogallery.msdn.microsoft.com/74609641-70BD-4A18-8550-97441850A7A8



Answer 3:

I do not know of a C# library either. But given that a dictionary might be sufficient, you may want to look into using the IME dictionary that comes with Anthy.

If you download the sources of the most recent version, you'll find dictionary sources in the mkworddic and alt-cannadic directories. Look at the various files ending in .t.

Note that they are encoded in EUC-JP; you might want to convert them to UTF-8.
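
To make the dictionary idea concrete, here is a minimal sketch of the two steps: re-encoding an EUC-JP file as UTF-8, and building a reading-to-candidates table from it. Python is used only for brevity; the same steps apply in C#. The function names are made up for illustration, and the line layout (a reading, then tokens starting with `#` as part-of-speech or frequency tags, then candidate words) is an assumption that should be checked against the actual .t files. The sample line and its tag are invented; real entries may, for example, store verbs as stems plus a part-of-speech code, in which case looking up とる directly would need extra handling.

```python
from collections import defaultdict
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode an EUC-JP text file as UTF-8."""
    text = src.read_text(encoding="euc_jp", errors="replace")
    dst.write_text(text, encoding="utf-8")


def parse_entries(lines):
    """Build a reading -> candidate list mapping from dictionary lines.

    Assumed line layout (verify against the real files):
        <reading> #<tag> <word> [<word> ...] [#<tag> <word> ...]
    """
    table = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # blank, malformed, or comment-like line
        reading = fields[0]
        for token in fields[1:]:
            if token.startswith("#"):
                continue  # tag token, not a candidate word
            if token not in table[reading]:
                table[reading].append(token)  # keep first-seen order
    return dict(table)


def load_dictionary(path: Path):
    """Read an EUC-JP dictionary file directly into a lookup table."""
    with path.open(encoding="euc_jp", errors="replace") as handle:
        return parse_entries(handle)


# Usage with an invented sample line (the tag is hypothetical):
sample = ["とる #T35 撮る 取る 盗る"]
candidates = parse_entries(sample)
print(candidates["とる"])  # ['撮る', '取る', '盗る']
```

The resulting table only returns every candidate listed for a reading; ranking them by frequency or context, as a real IME does, is out of scope for this sketch.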