How can I determine what the alphabet for a locale is?

I would like to determine what the alphabet for a given locale is, preferably based on the browser's Accept-Language header values. Does anyone know how to do this, using a library if necessary?

Tags: java locale character-encoding

5 answers

Ridiculous、

#2 · 2019-01-23 20:14

Take a look at [LocaleData.getExemplarSet][1].

For example, for English this returns abcdefghijklmnopqrstuvwxyz.

[1]: http://icu-project.org/apiref/icu4j/com/ibm/icu/util/LocaleData.html#getExemplarSet(com.ibm.icu.util.ULocale, int)


淡お忘

#3 · 2019-01-23 20:22

If you just want to know the name of an appropriate character set for a user's locale, then you might try the java.nio.charset.Charset class.

If you really want to use the Accept-Language header, then there's an old O'Reilly article on this matter which introduces a pretty handy class called LanguageNegotiator.
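I don't have the LanguageNegotiator class from that article at hand, so here is a rough sketch of the same idea using only the JDK (Java 8+): `Locale.LanguageRange.parse` reads the raw header value and `Locale.lookup` picks the best match. The class name `AcceptLanguageDemo`, the `bestLocale` helper, and the list of supported locales are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class AcceptLanguageDemo {
    // Hypothetical helper: returns the best supported locale for a raw
    // Accept-Language header value, or the fallback if nothing matches.
    static Locale bestLocale(String header, List<Locale> supported, Locale fallback) {
        if (header == null || header.trim().isEmpty()) {
            return fallback;
        }
        // parse() orders the ranges by their q weights, highest first.
        // It throws IllegalArgumentException on a malformed header.
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(header);
        // lookup() follows RFC 4647 lookup: "de-CH" falls back to "de".
        Locale match = Locale.lookup(ranges, supported);
        return match != null ? match : fallback;
    }

    public static void main(String[] args) {
        // Assumed set of locales the application supports.
        List<Locale> supported =
                Arrays.asList(Locale.ENGLISH, Locale.GERMAN, Locale.forLanguageTag("ru"));
        Locale chosen = bestLocale("de-CH,de;q=0.9,en;q=0.8", supported, Locale.ENGLISH);
        System.out.println(chosen.toLanguageTag());
    }
}
```

In a servlet you would pass `request.getHeader("Accept-Language")` as the header argument. The resulting locale can then be handed to whichever alphabet or script lookup you settle on.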

I think one of those will give you a decent enough start.


趁早两清

#4 · 2019-01-23 20:27

This is an English answer written in Århus. Yesterday, I heard some Germans say 'Blödheit, à propos, ist dumm'. However, one of them wore a shirt that said 'I know the difference between 文字 and الْعَرَبيّة'.

What's the answer to your question for this text? Is it allowed? Isn't this an English text?


别忘想泡老子

#5 · 2019-01-23 20:30

The International Components for Unicode might help here. Specifically the UScript class looks promising.
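To give a feel for what script detection looks like, here is a small sketch. Note that it does not use ICU's UScript; it uses the JDK's own `Character.UnicodeScript` as a stand-in, which covers the same basic idea of mapping a code point to a script. The class name `ScriptScanDemo` and the `scriptsIn` helper are invented for this example.

```java
import java.util.EnumSet;
import java.util.Set;

class ScriptScanDemo {
    // Hypothetical helper: collects the scripts used by the code points of a text,
    // ignoring COMMON (spaces, punctuation, digits), INHERITED (combining marks)
    // and UNKNOWN.
    static Set<Character.UnicodeScript> scriptsIn(String text) {
        Set<Character.UnicodeScript> found = EnumSet.noneOf(Character.UnicodeScript.class);
        text.codePoints().forEach(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            if (script != Character.UnicodeScript.COMMON
                    && script != Character.UnicodeScript.INHERITED
                    && script != Character.UnicodeScript.UNKNOWN) {
                found.add(script);
            }
        });
        return found;
    }

    public static void main(String[] args) {
        // Latin text followed by two Han characters and an Arabic word,
        // written as Unicode escapes so the source file stays ASCII-only.
        String sample = "I know the difference between \u6587\u5b57 and "
                + "\u0627\u0644\u0639\u0631\u0628\u064a\u0629";
        System.out.println(scriptsIn(sample));
    }
}
```

This tells you which scripts a piece of text uses, which is not the same as the alphabet of a locale: accented Latin letters such as Å and ö still count as plain LATIN here.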

Out of curiosity: What do you need it for?


来，给爷笑一个

#6 · 2019-01-23 20:32

It depends on how specific you want to get. One place to look would be at the "Suppress-Script" properties in the IANA language registry.

Some languages have multiple "alphabets" that can be used for writing. For example, Azerbaijani can be written in Latin or Arabic script. Most languages, like English, are written almost exclusively in a single script, so the correct script goes without saying, and should be "suppressed" in language codes.

So, looking at the entry for Russian, you can tell that the preferred script is Cyrillic, while for Amharic, it is Ethiopic. But German, Norwegian, and English aren't more specific than "Latin". So, with this method, you'd have a hard time hiding umlauts and thorns from Americans, or offering any script to a Kashmiri writer.
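A minimal sketch of how that could look in Java, assuming you copy the Suppress-Script values you care about out of the registry yourself. The JDK does not expose the registry as an API as far as I know, so the tiny `SUPPRESS_SCRIPT` table below is a hand-made excerpt, and `ScriptForTagDemo` and `scriptFor` are hypothetical names. An explicit script subtag such as `az-Arab` wins over the table.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class ScriptForTagDemo {
    // Hand-copied excerpt of Suppress-Script values from the IANA language
    // subtag registry. Illustrative only; a real table would be much larger.
    static final Map<String, String> SUPPRESS_SCRIPT = new HashMap<>();
    static {
        SUPPRESS_SCRIPT.put("en", "Latn");
        SUPPRESS_SCRIPT.put("de", "Latn");
        SUPPRESS_SCRIPT.put("ru", "Cyrl");
        SUPPRESS_SCRIPT.put("am", "Ethi");
    }

    // Hypothetical helper: returns the ISO 15924 script code for a language tag,
    // or empty if the tag has no script subtag and the table has no entry.
    static Optional<String> scriptFor(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        String explicit = locale.getScript(); // "" when the tag has no script subtag
        if (!explicit.isEmpty()) {
            return Optional.of(explicit);
        }
        return Optional.ofNullable(SUPPRESS_SCRIPT.get(locale.getLanguage()));
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"ru-RU", "am", "az-Arab", "ks"}) {
            System.out.println(tag + " -> " + scriptFor(tag).orElse("(no script known)"));
        }
    }
}
```

Languages without a Suppress-Script entry, like Azerbaijani or Kashmiri, only resolve when the tag itself carries a script subtag, which is exactly the limitation described above.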
