I want to convert my Turkish strings to lowercase in both the English and Turkish locales. I'm doing this:
String myString="YAŞAR BAYRI";
Locale trlocale= new Locale("tr-TR");
Locale enLocale = new Locale("en_US");
Log.v("mainlist", "en source: " +myString.toLowerCase(enLocale));
Log.v("mainlist", "tr source: " +myString.toLowerCase(trlocale));
The output is:
en source: yaşar bayri
tr source: yaşar bayri
But I want to have an output like this:
en source: yasar bayri
tr source: yaşar bayrı
Is this possible in Java?
If you are using the Locale constructor, you can and must set the language, country and variant as separate arguments:
new Locale(language)
new Locale(language, country)
new Locale(language, country, variant)
Therefore, your test program creates locales whose language is "tr-TR" and "en_US". Neither is a recognized language code, so no Turkish-specific case rules are applied. Instead, use new Locale("tr", "TR") and new Locale("en", "US").
If you are using Java 1.7+, then you can also parse a language tag using Locale.forLanguageTag:
String myString="YAŞAR BAYRI";
Locale trlocale= Locale.forLanguageTag("tr-TR");
Locale enLocale = Locale.forLanguageTag("en-US");
Calling toLowerCase with these locales produces strings with the appropriate lowercase for each language.
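To see the difference side by side, here is a minimal sketch. The class name TurkishLowercaseDemo and the helper lower are made up for illustration. The string literals use Unicode escapes only so the file compiles regardless of source encoding.

```java
import java.util.Locale;

class TurkishLowercaseDemo {
    // Hypothetical helper: lowercases text using the case rules of the given locale.
    static String lower(String text, Locale locale) {
        return text.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String name = "YA\u015EAR BAYRI"; // YAŞAR BAYRI

        Locale turkish = new Locale("tr", "TR");         // language and country as separate arguments
        Locale english = Locale.forLanguageTag("en-US"); // language tag with a hyphen (Java 1.7+)
        Locale malformed = new Locale("tr-TR");          // the whole string becomes the "language"

        System.out.println(lower(name, english));   // yaşar bayri (dotted i)
        System.out.println(lower(name, turkish));   // yaşar bayrı (dotless ı)
        System.out.println(lower(name, malformed)); // yaşar bayri (Turkish rules not applied)
    }
}
```

Note that neither locale turns ş into s. Lowercasing only changes case, so the English result still contains ş.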
I think this is the problem:
Locale trlocale= new Locale("tr-TR");
Try this instead:
Locale trlocale= new Locale("tr", "TR");
That's the constructor to use to specify country and language.
If you just want the string in ASCII, without accents, the following might do.
First, each accented character is decomposed into an ASCII character and a combining diacritical mark (a zero-width accent). Then only those marks are removed with a regular expression replace.
import java.text.Normalizer;

public static String withoutDiacritics(String s) {
    // Decompose any ş into s followed by a combining cedilla.
    String s2 = Normalizer.normalize(s, Normalizer.Form.NFD);
    // Remove the combining marks, leaving only the base characters.
    return s2.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}
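Putting this together with an English-locale lowercase might look like the sketch below. The class name AsciiLowercaseDemo and the method asciiLower are made up for illustration, and the approach assumes every accent you care about decomposes under NFD.

```java
import java.text.Normalizer;
import java.util.Locale;

class AsciiLowercaseDemo {
    // Splits accented characters into base character + combining mark, then drops the marks.
    static String withoutDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Hypothetical helper: lowercase with English rules first, then strip the accents.
    static String asciiLower(String s) {
        return withoutDiacritics(s.toLowerCase(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        // "YA\u015EAR BAYRI" is YAŞAR BAYRI written with a Unicode escape.
        System.out.println(asciiLower("YA\u015EAR BAYRI")); // yasar bayri
    }
}
```

One caveat: the dotless ı (U+0131) has no decomposition, so withoutDiacritics leaves it unchanged. That is why the lowercase step here uses the English locale, which maps I to a plain i before any accents are stripped.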
You can do it like this:
Locale trlocale= new Locale("tr","TR");
The first parameter is your language, while the other one is your country.
The characters ş and s are different characters. Changing the locale cannot help you translate one into the other. You have to create a Turkish-to-English character table and do this yourself. I once did this for Vietnamese, which has a lot of such characters. You only have to deal with 4 or 5, right? So, good luck!
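A hand-written table of this kind could be sketched roughly as follows. The class name TurkishToAscii, the method toAscii, and the choice of replacement letters are my own assumptions, so adjust the table to your needs.

```java
class TurkishToAscii {
    // Turkish-specific letters and their ASCII stand-ins, matched by position.
    // Escapes spell out: ç Ç ğ Ğ ı İ ö Ö ş Ş ü Ü
    private static final String TURKISH =
            "\u00E7\u00C7\u011F\u011E\u0131\u0130\u00F6\u00D6\u015F\u015E\u00FC\u00DC";
    private static final String ASCII = "cCgGiIoOsSuU";

    // Replaces each character found in the table; everything else passes through unchanged.
    static String toAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int pos = TURKISH.indexOf(c);
            out.append(pos >= 0 ? ASCII.charAt(pos) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "ya\u015Far bayr\u0131" is yaşar bayrı written with Unicode escapes.
        System.out.println(toAscii("ya\u015Far bayr\u0131")); // yasar bayri
    }
}
```

Unlike the locale, the table is fully under your control, so it also covers the dotless ı.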