I need to do something rather simple, but without hard-coding a hash map.
I have a String s in Cyrillic, and I need an example of how to turn it into Latin characters using some sort of custom filter. To give a purely Latin example so as not to confuse anyone: if String s = "sniff", I want it to look up s-n-i-f-f and change each letter into something else (there may also be multi-letter combinations).
I can see that ICU4J can do this sort of thing, but I have no idea how to achieve it, as I can't find any working examples (or I'm just stupid).
Any help is appreciated.
Thanks
Best Regards,
PS: I need batch transliteration. I don't care about styles or dynamic transliteration; I just need a basic example of what an ICU4J batch transliterator would look like.
OK, I actually got it.
import com.ibm.icu.text.Transliterator;

public class BulgarianToLatin {

    // ID of one of the transliterators that ship with ICU4J
    public static final String BULGARIAN_TO_LATIN = "Bulgarian-Latin/BGN";

    public static void main(String[] args) {
        String bgString = "Джокович";
        // Look up the built-in transliterator by its ID
        Transliterator bulgarianToLatin = Transliterator.getInstance(BULGARIAN_TO_LATIN);
        // Transliterate the whole string in one call
        String result1 = bulgarianToLatin.transliterate(bgString);
        System.out.println("Bulgarian to Latin:" + result1);
    }
}
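Since the PS asks about batch transliteration, here is a minimal sketch of running several strings through one Transliterator instance. It assumes ICU4J is on the classpath; the class name BatchTransliterate and the helper transliterateAll are made up for this example and are not part of ICU4J.

```java
import com.ibm.icu.text.Transliterator;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchTransliterate {

    // Hypothetical helper: creates one Transliterator for the given ID
    // and runs every input string through it.
    static List<String> transliterateAll(String id, List<String> inputs) {
        Transliterator transliterator = Transliterator.getInstance(id);
        List<String> results = new ArrayList<>(inputs.size());
        for (String input : inputs) {
            results.add(transliterator.transliterate(input));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Джокович", "Юлиян");
        for (String latin : transliterateAll("Bulgarian-Latin/BGN", names)) {
            System.out.println(latin);
        }
    }
}
```

Creating the Transliterator once and reusing it for the whole list avoids repeating the ID lookup for every string.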
Also, one last edit for rule-based transliteration (if you do not wish to use the pre-existing ones, or just want something custom-made):
import com.ibm.icu.text.Transliterator;

public class BulgarianToLatin {

    public static void main(String[] args) {
        String bgString = "а б в г д е ж з и й к л м н о п р с т у ф х ц ч ш щ ю я \n Юлиян Джокович";

        // Only a subset of the alphabet is covered here; characters
        // without a matching rule are left unchanged.
        String rules =
                // Global filter: only touch these Cyrillic characters
                "::[А-ЪЬЮ-ъьюяѢѣѪѫ];" +
                "Б > B;" +
                "б > b;" +
                "В > V;" +
                // Multi-letter combinations
                "ТС > TS;" +
                "Тс > Ts;" +
                "ч > ch;" +
                "ШТ > SHT;" +
                "Шт > Sht;" +
                "шт > sht;" +
                // Context rule: Ш followed by a lowercase letter becomes "Sh"
                "{Ш}[[б-джзй-нп-тф-щь][аеиоуъюяѣѫ]] > Sh;" +
                "Я > YA;" +
                "я > ya;";

        // Build a transliterator from the custom rules; "temp" is just an ID for it
        Transliterator bulgarianToLatin = Transliterator.createFromRules("temp", rules, Transliterator.FORWARD);
        String result1 = bulgarianToLatin.transliterate(bgString);
        System.out.println("Bulgarian to Latin:" + result1);
    }
}
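To tie this back to the purely Latin "sniff" example from the question, here is a small sketch of the same createFromRules approach with toy rules. The class name ToyRules, the ID "toy", and the three rules themselves are invented for illustration; it assumes ICU4J is on the classpath. As far as I understand the rule syntax, rules are tried in order at each position, so the two-letter combination is listed before the single-letter rule.

```java
import com.ibm.icu.text.Transliterator;

public class ToyRules {

    // Made-up rules: the combination "ff" is listed before the single "f"
    // so that the longer match is tried first.
    static final String RULES =
            "ff > v;" +
            "f > ph;" +
            "s > z;";

    // Hypothetical helper: applies the toy rules to one string.
    static String apply(String input) {
        Transliterator toy =
                Transliterator.createFromRules("toy", RULES, Transliterator.FORWARD);
        return toy.transliterate(input);
    }

    public static void main(String[] args) {
        System.out.println(apply("sniff")); // prints "zniv"
        System.out.println(apply("fin"));   // prints "phin"
    }
}
```

Letters with no rule ("n" and "i" here) pass through unchanged, which is the same behaviour the Bulgarian rules above rely on for characters they do not list.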