Good evening, I hope you can help me with this problem, as I'm struggling to find a solution.
I have a word provider that gives me vowelled Hebrew words, for example:
Vowelled - בַּיִת not vowelled - בית
Vowelled - הַבַּיְתָה not vowelled - הביתה
Unlike my provider, my user can't normally enter Hebrew vowels (nor would I want them to). The user story is a user searching for a word among the provided words. The problem is the comparison between the vowelled and the un-vowelled words: since each is represented by a different sequence of characters in memory, the equals method returns false.
I tried looking into how UTF-8 handles Hebrew vowels, and it seems they are just ordinary characters of their own.
I do want to present the vowels to the user, so I want to keep the string as-is in memory, but when comparing I want to ignore them. Is there any simple way to solve this problem?
You can use a Collator. I can't tell you exactly how it works, as it's new to me, but this appears to do the trick:
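A minimal sketch of the Collator idea, assuming the JDK's collation rules for the `he` locale treat the vowel points as accent-level differences, so that they are ignored at `Collator.PRIMARY` strength (worth verifying on your own JDK). The class name `VowelInsensitiveCompare` and the helper `sameLetters` are made up for illustration:

```java
import java.text.Collator;
import java.util.Locale;

public class VowelInsensitiveCompare {

    // Assumption: at PRIMARY strength the Hebrew collator only looks at base
    // letters, so vowel points (niqqud) do not affect the comparison.
    private static final Collator HEBREW = createCollator();

    private static Collator createCollator() {
        Collator collator = Collator.getInstance(Locale.forLanguageTag("he"));
        collator.setStrength(Collator.PRIMARY);
        return collator;
    }

    // Hypothetical helper: true when both words have the same base letters.
    static boolean sameLetters(String a, String b) {
        return HEBREW.equals(a, b);
    }

    public static void main(String[] args) {
        String vowelled = "בַּיִת";
        String plain = "בית";

        // Plain String.equals compares every character, vowels included.
        System.out.println("String.equals:   " + vowelled.equals(plain));
        // The collator compares according to its strength setting.
        System.out.println("Collator.equals: " + sameLetters(vowelled, plain));
    }
}
```

The stored strings are left untouched, so the vowelled form can still be shown to the user; only the comparison changes.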
From that, I get the following output:
AFAIK there isn't. Vowels are characters in their own right. Even some combinations of letters and dots are separate characters. See the Wikipedia page:
http://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
You can store the search key for each word using only characters in the 05Dx-05Ex range (the Hebrew letters, U+05D0 to U+05EA), and add another field that holds the word with its vowels.
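As a rough sketch of that search-key idea: keep only the characters from U+05D0 to U+05EA as the key and store the vowelled word next to it. The class `HebrewSearchKey`, the method `searchKey`, and the map-based index are illustrative names, not part of any library:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HebrewSearchKey {

    // Keeps only Hebrew letters (U+05D0..U+05EA); vowel points, dagesh and
    // anything else outside that range are dropped from the key.
    static String searchKey(String word) {
        StringBuilder key = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c >= '\u05D0' && c <= '\u05EA') {
                key.append(c);
            }
        }
        return key.toString();
    }

    public static void main(String[] args) {
        // Index: un-vowelled key -> vowelled words as received from the provider.
        Map<String, List<String>> index = new HashMap<>();
        for (String vowelled : new String[] {"בַּיִת", "הַבַּיְתָה"}) {
            index.computeIfAbsent(searchKey(vowelled), k -> new ArrayList<>())
                 .add(vowelled);
        }

        // The user types the word without vowels; the same key function is
        // applied so that input with or without vowels finds the same entry.
        String userInput = "בית";
        System.out.println(index.getOrDefault(searchKey(userInput), List.of()));
    }
}
```

Several vowelled words can share the same letters, which is why the sketch maps each key to a list rather than to a single word.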
Of course you should be expecting the following: