I need to intercept an emoticon as it is entered and replace it with my own emoticon.
When I intercept an emoticon, for example FACE WITH MEDICAL MASK (U+1F604), I get a UTF-16 char pair (0xD83D 0xDE04). Is it possible to convert this char value to the Unicode value?
I need to convert 0xD83D 0xDE04 to U+1F604 (0x1F604).
Thanks,
"I get a UTF-16 char (0xD83D 0xDE04). Is it possible to convert this char value to the Unicode value?"
For just a single code point in a string, you can convert it to an integer with:
int codepoint = "\uD83D\uDE04".codePointAt(0); // 0x1F604
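If the two code units arrive as separate char values rather than inside a String, Character.toCodePoint combines them. A minimal sketch follows, with the surrogate arithmetic also spelled out by hand; the class name SurrogateDemo and the helper combine are illustrative, not part of any API:

```java
public class SurrogateDemo {
    // Combine a high/low surrogate pair by hand.
    // This mirrors what Character.toCodePoint does for a valid pair.
    static int combine(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        char high = '\uD83D';
        char low = '\uDE04';

        int viaLibrary = Character.toCodePoint(high, low);
        int byHand = combine(high, low);

        // Both values are 0x1F604.
        System.out.println(String.format("U+%X U+%X", viaLibrary, byHand));
    }
}
```

Neither version validates its input; Character.isSurrogatePair(high, low) can be checked first if the chars might not form a valid pair.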
It is, however, quite tedious to go over a whole string with codePointCount/codePointAt. Java/Dalvik's String type is strongly tied to UTF-16 code units, and the codePoint methods are a poorly integrated afterthought. If you simply want to replace an emoji with some other string of characters, you are probably best off doing a plain string replace or regex with the two code units as they appear in the String type, e.g. text.replace("\uD83D\uDE04", ":-D").
(BTW Face with medical mask is U+1F637.)
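For comparison, here is a rough sketch of both approaches mentioned above: walking a string one code point at a time, and the plain replace on the two code units. The class name EmojiReplace and the helper replaceCodePoint are made up for illustration:

```java
public class EmojiReplace {
    // Replace every occurrence of one code point with a replacement string,
    // stepping through the input by code point rather than by char.
    static String replaceCodePoint(String text, int target, String replacement) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp == target) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
            // Advance by 1 char for BMP characters, 2 for surrogate pairs.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "hi \uD83D\uDE04!";

        // Code point loop.
        System.out.println(replaceCodePoint(text, 0x1F604, ":-D"));

        // Plain replace on the surrogate pair; gives the same result here.
        System.out.println(text.replace("\uD83D\uDE04", ":-D"));
    }
}
```

The loop is only worth the extra code when the code point values themselves are needed, for instance to look up a replacement in a table keyed by code point.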
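The code point value is the same number as the UTF-32 encoding of the emoticon, so U+1F604 is the four bytes 00 01 F6 04 in UTF-32BE. You can convert this way (shown here for U+1F637, the medical mask):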
byte[] bytes = "\uD83D\uDE37".getBytes("UTF-32BE");
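To get from those four bytes back to a single integer, one option is to read them as a big-endian int. This is a sketch only, assuming the runtime provides the UTF-32BE charset (standard JDKs do); the class name Utf32Demo and the helper firstCodePointViaUtf32 are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

public class Utf32Demo {
    // Encode s as UTF-32BE and read the first 4 bytes back as an int.
    // Assumes s is non-empty; an empty string would cause a BufferUnderflowException.
    static int firstCodePointViaUtf32(String s) {
        byte[] bytes = s.getBytes(Charset.forName("UTF-32BE"));
        // ByteBuffer reads big-endian by default, matching UTF-32BE.
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        int cp = firstCodePointViaUtf32("\uD83D\uDE37");
        System.out.println(String.format("U+%X", cp)); // U+1F637
    }
}
```

For a single character this is a roundabout route compared with codePointAt(0), but it shows that the UTF-32 code unit and the code point are the same value.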