
How to detect if a string contains any Right-to-Le

Posted 2020-08-21 03:04

Question:

I'm trying to write a method in Java that detects strings written in right-to-left languages. I came across a question about doing something similar in C#.
Now I need something like that, but written in Java.
Any help is appreciated.

Answer 1:

I came up with the following code:

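// Naive check: only the Arabic block (U+0600 to U+06FF) is tested.
boolean containsRtl = false;
for (char c : s.toCharArray()) {
    if (c >= 0x600 && c <= 0x6ff) {
        // Text contains an RTL character
        containsRtl = true;
        break;
    }
}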

It's not very efficient or, for that matter, accurate (the range covers only the Arabic block, so Hebrew and other RTL scripts are missed), but it may give you some ideas.

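The range check above can be widened to more scripts without hard-coding hex ranges. The following is only a sketch: the class name `RtlBlockCheck`, the method name `containsRtlBlockChar`, and the choice of blocks are assumptions made for illustration, and the block list is deliberately incomplete. It requires Java 9 or later for `Set.of`.

```java
import java.util.Set;

final class RtlBlockCheck {
    // Assumption: a hand-picked, incomplete set of blocks used by RTL scripts.
    private static final Set<Character.UnicodeBlock> RTL_BLOCKS = Set.of(
            Character.UnicodeBlock.HEBREW,
            Character.UnicodeBlock.ARABIC,
            Character.UnicodeBlock.SYRIAC,
            Character.UnicodeBlock.THAANA);

    // Returns true if any code point of s belongs to one of the blocks above.
    static boolean containsRtlBlockChar(String s) {
        if (s == null) {
            return false;
        }
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            // UnicodeBlock.of returns null for code points outside any defined block.
            Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
            if (block != null && RTL_BLOCKS.contains(block)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }
}
```

Like the original loop, this looks at which block a character lives in rather than at its actual bidirectional class, so it remains an approximation.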


Answer 2:

The question is old, but someone else might have the same problem...

After trying several solutions, I found one that works for me:

// Note: only the first character is inspected, so the string must be non-empty.
byte d = Character.getDirectionality(string.charAt(0));
if (d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
    || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE) {

    // it is an RTL string
}


Answer 3:

Here's an improved version of Darko's answer:

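public static boolean isRtl(String string) {
    if (string == null) {
        return false;
    }

    for (int i = 0, n = string.length(); i < n; ++i) {
        byte d = Character.getDirectionality(string.charAt(i));

        // The first strongly directional character decides the result.
        switch (d) {
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING:
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE:
                return true;

            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING:
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE:
                return false;

            default:
                // Neutral or weak character (space, digit, punctuation): keep scanning.
                break;
        }
    }

    // No strongly directional character was found.
    return false;
}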

This code works for me for all of the following cases:

בוקר טוב               => true
good morning בוקר טוב  => false
בוקר טוב good morning  => true
good בוקר טוב morning  => false
בוקר good morning טוב  => true
(בוקר טוב)             => true

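The isRtl method above reports the direction of the first strongly directional character, which is why "good morning בוקר טוב" yields false. If the goal is instead to know whether the string contains any RTL character at all, a variant could scan every code point. This is a minimal sketch, not part of the original answer; the names `RtlScan` and `containsAnyRtl` are made up for illustration.

```java
final class RtlScan {
    // Returns true if at least one code point in s has an RTL directionality.
    static boolean containsAnyRtl(String s) {
        if (s == null) {
            return false;
        }
        // codePoints() iterates by code point, so surrogate pairs are handled.
        return s.codePoints().anyMatch(RtlScan::isRtlCodePoint);
    }

    private static boolean isRtlCodePoint(int cp) {
        byte d = Character.getDirectionality(cp);
        return d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
                || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
                || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE;
    }
}
```

With this variant, every test string listed above that contains Hebrew would return true, regardless of where the Hebrew appears.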

Answer 4:

Maybe this will help:

http://en.wikipedia.org/wiki/Right-to-left_mark

When RTL text is present, the string may contain the Unicode right-to-left mark, U+200F, although nothing guarantees that it does.
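Looking for that mark is a one-line test. The sketch below uses the hypothetical names `RlmCheck` and `containsRlm`; note that a false result says nothing about the text's direction, because plain Hebrew or Arabic text usually carries no such mark.

```java
final class RlmCheck {
    // U+200F RIGHT-TO-LEFT MARK, an invisible formatting character.
    static final char RIGHT_TO_LEFT_MARK = '\u200F';

    // Returns true only if s explicitly contains the right-to-left mark.
    static boolean containsRlm(String s) {
        return s != null && s.indexOf(RIGHT_TO_LEFT_MARK) >= 0;
    }
}
```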

Regards