Java case insensitive localized ordering

Posted 2019-05-15 00:37

Question:

I have a set of hyphenated strings that I want to sort according to the locale.

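import java.text.Collator;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

List<String> words = Arrays.asList("App - Small", "Apple", "App - Big");

Collator collator = Collator.getInstance(new Locale("en"));

// Sort Method 1: plain case-insensitive comparison, not locale-aware
Collections.sort(words, String.CASE_INSENSITIVE_ORDER);
System.out.println(words);

// Sort Method 2: locale-aware collation that ignores case and accent differences
collator.setStrength(Collator.PRIMARY);
Collections.sort(words, collator);
System.out.println(words);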

Result

String.CASE_INSENSITIVE_ORDER

[App - Big, App - Small, Apple]  

Collator.PRIMARY

[App - Big, Apple, App - Small]

Although Collator.PRIMARY is supposed to give a case-insensitive sort, the two methods above produce different orders. How can I get a locale-based, case-insensitive sort order that also works with hyphens?

[App - Big, App - Small, Apple] - Expected sort order

Answer 1:

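The result depends not only on the strength but also on the collation rules. The hyphen is a rule syntax character, so in a rule string it has to be enclosed in single quotes ('-'); do that and you will get the desired output.

Below is the relevant quote from the API documentation.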

The definitions of the rule elements is as follows:

  • Text-Argument: A text-argument is any sequence of characters, excluding special characters (that is, common whitespace characters [0009-000D, 0020] and rule syntax characters [0021-002F, 003A-0040, 005B-0060, 007B-007E]). If those characters are desired, you can put them in single quotes (e.g. ampersand => '&'). Note that unquoted white space characters are ignored; e.g. b c is treated as bc.

http://docs.oracle.com/javase/7/docs/api/java/text/RuleBasedCollator.html#compare(java.lang.String, java.lang.String)
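
To make the quoting rule concrete, here is a minimal sketch of a RuleBasedCollator built from a hand-written rule string in which the space and the hyphen are quoted and placed before the letters. The class name QuotedRuleDemo, the helper names, and the toy a-z alphabet are assumptions made for illustration; this is not the English locale's rule set, and characters outside the toy alphabet fall back to the collator's handling of unmapped characters.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: a toy alphabet, not the real English rules.
class QuotedRuleDemo {

    // Builds the rule string "< ' ' < '-' < a,A < b,B ... < z,Z".
    // The space and the hyphen are quoted because they are otherwise
    // treated as ignored whitespace / rule syntax characters.
    static String toyRules() {
        StringBuilder rules = new StringBuilder("< ' ' < '-'");
        for (char c = 'a'; c <= 'z'; c++) {
            // "x,X" makes upper case a tertiary (case-only) variant of x.
            rules.append(" < ").append(c).append(',').append(Character.toUpperCase(c));
        }
        return rules.toString();
    }

    // Returns a sorted copy of the input using the toy rules at PRIMARY
    // strength, so case differences are not taken into account.
    static List<String> sortWithToyRules(List<String> input) throws ParseException {
        RuleBasedCollator collator = new RuleBasedCollator(toyRules());
        collator.setStrength(Collator.PRIMARY);
        List<String> copy = new ArrayList<>(input);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) throws ParseException {
        List<String> words = Arrays.asList("App - Small", "Apple", "App - Big");
        System.out.println(sortWithToyRules(words));
    }
}
```

Because the quoted space is given a primary weight lower than any letter in these rules, "App - ..." entries should compare before "Apple". A real solution would need rules that cover the full character repertoire of the target locale.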



Answer 2:

There is no case-sensitivity issue involved. The collator ignores spaces and hyphens, so, since all the strings start with “App”, the significant letters in your example are “S”, “l”, and “B”, and the resulting order “B”, “l”, “S” is correct.
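
If the separators should count, one possible workaround (a different technique from editing the collation rules) is to split each string on runs of spaces and hyphens and compare the pieces one by one with the locale's collator. The sketch below assumes that word-by-word comparison is acceptable for the data; the class name WordwiseCollation and its methods are hypothetical names chosen for this example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: compares strings word by word with a locale collator.
class WordwiseCollation {

    static Comparator<String> wordwise(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY); // ignore case and accents
        return (left, right) -> {
            // Runs of whitespace and hyphens act as word separators.
            String[] a = left.trim().split("[\\s-]+");
            String[] b = right.trim().split("[\\s-]+");
            int shared = Math.min(a.length, b.length);
            for (int i = 0; i < shared; i++) {
                int result = collator.compare(a[i], b[i]);
                if (result != 0) {
                    return result;
                }
            }
            // All shared words are equal: the string with fewer words sorts first.
            return Integer.compare(a.length, b.length);
        };
    }

    // Returns a sorted copy so that fixed-size or shared lists are left untouched.
    static List<String> sortWordwise(List<String> input, Locale locale) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(wordwise(locale));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("App - Small", "Apple", "App - Big");
        System.out.println(sortWordwise(words, Locale.ENGLISH));
    }
}
```

Here "App" is compared with "Apple" as a whole word and sorts first because it is a prefix, so both "App - ..." entries come before "Apple". Note that this ordering is not identical to String.CASE_INSENSITIVE_ORDER in general; it only agrees with it for inputs like the ones in the question.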