Easiest way of checking if a string consists of un-第2页回答

I need to check in Java if a word consists of unique letters (case insensitive). As straight solution is boring, I came up with:

For every char in a string check if indexOf(char) == lastIndexOf(char).
Add all chars to HashSet and check if set size == string length.
Convert a string to a char array, sort it alphabetically, loop through array elements and check if c[i] == c[i+1].

Currently I like #2 the most, seems like the easiest way. Any other interesting solutions?

标签： java algorithm string

12条回答

做个烂人

2楼-- · 2020-03-01 18:37

Here's the code I wrote for Kache's answer(referred from cracking the code and modified):

public boolean check() {
    int[] checker = new int[8];
    String inp = "!a~AbBC#~";
    boolean flag = true;
    if (inp.length() > 256)
        flag = false;
    else {
        for(int i=0;i<inp.length();i++) {
            int x = inp.charAt(i);
            int index = x/32;
            x = x%32;
            if((checker[index] & (1<<x)) > 0) { 
                flag = false;
                break;
            }
            else
                checker[index] = checker[index] | 1<<x;
        }
    }
    return flag;
}

0人赞添加讨论(0) 举报

Fickle 薄情

3楼-- · 2020-03-01 18:39

public boolean hasUniqChars(String s){
  Hashset h = new Hashset();
  HashSet<Character> h = new HashSet<Character>();
  for (char c : s.toCharArray()) {
  if (!h.add(Character.toUpperCase(c))) // break if already present
    return false;
  }
  return true;
}

You should use hashset technique if you are performing char sets like utf-8 and for internationalization's sake.

Javadoc on Character.toUpperCase for cases of utf: This method (toUpperCase(char) ) cannot handle supplementary characters. To support all Unicode characters, including supplementary characters, use the toUpperCase(int) method.

0人赞添加讨论(0) 举报

乱世女痞

4楼-- · 2020-03-01 18:42

You can optimize the first solution(indexof == lastindexof) just by checking the condition for all the 26 alphabets i.e. for a, b, c, d,..,z. So this way you don't have to traverse all the string.

0人赞添加讨论(0) 举报

在下西门庆

5楼-- · 2020-03-01 18:49

Option 2 is the best of the three - Hashing is faster than searching.

However, there's an even faster method, if you have enough memory for it.

Take advantage of the fact that a character set is limited and already enumerated, and keep track of what's appeared and what hasn't as you check each character.

For example, if you're using one-byte chars, there are only 256 possibilities. You would only need 256 bits to keep track as you read through the string. If the character 0x00 occurs, flip the first bit. If the character 0x05 occurs, flip the sixth bit, and so on. When an already-flipped bit is encountered, the string isn't unique.

It's worst case O(min(n, m)) where n is the length of the string, and m is the size of the character set.

And of course, as I saw in another person's comment, if n > m (i.e. length of string > size of character set), then by pigeon-hole principle, there is a repeated character, determinable in O(1) time.

0人赞添加讨论(0) 举报

一纸荒年 Trace。

6楼-- · 2020-03-01 18:50

I would suggest a variant of (2) - use an array of "character already seen" flags instead of a hashset. As you loop through the string, exit immediately if the current character was already seen.

If you have a bitvector class available (I forget whether Java provides one), you could use that, though the memory saving will not necessarily result in any speed improvement and could easily slow things down.

It's O(n) worst case, though, and could have far better average performance depending on your strings - you may well find that most have a repeat near the start. In fact, strictly speaking, it's O(1) worst case anyway since a string longer than the size of the character set must have repeated characters so you have a constant bound to the number of characters you need to check in each string.

0人赞添加讨论(0) 举报

淡お忘

7楼-- · 2020-03-01 18:52

I like the HashSet idea. It's conceptually simple, and only does one pass through the string. For a simple performance improvement, check the add return value. One thing you should be aware of is that this works by case-folding. in one direction. You could create a wrapper class around Character with different equals semantics to really be case-insensitive.

Interestingly, Apache Commons has a CaseInsensitiveMap (src) which works by upper-casing then lower-casing the key. As you probably know, Java's HashSet is backed by a HashMap.

public static boolean allUnique(String s)
{
  // This initial capacity can be tuned.
  HashSet<Character> hs = new HashSet<Character>(s.length());
  for(int i = 0; i < s.length(); i++)
  {
    if(!hs.add(s.charAt(i).toUpperCase())
      return false;
  }
  return true;
}

0人赞添加讨论(0) 举报

上一页 1 2

Easiest way of checking if a string consists of un

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间