Easiest way of checking if a string consists of un

I need to check in Java whether a word consists of unique letters (case insensitive). Since the straightforward solution is boring, I came up with:

1. For every char in the string, check if indexOf(char) == lastIndexOf(char).
2. Add all chars to a HashSet and check if set size == string length.
3. Convert the string to a char array, sort it alphabetically, loop through the array elements and check if c[i] == c[i+1].

Currently I like #2 the most, seems like the easiest way. Any other interesting solutions?
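For comparison, here is a minimal sketch of the three ideas side by side. The class and method names are made up for illustration, and it assumes that lower-casing with Locale.ROOT is an acceptable reading of "case insensitive".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; the names are illustrative only.
final class UniqueLetterOptions {

    // Idea 1: a char is unique if its first and last positions coincide.
    static boolean byIndexOf(String word) {
        String s = word.toLowerCase(Locale.ROOT);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (s.indexOf(c) != s.lastIndexOf(c)) {
                return false;
            }
        }
        return true;
    }

    // Idea 2: a set drops duplicates, so its size matches only if all chars differ.
    static boolean bySetSize(String word) {
        String s = word.toLowerCase(Locale.ROOT);
        Set<Character> seen = new HashSet<>();
        for (char c : s.toCharArray()) {
            seen.add(c);
        }
        return seen.size() == s.length();
    }

    // Idea 3: after sorting, equal chars end up next to each other.
    static boolean bySorting(String word) {
        char[] chars = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(chars);
        for (int i = 0; i + 1 < chars.length; i++) {
            if (chars[i] == chars[i + 1]) {
                return false;
            }
        }
        return true;
    }
}
```

All three should agree on any input, e.g. each returns false for "Java" (the 'a' repeats) and true for "word".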

Tags: java algorithm string

12 answers

Melony?

#2 · 2020-03-01 18:32

By "unique letters" do you mean merely the standard English set of 26, or are you allowing interesting Unicode? What result do you expect if the string contains a non-letter?

If you're only considering 26 possible letters, and you want to either ignore any non-letter or consider it an automatic fail, the best algorithm is likely this pseudocode:

create present[26] as an array of booleans.
set all elements of present[] to false.
loop over characters of your string
  if character is a letter
    if corresponding element of present[] is true
      return false.
    else
      set corresponding element of present[] to true.
    end if
  else
    handle non-letters
  end if
end loop

The only remaining question is whether your array should actually be an array (requiring 26 operations to zero), or a bitfield (possibly requiring more work to check/set, but can be zeroed in a single operation). I think that the bitfield access will be pretty much comparable to the array lookup, if not faster, so I expect a bitfield is the right answer.
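A rough Java rendering of the pseudocode above, using the plain boolean array variant. It assumes only the 26 ASCII letters count and that non-letters are simply ignored; the class and method names are made up for the example.

```java
// Hypothetical helper class; the names are illustrative only.
final class LetterPresence {

    // Assumption: only a-z / A-Z count as letters, anything else is ignored.
    static boolean hasUniqueLetters(String s) {
        boolean[] present = new boolean[26]; // Java initialises every element to false
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c - 'A' + 'a'); // fold upper case onto lower case
            }
            if (c >= 'a' && c <= 'z') {
                int idx = c - 'a';
                if (present[idx]) {
                    return false; // this letter was seen before
                }
                present[idx] = true;
            }
            // else: non-letter, ignored under the assumption above
        }
        return true;
    }
}
```

To treat a non-letter as an automatic fail instead, the ignored branch would return false.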


够拽才男人

#3 · 2020-03-01 18:34

First check whether the length of the string is <= 26. If it is not, the string must contain duplicates, so return immediately. Otherwise, try adding each character to a HashSet: if an add fails, the string has duplicates, so return. If the size of the HashSet equals the length of the string, the string has unique characters. If we are not allowed to use any other data structure or the String's built-in methods, and still have to do it in O(n), then loop through the string and, for each index i, return "duplicates exist" if i != myLastIndexOf(i).
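A sketch of the second variant (no extra data structure), assuming the input holds only the letters a-z in either case. myLastIndexOf is written by hand here as a stand-in for the helper the answer mentions; its signature is a guess.

```java
// Hypothetical helper class; the names are illustrative only.
final class BoundedScan {

    static final int ALPHABET_SIZE = 26; // assumption: letters a-z only

    // Hand-written, case-insensitive replacement for String.lastIndexOf.
    // target is expected to be lower case already.
    static int myLastIndexOf(String s, char target) {
        for (int i = s.length() - 1; i >= 0; i--) {
            if (Character.toLowerCase(s.charAt(i)) == target) {
                return i;
            }
        }
        return -1;
    }

    static boolean hasUniqueLetters(String word) {
        // Pigeonhole: more characters than letters forces a repeat.
        if (word.length() > ALPHABET_SIZE) {
            return false;
        }
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            // If the last occurrence is not this position, c occurs again later.
            if (myLastIndexOf(word, c) != i) {
                return false;
            }
        }
        return true;
    }
}
```

The nested scan is quadratic in general, but after the length check it never sees more than 26 characters, so the total work stays bounded.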


我命由我不由天

#4 · 2020-03-01 18:35

An improvement on option 2 is to check the boolean flag that the HashSet add method returns. It's true if the object wasn't already there. Though, for this method to be at all useful you'd first have to set the string to all caps or lowercase.
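A minimal sketch of that idea, assuming lower-casing with Locale.ROOT is acceptable. Set.add returns false when the element is already present, so the loop can stop at the first repeat. The class and method names are made up for the example.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; the names are illustrative only.
final class EarlyExitSet {

    static boolean hasUniqueChars(String word) {
        Set<Character> seen = new HashSet<>();
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            // add() returns false if c was already in the set
            if (!seen.add(c)) {
                return false;
            }
        }
        return true;
    }
}
```

For "Hello" this returns false as soon as the second 'l' is reached, without looking at the 'o'.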


乱世女痞

#5 · 2020-03-01 18:35

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

class unique
{
    // Convert each character of the string to its numeric code.
    public static int[] ascii(String s)
    {
        int length = s.length();
        int[] asci = new int[length];
        for (int i = 0; i < length; i++)
        {
            asci[i] = (int) s.charAt(i);
        }
        return asci;
    }

    // Insertion sort over the first l elements of a (sorts in place).
    public static int[] sort(int[] a, int l)
    {
        for (int j = 1; j < l; j++)
        {
            int temp = a[j];
            int k = j - 1;
            while (k >= 0 && temp < a[k])
            {
                a[k + 1] = a[k];
                k--;
            }
            a[k + 1] = temp;
        }
        return a;
    }

    // Returns true if no two neighbours in the sorted array are equal.
    public static boolean compare(int[] a)
    {
        // i + 1 < a.length also covers empty and one-element arrays,
        // which would otherwise need an array of negative size
        for (int i = 0; i + 1 < a.length; i++)
        {
            if (a[i] == a[i + 1])
            {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException
    {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("Enter your String.....");
        String str = br.readLine();
        if (str == null)
        {
            str = ""; // end of input reached without a line
        }
        str = str.toLowerCase();
        int[] asc = ascii(str);
        int[] comp = sort(asc, asc.length);
        if (compare(comp))
        {
            System.out.println("The Given String is Unique");
        }
        else
        {
            System.out.println("The Given String is not Unique");
        }
    }
}


贼婆χ

#6 · 2020-03-01 18:36

I don't like 1. -- it's an O(N²) algorithm. Your 2. is roughly linear, but always traverses the entire string. Your 3. is O(N lg₂ N), with (probably) a relatively high constant -- probably almost always slower than 2.

My preference, however, would be when you try to insert a letter into the set, check whether it was already present, and if it was, you can stop immediately. Given random distribution of letters, this should require scanning only half the string on average.

Edit: both comments are correct that exactly what portion of the string you expect to scan will depend on the distribution and the length -- at some point the string is long enough that a repeat is inevitable, and (for example) one character short of that, the chance is still pretty darned high. In fact, given a flat random distribution (i.e., all characters in the set are equally likely), this should fit closely with the birthday paradox, meaning the chance of a collision is related to the square root of the number of possible characters in the character set. Just for example, if we assumed basic US-ASCII (128 characters) with equal probability, we'd reach a 50% chance of a collision at around 14 characters. Of course, in real strings we could probably expect it sooner than that, since the ASCII characters aren't used with anywhere close to equal frequency in most strings.
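The birthday-paradox figure can be checked with a few lines of Java. This is a sketch under the same flat-distribution assumption as above (every character equally likely and independent); the class and method names are made up for the example.

```java
// Hypothetical helper class; the names are illustrative only.
final class BirthdayEstimate {

    // Probability that n characters drawn uniformly at random from an
    // alphabet of alphabetSize symbols contain at least one repeat.
    static double collisionProbability(int alphabetSize, int n) {
        if (n > alphabetSize) {
            return 1.0; // pigeonhole: a repeat is certain
        }
        double allDistinct = 1.0;
        for (int i = 0; i < n; i++) {
            // the (i+1)-th character must avoid the i already used
            allDistinct *= (double) (alphabetSize - i) / alphabetSize;
        }
        return 1.0 - allDistinct;
    }

    // Smallest length at which a repeat is at least as likely as not.
    static int lengthForEvenOdds(int alphabetSize) {
        int n = 0;
        while (collisionProbability(alphabetSize, n) < 0.5) {
            n++;
        }
        return n;
    }
}
```

BirthdayEstimate.lengthForEvenOdds(128) gives 14, in line with the figure quoted above, and the classic 365-day case gives 23.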


萌系小妹纸

#7 · 2020-03-01 18:36

What about using an int to store the bits corresponding to the index of each letter of the alphabet? Or maybe a long, to be able to reach 64 distinct symbols.

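long mask = 0L;
// normalise to lower case; assumes the string contains only the letters a-z
string = string.toLowerCase();
for (int i = 0; i < string.length(); ++i)
{
  // 1L keeps the shift in 64-bit arithmetic
  long bit = 1L << (string.charAt(i) - 'a');
  // parentheses are required: == binds tighter than & in Java
  if ((mask & bit) != 0)
    return false;

  mask |= bit;
}
return true;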

This can stop before the end of the string on average (it returns at the first repeat) and is O(n) in the worst case. But I'm not sure how performant bitwise operations are in Java.

