how to find the most frequent character in a big s

2019-08-08 18:41发布

问题:

i'm working on an assignment that i have to find the most four frequent characters in a string. i write this so far.

import java.util.Scanner;
public class FindTheMostOccur
{
   public static void main (String[] args) 
   {
      String input;
      String example = "how to find the most frequent character in a big string using    java?";

      String[] array = new String[example.length()];

      for (int i = 97; i < 123; i++)

      {
         int mostFrequent =0;

         for( int j = 0; j < example.length(); j++)
         {
            if(example.charAt(j) == i)
            {
               ++mostFrequent;
            }

         }

            System.out.println( (char)i + " is showing " + mostFrequent+ " times ");

      }
   }
}

Here is the output for this example.

a is showing 5 times

b is showing 1 times

c is showing 2 times

d is showing 1 times

e is showing 4 times

f is showing 2 times

g is showing 3 times

h is showing 3 times

i is showing 5 times

j is showing 1 times

k is showing 0 times

l is showing 0 times

m is showing 1 times

n is showing 5 times

o is showing 3 times

p is showing 0 times

q is showing 1 times

r is showing 4 times

s is showing 3 times

t is showing 6 times

u is showing 2 times

v is showing 1 times

w is showing 1 times

x is showing 0 times

y is showing 0 times

in this examle : t, a,i,n

I DON"T NEED SOMEONE TO COMPLETE THE PROGRAM FOR ME, however i need some ideas how to find the most four frequent character in this example.

回答1:

The simplest algorithm I can think of off hand is that instead of doing multiple passes do a single pass.

Create a HashMap mapping from character to count.

Each time you find a character if it is not in the map, add it with value 1. If it is in the map increment the count.

At the end of the loop your HashMap now contains the count for every character in the String.

Take the EntrySet from the HashMap and dump it into a new ArrayList.

Sort the ArrayList with a custom comparator that uses entry.getValue() to compare by.

The first (or last depending on the sort direction) values in the ArrayList will be the ones you want.



回答2:

How about this?

int count[] = new int[1000];// all with zero

Then for each character from the string, do count[]++ like this way

count['c']++;
count['A']++;

At the end, find out which index holds the maximum value. Then just print the ascii of that index.



回答3:

Ok, here're some ideas:

  • To find the 4 most frequent characters, you must first know the frequency for all the characters. So, you would need to store the frequency somewhere. Currently, you are just printing the frequency of each character. You can't compare that way. So, you need a data structure.

  • Here we are talking about mapping from a chararacter to its count. Possibly you need a hash table. Look out for the hash table implementation in Java. You will find HashMap, LinkedHashMap, TreeMap.

  • Since you want the 4 most frequent characters, you need to order the characters by their frequency. Find out what kind of map stores the elements in sorted order. Note: You need to sort the map by values, not keys. And sort it in descending order, which is obvious.

Using Array:

  • There is another approach you can follow. You can create an array of size 26, assuming you only want to count frequency of alphabetic characters.

    int[] frequency = new int[26];
    
  • Each index of that array correspond to the frequency of a single character. Iterate over the string, and increment the index corresponding to the current character by 1.

  • You can do character to index mapping like this:

    int index = ch - 'a';
    

    So, for character 'a', the index will be 0, for character 'b', index will be 1, so on. You might also want to take care of case sensitivity.

Problem with the array approach is, once you sort the array to get 4 most frequent character, you've lost the character to index mapping. So, you would need to have some way to sort the indices along with the frequency at those indices. Yes you're right. You need another array here.

But, will having 2 arrays make your problem easy? No. Because it's difficult to sort 2 arrays in synchronization. So what to do? You can create a class storing index and character. Maintain an array of that class reference, of size 26. Then find your way out to port the array approach to this array.



回答4:

What data structures are you allowed to use? I'm thinking that you can use a priority queue and increment the priority of the node containing the character that is being repeated. And when you are done, the first index would contain the node with the most repeated character



回答5:

I got another idea that you can do it using an array only. Well you have 26 letters in the Alphabet. The ASCII code of the lowercase letters range from 97 to 122(you can make every letter a lower case using .lowercase() or something similar). So for each character, get its ASCII code, do the ASCII code -97 and increment it. Simple,and only need an array.



标签: java string char