Find the most common String in ArrayList()

2020-02-03 07:28发布

问题:

Is there a way to find the most common String in an ArrayList?

ArrayList<String> list = new ArrayList<>();
list.add("test");
list.add("test");
list.add("hello");
list.add("test");

Should find the word "test" from this list ["test","test","hello","test"]

回答1:

Don't reinvent the wheel and use the frequency method of the Collections class:

public static int frequency(Collection<?> c, Object o)

Returns the number of elements in the specified collection equal to the specified object. More formally, returns the number of elements e in the collection such that (o == null ? e == null : o.equals(e)).

If you need to count the occurrences for all elements, use a Map and loop cleverly :) Or put your list in a Set and loop on each element of the set with the frequency method above. HTH

EDIT / Java 8: If you fancy a more functional, Java 8 one-liner solution with lambdas, try:

Map<String, Long> occurrences = 
  list.stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()));


回答2:

In statistics, this is called the "mode". A vanilla Java 8 solution looks like this:

Stream.of("test","test","hello","test")
      .collect(Collectors.groupingBy(s -> s, Collectors.counting()))
      .entrySet()
      .stream()
      .max(Comparator.comparing(Entry::getValue))
      .ifPresent(System.out::println);

Which yields:

test=3

jOOλ is a library that supports mode() on streams. The following program:

System.out.println(
    Seq.of("test","test","hello","test")
       .mode()
);

Yields:

Optional[test]

(disclaimer: I work for the company behind jOOλ)



回答3:

As per question, Specifically just to get word, not the number of times (i.e. value of key).

String mostRepeatedWord 
    = list.stream()
          .collect(Collectors.groupingBy(w -> w, Collectors.counting()))
          .entrySet()
          .stream()
          .max(Comparator.comparing(Entry::getValue))
          .get()
          .getKey();


回答4:

You can make a HashMap<String,Integer>. If the String already appears in the map, increment its key by one, otherwise, add it to the map.

For example:

put("someValue", 1);

Then, assume it's "someValue" again, you can do:

put("someValue", get("someValue") + 1);

Since the key of "someValue" is 1, now when you put it, the key will be 2.

After that you can easily go through the map and extract the key that has the highest value.

I didn't write a full solution, try to construct one, if you have problems post it in another question. Best practice is to learn by yourself.



回答5:

I think the best way to do it is using maps containing counts.

Map<String, Integer> stringsCount = new HashMap<>();

And iterate over your array filling this map:

for(String s: list)
{
  Integer c = stringsCount.get(s);
  if(c == null) c = new Integer(0);
  c++;
  stringsCount.put(s,c);
}

Finally, you can get the most repeated element iterating over the map:

Map.Entry<String,Integer> mostRepeated = null;
for(Map.Entry<String, Integer> e: stringsCount.entrySet())
{
    if(mostRepeated == null || mostRepeated.getValue()<e.getValue())
        mostRepeated = e;
}

And show the most common string:

if(mostRepeated != null)
        System.out.println("Most common string: " + mostRepeated.getKey());


回答6:

You could use a HashMap<String,Integer>. Looping through the array, you can check for each String if it is not already a Key of your HashMap, add it and set the value to 1, if it is, increase its value by 1.

Then you have a HashMap with all unique Strings and an associated number stating their amount in the array.



回答7:

If somebody need to find most popular from usual String[] array (using Lists):

public String findPopular (String[] array) {
    List<String> list = Arrays.asList(array);
    Map<String, Integer> stringsCount = new HashMap<String, Integer>();
    for(String string: list)
    {
        if (string.length() > 0) {
            string = string.toLowerCase();
            Integer count = stringsCount.get(string);
            if(count == null) count = new Integer(0);
            count++;
            stringsCount.put(string,count);
        }
    }
    Map.Entry<String,Integer> mostRepeated = null;
    for(Map.Entry<String, Integer> e: stringsCount.entrySet())
    {
        if(mostRepeated == null || mostRepeated.getValue()<e.getValue())
            mostRepeated = e;
    }
    try {
        return mostRepeated.getKey();
    } catch (NullPointerException e) {
        System.out.println("Cannot find most popular value at the List. Maybe all strings are empty");
        return "";
    }

}
  • case non-sensitive


回答8:

i know this takes more time to implement but you can use heap data structure by storing in the nodes the count and the string information



回答9:

You can use Guava's Multiset:

ArrayList<String> names = ...

// count names 
HashMultiset<String> namesCounts = HashMultiset.create(names);
Set<Multiset.Entry<String>> namesAndCounts = namesCounts.entrySet();

// find one most common
Multiset.Entry<String> maxNameByCount = Collections.max(namesAndCounts, Comparator.comparing(Multiset.Entry::getCount));

// pick all with the same number of occurrences
List<String> mostCommonNames = new ArrayList<>();
for (Multiset.Entry<String> nameAndCount : namesAndCounts) {
    if (nameAndCount.getCount() == maxNameByCount.getCount()) {
        mostCommonNames.add(nameAndCount.getElement());
    }
}


回答10:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class StringChecker {

public static void main(String[] args) {
ArrayList<String> string;
string = new ArrayList<>(Arrays.asList("Mah", "Bob", "mah", "bat", "MAh", "BOb"));
Map<String, Integer> wordMap = new HashMap<String, Integer>();

for (String st : string) {
    String input = st.toUpperCase();
    if (wordMap.get(input) != null) {
        Integer count = wordMap.get(input) + 1;
        wordMap.put(input, count);
    } else {
        wordMap.put(input, 1);
    }
}
System.out.println(wordMap);
Object maxEntry = Collections.max(wordMap.entrySet(), Map.Entry.comparingByValue()).getKey();
System.out.println("maxEntry = " + maxEntry);

}



回答11:

With this method, if there is more than one most common elements in your ArrayList, you get back all of them by adding them to a new ArrayList.

public static void main(String[] args) {

 List <String> words = new ArrayList<>() ; 

words.add("cat") ; 
words.add("dog") ; 
words.add("egg") ; 
words.add("chair") ; 
words.add("chair") ; 
words.add("chair") ; 
words.add("dog") ; 
words.add("dog") ;  

Map<String,Integer> count = new HashMap<>() ; 

    for (String word : words) {  /* Counts the quantity of each 
                                      element */
        if (! count.containsKey(word)) {             
            count.put(word, 1 ) ; 
        }

        else {
            int value = count.get(word) ; 
            value++ ; 

            count.put(word, value) ;
        }       
    }

    List <String> mostCommons = new ArrayList<>() ; /* Max elements  */

    for ( Map.Entry<String,Integer> e : count.entrySet() ) {

        if (e.getValue() == Collections.max(count.values() )){
                            /* The max value of count  */

            mostCommons.add(e.getKey()) ;
        }   
    }

    System.out.println(mostCommons);

 }

}


回答12:

There are a lot of answers suggesting HashMaps. I really don't like them, because you have to iterate through them once again anyway. Rather, I would sort the List

Collections.sort(list);

and then loop through it. Something similar to

String prev = null, mostCommon=null;
int num = 0, max = 0;
for (String str:list) {
  if (str.equals(prev)) {
    num++;
  } else {
    if (num>max) {
      max = num;
      mostCommon = str;
    }
    num = 1;
    prev = str;
  }
}

should do it.