Word Count no duplicates

Posted 2019-09-21 20:59

Here is my word count program in Java. I need to rework it so that something, something; something? something! and something count as one word. That is, it should not count the same word twice, regardless of case and punctuation.

import java.util.Scanner;
public class WordCount1
{
    public static void main(String[]args)
    {
        final int Lines=6;
        Scanner in=new Scanner (System.in);
        String paragraph = "";
        System.out.println( "Please input "+ Lines + " lines of text.");
        for (int i=0; i < Lines; i+=1)
        {
            paragraph=paragraph+" "+in.nextLine();
        }
        System.out.println(paragraph);
        String word="";
        int WordCount=0;
        for (int i=0; i<paragraph.length()-1; i+=1)
        {
            if (paragraph.charAt(i) != ' ' || paragraph.charAt(i) !=',' || paragraph.charAt(i)    !=';' || paragraph.charAt(i) !=':' )
            {
                word= word + paragraph.charAt(i);
                if(paragraph.charAt(i+1)==' ' || paragraph.charAt(i) ==','|| paragraph.charAt(i) ==';' || paragraph.charAt(i) ==':')
                {
                    WordCount +=1;
                    word="";
                }
            }
        }
        System.out.println("There are "+WordCount +" words ");
    }
}

Tags: java count word
8 answers
Root(大扎)
#2 · 2019-09-21 21:11

Here you go. This works; just read the comments and you should be able to follow along.

import java.util.Arrays;
import java.util.HashSet;
import javax.swing.JOptionPane;

// Program Counts Words In A Sentence. Duplicates Are Not Counted.
public class WordCount
{
    public static void main(String[]args)
    {
        // Initialize Variables
        String sentence = "";
        int wordCount = 1, startingPoint = 0;


        // Prompt User For Sentence
        sentence = JOptionPane.showInputDialog(null, "Please input a sentence.", "Input Information Below", 2);


        // Remove Punctuation. To Strip More Punctuation Marks, Just Add Another Replace Call.
        sentence = sentence.replace(",", "").replace(".", "").replace("?", "")
                           .replace("!", "").replace(";", "").replace(":", "");


        // Convert All Characters To Lowercase - Must Be Done To Compare Upper And Lower Case Words.
        sentence = sentence.toLowerCase();


        // Count The Number Of Words
        for (int i = 0; i < sentence.length(); i++)
            if (sentence.charAt(i) == ' ')
                wordCount++;


        // Initialize Array And A Count That Will Be Used As An Index
        String[] words = new String[wordCount];
        int count = 0;


        // Put Each Word In An Array
        for (int i = 0; i < sentence.length(); i++)
        {
            if (sentence.charAt(i) == ' ')
            {
                words[count] = sentence.substring(startingPoint,i);
                startingPoint = i + 1;
                count++;
            }
        }


        // Put Last Word In Sentence In Array
        words[wordCount - 1] = sentence.substring(startingPoint, sentence.length());


        // Put Array Elements Into A Set. This Will Remove Duplicates
        HashSet<String> wordsInSet = new HashSet<String>(Arrays.asList(words));


        // Format Words In Hash Set To Remove Brackets, And Commas, And Convert To String
        String wordsString = wordsInSet.toString().replace(",", "").replace("[", "").replace("]", "");


        // Print Out Non-Duplicate Words In Set And Word Count
        JOptionPane.showMessageDialog(null, "Words In Sentence:\n" + wordsString + " \n\n" +
                                                "Word Count: " + wordsInSet.size(), "Sentence Information", 2);
    }
}
不美不萌又怎样
#3 · 2019-09-21 21:12
  1. remove all punctuation
  2. convert all strings to lowercase OR uppercase
  3. put those strings in a set
  4. get the size of the set
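
The four steps above can be sketched as follows. This is only a minimal illustration, not the answerer's own code: the class name `DistinctWordCount` and the method `countDistinct` are made up for the example, and it assumes that "punctuation" means the ASCII punctuation matched by `\p{Punct}` and that words are separated by whitespace.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, named for this example only.
class DistinctWordCount {

    // Counts distinct words, ignoring case and ASCII punctuation.
    static int countDistinct(String text) {
        // Step 1: remove all punctuation (ASCII punctuation only, an assumption).
        String cleaned = text.replaceAll("\\p{Punct}", "");

        // Step 2: convert everything to lowercase.
        cleaned = cleaned.toLowerCase(Locale.ROOT);

        // Step 3: put the words in a set; a set silently ignores duplicates.
        Set<String> words = new HashSet<>();
        for (String word : cleaned.trim().split("\\s+")) {
            if (!word.isEmpty()) {   // splitting an empty string yields one empty token
                words.add(word);
            }
        }

        // Step 4: the size of the set is the number of distinct words.
        return words.size();
    }

    public static void main(String[] args) {
        String sample = "something, something; something? Something! and something";
        System.out.println("Distinct words: " + countDistinct(sample));
    }
}
```

Note that removing punctuation outright turns "don't" into "dont"; whether that is acceptable depends on what the assignment counts as a word.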
疯言疯语
#4 · 2019-09-21 21:14

Since this is homework, here are some hints and advice.

  • There is a clever little method called String.split that splits a string into parts, using a separator specified as a regular expression. If you use it the right way, this will give you a one line solution to the "word count" problem. (If you've been told not to use split, you can ignore that ... though it is the simple solution that a seasoned Java developer would consider first.)

  • Format / indent your code properly ... before you show it to other people. If your instructor doesn't deduct marks for this, he / she isn't doing his / her job properly.

  • Use standard Java naming conventions. The capitalization of Lines is incorrect. It could be LINES for a manifest constant or lines for a variable, but a mixed-case name starting with a capital letter should always be a class name.

  • Be consistent in your use of white space characters around operators (including the assignment operator).

  • It is a bad idea (and completely unnecessary) to hard-wire the number of lines of input that the user must supply. And you are not dealing with the case where he / she supplies fewer than 6 lines.
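
For readers who are past the homework stage, here is a rough sketch of how the String.split hint and the "don't hard-wire the line count" advice might fit together. It is one possible reading of the hints, not the answerer's solution: the class `SplitWordCount`, its method names, and the choice of "any run of non-letters" as the separator are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical class, named for this example only.
class SplitWordCount {

    // Assumed separator: any run of characters that are not letters.
    private static final String SEPARATOR = "[^\\p{L}]+";

    // Total number of words. Empty tokens are filtered out because split
    // produces one when the text starts with a separator or is empty.
    static long countWords(String text) {
        return Arrays.stream(text.split(SEPARATOR))
                .filter(w -> !w.isEmpty())
                .count();
    }

    // Number of distinct words, ignoring case.
    static long countDistinctWords(String text) {
        return Arrays.stream(text.split(SEPARATOR))
                .filter(w -> !w.isEmpty())
                .map(w -> w.toLowerCase(Locale.ROOT))
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        // Read until end of input instead of demanding a fixed number of lines.
        Scanner in = new Scanner(System.in);
        StringBuilder paragraph = new StringBuilder();
        while (in.hasNextLine()) {
            paragraph.append(in.nextLine()).append(' ');
        }
        String text = paragraph.toString();
        System.out.println("Words: " + countWords(text));
        System.out.println("Distinct words: " + countDistinctWords(text));
    }
}
```

Because punctuation is treated as a separator rather than being deleted, "don't" is split into "don" and "t" here; a different regular expression would be needed if apostrophes should stay inside words.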

Ridiculous、
#5 · 2019-09-21 21:18
  1. You'll need to strip out the punctuation; here's one approach: Translating strings character by character

  2. The above can also be used to normalize the case, although there are probably other utilities for doing so.

  3. Now all of the variations you describe will be converted to the same string, and thus be recognized as such. As pretty much everyone else has suggested, a set would be a good tool for counting the number of distinct words.
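
A character-by-character translation along these lines might look like the sketch below. The linked approach is not reproduced on this page, so this is only an assumed interpretation: letters and digits are lower-cased, whitespace is kept as a plain space, and every other character is dropped. The names `CharByCharNormalizer`, `normalize`, and `countDistinct` are invented for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class, named for this example only.
class CharByCharNormalizer {

    // Translates the text one character at a time:
    // letters and digits are lower-cased, whitespace becomes a space,
    // and anything else (punctuation, symbols) is dropped.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                out.append(Character.toLowerCase(c));
            } else if (Character.isWhitespace(c)) {
                out.append(' ');
            }
        }
        return out.toString();
    }

    // After normalization, every variant of a word is the same string,
    // so a set can count the distinct ones.
    static int countDistinct(String text) {
        Set<String> distinct = new HashSet<>();
        for (String word : normalize(text).split(" ")) {
            if (!word.isEmpty()) {   // consecutive spaces produce empty tokens
                distinct.add(word);
            }
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        System.out.println(normalize("Word? word!"));
        System.out.println(countDistinct("something, something; something? something! and something"));
    }
}
```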

Fickle 薄情
#6 · 2019-09-21 21:18

As you parse your input string, store it word by word in a map data structure. Just ensure that "word", "word?" and "word!" are all stored under the key "word" in the map, and increment the word's count whenever you have to add to the map.
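
A minimal sketch of the map idea, under the assumption that tokens are separated by whitespace and that stripping ASCII punctuation plus lower-casing is enough to produce the key. The class `WordFrequency` and the method `countWords` are hypothetical names chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical class, named for this example only.
class WordFrequency {

    // Maps each normalized word to the number of times it occurs.
    static Map<String, Integer> countWords(String text) {
        // LinkedHashMap keeps the words in first-seen order when printed.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.split("\\s+")) {
            // "word", "word?" and "Word!" all normalize to the key "word".
            String key = token.replaceAll("\\p{Punct}", "").toLowerCase(Locale.ROOT);
            if (key.isEmpty()) {
                continue;   // the token was empty or punctuation only
            }
            // Insert 1 for a new key, otherwise add 1 to the existing count.
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("word word? Word! other");
        System.out.println(counts);
        System.out.println("Distinct words: " + counts.size());
    }
}
```

The number of keys in the map is the distinct word count the question asks for; the per-word counts are extra information a set would not give you.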

Ridiculous、
#7 · 2019-09-21 21:26

Your real problem is that you want a distinct word count, so you should either keep track of which words you have already encountered or delete them from the text entirely.

Let's say you choose the first option and store the words you have already encountered in a List; you can then check against that list whether you have already seen a word.

// Fragment: needs java.util.ArrayList and java.util.List
List<String> encounteredWords = new ArrayList<String>();
// ... continue here once you have extracted the current word
if (!encounteredWords.contains(word.toLowerCase())) {
    encounteredWords.add(word.toLowerCase());
    wordCount++;
}

Antimony, however, made an interesting remark as well: he uses the property of a Set to get the distinct word count. By definition a set can never contain duplicates, so if you add the same word again, the set won't grow in size.

// Fragment: needs java.util.HashSet and java.util.Set
Set<String> wordSet = new HashSet<String>();
// ... continue here once you have extracted the current word
wordSet.add(word.toLowerCase());
// ... continue here once you have scanned through all the words
return wordSet.size();