I'm trying to take a string input, convert each word to lowercase, and print each word on its own line in sorted order, ignoring non-alphabetic characters (single-letter words count as well). So,
Sample input:
Adventures in Disneyland
Two blondes were going to Disneyland when they came to a fork in the
road. The sign read: "Disneyland Left."
So they went home.
Output:
a
adventures
blondes
came
disneyland
fork
going
home
in
left
read
road
sign
so
the
they
to
two
went
were
when
My program:
    Scanner reader = new Scanner(file);
    ArrayList<String> words = new ArrayList<String>();
    while (reader.hasNext()) {
        String word = reader.next();
        if (word != "") {
            word = word.toLowerCase();
            word = word.replaceAll("[^A-Za-z ]", "");
            if (!words.contains(word)) {
                words.add(word);
            }
        }
    }
    Collections.sort(words);
    for (int i = 0; i < words.size(); i++) {
        System.out.println(words.get(i));
    }
This works for the input above, but prints the wrong output for an input like this:
a t\|his@ is$ a)( -- test's-&*%$#-`case!@|?
The expected output should be
a
case
his
is
s
t
test
The output I get is
*a blank line is printed first*
a
is
testscase
this
So, my program obviously doesn't work since scanner.next() takes in characters until it hits a whitespace and considers that a string, whereas anything that is not a letter should be treated as a break between words. I'm not sure how I might be able to manipulate Scanner methods so that breaks are considered non-alphabetic characters as opposed to whitespace, so that's where I'm stuck right now.
The other answer has already mentioned some issues with your code.
I suggest another approach to address your requirements. Such transformations are a good use case for Java Streams, which often yield clean code.
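A minimal sketch of such a stream pipeline could look like the following. The class name `UniqueWords` and the method name `sortedUniqueWords` are made up for illustration. Two details are my own assumptions rather than requirements: the `filter` call, which drops the empty leading token that `split()` produces when the text starts with a non-letter, and `Locale.ROOT`, which keeps the lowercasing independent of the machine's default locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class UniqueWords {

    // Splits on runs of non-letters, lowercases, removes duplicates and sorts.
    static List<String> sortedUniqueWords(String text) {
        return Arrays.stream(text.split("[^A-Za-z]+"))
                // Assumption: skip the empty token split() returns for leading non-letters.
                .filter(token -> !token.isEmpty())
                // Assumption: Locale.ROOT instead of the default locale.
                .map(token -> token.toLowerCase(Locale.ROOT))
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String input = "a t\\|his@ is$ a)( -- test's-&*%$#-`case!@|?";
        // Prints: a, case, his, is, s, t, test (one per line)
        sortedUniqueWords(input).forEach(System.out::println);
    }
}
```

For the second sample input from the question this yields exactly the expected seven words.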
Here are the steps:
1. Split the string by one or more consecutive non-alphabetic characters. This yields tokens consisting solely of alphabetic characters.
2. Stream over the resulting array using Arrays.stream().
3. Map each element to its lowercase equivalent. The default locale is used; use toLowerCase(Locale) to explicitly set the locale.
4. Discard duplicates using Stream.distinct().
5. Sort the elements within the stream by simply calling sorted().
6. Collect the elements into a List with collect().

If you need to read it from a file, you could use this:
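One possible shape for the file-based variant, as a sketch only: `Files.lines` streams the file line by line, and each line is split into words with `flatMap`. The names `UniqueWordsFromFile`, `sortedUniqueWords` and the fallback file name `input.txt` are hypothetical. The word logic takes a `Stream<String>` of lines so it can be tried without touching the file system; the empty-token filter and `Locale.ROOT` are again my assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UniqueWordsFromFile {

    // Turns a stream of lines into a sorted list of distinct lowercase words.
    static List<String> sortedUniqueWords(Stream<String> lines) {
        return lines
                .flatMap(line -> Arrays.stream(line.split("[^A-Za-z]+")))
                .filter(token -> !token.isEmpty()) // assumption: drop empty tokens
                .map(token -> token.toLowerCase(Locale.ROOT))
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name; pass a real path as the first argument.
        Path file = Path.of(args.length > 0 ? args[0] : "input.txt");
        // Files.lines keeps the file open, so close it with try-with-resources.
        try (Stream<String> lines = Files.lines(file)) {
            sortedUniqueWords(lines).forEach(System.out::println);
        }
    }
}
```

`Path.of` needs Java 11 or later; on older versions `Paths.get` does the same job.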
But if you need to use a Scanner, then you could use something like this:
And then:
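As a hedged sketch of how the two parts might fit together: first tell the Scanner to treat every run of non-letters as a delimiter with `useDelimiter`, then push its tokens through the same kind of pipeline. `Scanner.tokens()` requires Java 9 or later. The names `UniqueWordsWithScanner` and `sortedUniqueWords` are invented for this example, and reading from `System.in` is only an assumption about where the input comes from.

```java
import java.util.List;
import java.util.Locale;
import java.util.Scanner;
import java.util.stream.Collectors;

class UniqueWordsWithScanner {

    static List<String> sortedUniqueWords(Scanner scanner) {
        // Anything that is not a letter now separates tokens, not just whitespace.
        scanner.useDelimiter("[^A-Za-z]+");
        return scanner.tokens() // Java 9+
                .filter(token -> !token.isEmpty()) // defensive; assumption
                .map(token -> token.toLowerCase(Locale.ROOT))
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumption: input arrives on standard input; a File works the same way.
        try (Scanner scanner = new Scanner(System.in)) {
            sortedUniqueWords(scanner).forEach(System.out::println);
        }
    }
}
```

This addresses the point from the question directly: the delimiter, not `next()` itself, decides where one word ends and the next begins.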
Don't use == or != for comparing String(s). Also, perform your transform before you check for empty. This should look something like:
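A sketch of the loop with both points applied, keeping the structure of the program in the question. The class and method names (`FixedLoop`, `sortedUniqueWords`) are placeholders, and `Locale.ROOT` is my own choice rather than part of the advice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class FixedLoop {

    static List<String> sortedUniqueWords(Scanner reader) {
        List<String> words = new ArrayList<>();
        while (reader.hasNext()) {
            // Transform first: lowercase, then strip everything that is not a letter.
            String word = reader.next().toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
            // Then check for emptiness with isEmpty() instead of == or !=.
            if (!word.isEmpty() && !words.contains(word)) {
                words.add(word);
            }
        }
        Collections.sort(words);
        return words;
    }

    public static void main(String[] args) {
        try (Scanner reader = new Scanner(System.in)) {
            sortedUniqueWords(reader).forEach(System.out::println);
        }
    }
}
```

This removes the blank first line, because a token such as `--` becomes empty after the transform and is then skipped. It still splits on whitespace only, so `test's-&*%$#-`case` comes out as `testscase`; splitting on non-letters needs a different delimiter.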