I need my Java program to take a string like:
"This is a sample sentence."
and turn it into a string array like:
{"this","is","a","sample","sentence"}
No periods or other punctuation (preferably). By the way, the string input is always one sentence.
Is there an easy way to do this that I'm not seeing? Or do we really have to search for spaces a lot and create new strings from the areas between the spaces (which are words)?
string.replaceAll() doesn't work correctly with a locale different from the predefined one, at least in jdk7u10.
This example creates a word dictionary from a text file in the Windows Cyrillic charset CP1251.
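A minimal sketch of that idea, not necessarily the exact code this answer had in mind: it decodes the input with an explicit charset and splits on `\P{L}+` (runs of anything that is not a Unicode letter) instead of the ASCII-only `\W`. The class name `WordDictionary` and the method `readWords` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class WordDictionary {
    // Matches runs of characters that are NOT Unicode letters,
    // so Cyrillic (and other non-ASCII) words stay intact.
    private static final Pattern NON_LETTERS = Pattern.compile("\\P{L}+");

    // Hypothetical helper: collects the distinct lower-cased words of a stream.
    public static Set<String> readWords(InputStream in, Charset charset) {
        Set<String> words = new TreeSet<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String token : NON_LETTERS.split(line)) {
                    // split() can yield an empty first token when the
                    // line starts with punctuation or whitespace.
                    if (!token.isEmpty()) {
                        words.add(token.toLowerCase(Locale.ROOT));
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }
}
```

For a CP1251 file you would pass a `FileInputStream` for your file together with `Charset.forName("windows-1251")`, assuming your JDK ships that charset.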
I already posted this answer somewhere else, but I will post it here again. This version doesn't use any major built-in method. You get a char array and convert it into a String. Hope it helps!
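A rough sketch of what such a hand-rolled version could look like (this is an illustration, not the original code): walk over the characters, collect letters and digits into a buffer, and emit a word whenever something else shows up. The names `ManualSplitter` and `toWords` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class ManualSplitter {
    // Hypothetical helper: splits a sentence into lower-cased words
    // without using split() or regular expressions.
    public static String[] toWords(String sentence) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : sentence.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                // Still inside a word: keep collecting characters.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Space or punctuation ends the current word.
                words.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush the last word if the sentence did not end with punctuation.
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words.toArray(new String[0]);
    }
}
```

Note that this treats an apostrophe as a separator, so "don't" comes out as "don" and "t".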
You can split your string that way using this regular expression:
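One regular expression that does this is `\W+` (one or more non-word characters); whether it is the one this answer meant is an assumption. A minimal sketch, with the made-up names `RegexSplit` and `toWords`:

```java
import java.util.Locale;

public class RegexSplit {
    // Hypothetical helper: lower-cases the sentence and splits it on
    // runs of non-word characters (anything outside [a-zA-Z_0-9]).
    public static String[] toWords(String sentence) {
        // Drop leading separators so split() does not return an
        // empty first element.
        String cleaned = sentence.toLowerCase(Locale.ROOT).replaceAll("^\\W+", "");
        if (cleaned.isEmpty()) {
            return new String[0];
        }
        // Trailing empty strings are removed by split() itself.
        return cleaned.split("\\W+");
    }
}
```

Keep in mind that `\W` only knows ASCII letters by default, so for non-English text `\P{L}+` is probably the safer pattern.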
Use
string.replace(".", "").replace(",", "").replace("?", "").replace("!", "").split(" ")
to split your string into an array with no periods, commas, question marks, or exclamation marks. You can add or remove as many replace calls as you want.
You can also use BreakIterator.getWordInstance. Try this:
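A minimal sketch of the `BreakIterator` approach, assuming English text; the class name `BreakIteratorWords` and the method `toWords` are made up for illustration. The word iterator also reports spaces and punctuation as separate segments, so the loop keeps only segments that start with a letter or digit.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorWords {
    // Hypothetical helper: returns the lower-cased words of a text.
    public static List<String> toWords(String text) {
        List<String> words = new ArrayList<>();
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        // Walk over consecutive boundaries; each [start, end) is one segment.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            // Skip segments that are whitespace or punctuation.
            if (Character.isLetterOrDigit(segment.codePointAt(0))) {
                words.add(segment.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }
}
```

If you need an array, call `toWords(sentence).toArray(new String[0])` on the result.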