I have a String that I have to parse for different keywords. For example, I have the String:
"I will come and meet you at the 123woods"
And my keywords are
'123woods' 'woods'
I should report whenever I have a match and where. Multiple occurrences should also be accounted for. However, for this one, I should get a match only on 123woods, not on woods. This rules out the String.contains() method. Also, I should be able to have a list/set of keywords and check for all of them at the same time. In this example, if I have '123woods' and 'come', I should get two occurrences. Method execution should be reasonably fast on large texts.
My idea is to use StringTokenizer but I am unsure if it will perform well. Any suggestions?
Looking back at the original question, we need to find some given keywords in a given sentence, count the number of occurrences, and report something about where they occur. I don't quite understand what "where" means (is it an index in the sentence?), so I'll pass on that one... I'm still learning Java, one step at a time, so I'll see to that one in due time :-)
Note that common sentences (like the one in the original question) can contain repeated keywords, so the search cannot just ask whether a given keyword "exists or not" and count it as 1 if it does. There can be more than one of the same. For example:
"Say that 123 of us will come by and meet you, say, at the woods of 123woods."
By looking at it, the expected result would be 5 for "Say" + "come" + "you" + "say" + "123woods", counting "say" twice if we go lowercase. If we don't, then the count should be 4, "Say" being excluded and "say" included. Fine. My suggestion is:
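A minimal sketch of such a count, assuming the keywords are say, come, you and 123woods (inferred from the expected result above) and that words are separated by runs of non-word characters. The class and method names are illustrative only:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class KeywordCounter {

    // Returns every word of the sentence that equals one of the keywords,
    // in order of appearance. With ignoreCase, the keywords are assumed
    // to be given in lowercase.
    static List<String> findKeywords(String sentence, Set<String> keywords, boolean ignoreCase) {
        List<String> found = new ArrayList<>();
        // Split on runs of non-word characters so "you," and "say," lose their commas.
        for (String word : sentence.split("\\W+")) {
            if (word.isEmpty()) {
                continue; // a leading separator yields one empty token
            }
            String candidate = ignoreCase ? word.toLowerCase(Locale.ROOT) : word;
            if (keywords.contains(candidate)) {
                found.add(word);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String sentence = "Say that 123 of us will come by and meet you, say, at the woods of 123woods.";
        // Assumed keyword set, not taken verbatim from the question.
        Set<String> keywords = new HashSet<>(Arrays.asList("say", "come", "you", "123woods"));

        List<String> found = findKeywords(sentence, keywords, true);
        for (String word : found) {
            System.out.println("Found: " + word);
        }
        System.out.println("In sentence: " + sentence);
        System.out.println("Count: " + found.size());
    }
}
```

Because whole words are compared with a set lookup, "woods" never matches inside "123woods", and each keyword check takes roughly constant time regardless of how many keywords there are.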
And the results are:
Found: Say
Found: come
Found: you
Found: say
Found: 123woods
In sentence: Say that 123 of us will come by and meet you, say, at the woods of 123woods.
Count: 5
To match "123woods" instead of "woods", use atomic grouping in the regular expression. One thing to note: when the string is searched for "123woods" alone, a single search matches the first "123woods" and stops instead of searching the rest of the string.
It searches for 123woods first; once that has matched, the search ends.
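A sketch of what this could look like, assuming the intended pattern is the atomic group `(?>123woods|woods)`; the pattern and the class name are guesses, not taken from the answer. It also shows that a single `find()` returns only the first match, while calling `find()` in a loop continues through the rest of the string:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AtomicGroupSearch {

    // Assumed pattern: an atomic group that tries "123woods" before "woods".
    static final Pattern KEYWORD = Pattern.compile("(?>123woods|woods)");

    // Returns the first match only, or null when nothing matches.
    static String firstMatch(String text) {
        Matcher m = KEYWORD.matcher(text);
        return m.find() ? m.group() : null;
    }

    // Counts all matches by calling find() repeatedly;
    // each call resumes after the end of the previous match.
    static int countMatches(String text) {
        Matcher m = KEYWORD.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the 123woods";
        System.out.println(firstMatch(text));
        System.out.println(countMatches(text));
    }
}
```

Since the regex engine scans left to right, "123woods" is found before the "woods" inside it, and the next search starts after that match. This pattern would still match a standalone "woods" or "woods" inside other words, so word boundaries may be needed as well.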
You can also use regex matching with \b (the word-boundary anchor) so that only whole words match.
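As an illustration, the keywords can be combined into one alternation wrapped in `\b`, so all of them are checked in a single pass and each hit comes with its start index. This is a sketch under the assumption that the keyword list is non-empty and that every keyword starts and ends with a word character; `WholeWordMatcher` and `findAll` are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeWordMatcher {

    // Returns one "keyword@startIndex" entry per whole-word match, in text order.
    // Assumes a non-empty keyword list.
    static List<String> findAll(String text, List<String> keywords) {
        // Quote each keyword so regex metacharacters are treated literally.
        StringBuilder alternation = new StringBuilder();
        for (String keyword : keywords) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(keyword));
        }
        // \b on both sides rejects matches that are only part of a longer word.
        Pattern pattern = Pattern.compile("\\b(?:" + alternation + ")\\b");

        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group() + "@" + matcher.start());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the 123woods";
        System.out.println(findAll(text, List.of("123woods", "woods")));
        System.out.println(findAll(text, List.of("123woods", "come")));
    }
}
```

Compiling the pattern once and reusing it keeps repeated searches over large texts cheap.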
How about something like
Arrays.asList(String.split(" ")).contains("xx")
? See String.split() and "How can I test if an array contains a certain value".
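Spelled out as a small sketch: `split` is an instance method, so it is called on the text itself. The helper name `containsWord` is made up for illustration, and the approach assumes words are separated by single spaces with no attached punctuation:

```java
import java.util.Arrays;
import java.util.List;

class SplitContains {

    // True when keyword appears as a whole space-separated word in text.
    static boolean containsWord(String text, String keyword) {
        List<String> words = Arrays.asList(text.split(" "));
        return words.contains(keyword);
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the 123woods";
        System.out.println(containsWord(text, "123woods"));
        System.out.println(containsWord(text, "woods"));
    }
}
```

This only answers whether a keyword occurs at all; it does not count occurrences or say where they are.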
A much simpler way to do this is to use split():
This is a simpler, less elegant way to do the same thing without using tokens, etc.
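One way such a split()-based search might look, as a sketch rather than the code originally posted: it splits on whitespace and records the zero-based word position of every exact match. The names are hypothetical, and the text is assumed not to start with whitespace or to have punctuation attached to words:

```java
import java.util.ArrayList;
import java.util.List;

class SplitSearch {

    // Returns the zero-based word positions at which keyword occurs as a whole word.
    static List<Integer> wordPositions(String text, String keyword) {
        List<Integer> positions = new ArrayList<>();
        String[] words = text.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (words[i].equals(keyword)) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the 123woods";
        System.out.println(wordPositions(text, "123woods"));
        System.out.println(wordPositions(text, "woods"));
    }
}
```

The size of the returned list gives the number of occurrences, and an empty list means no match.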