Java text analysis libraries

Posted 2019-02-02 12:08

Question:

I'm looking for a Java-based solution for analysing sentences and logging whether a keyword was used positively or negatively.

For example, the keyword might be 'cabbages' and the sentence:

'I like cabbages but not peas'

I'd like a Java text analyser of some kind to log this as positive. Can the Lucene (Hibernate Search) libraries be used for this?

Any thoughts?

Answer 1:

You're looking for "sentiment analysis". One possibility is LingPipe, whose authors kindly link to their competitors as well. Jeff Dalton also has a great list of natural language processing tools on his blog.



Answer 2:

I doubt there's anything like that. Lucene definitely can't do it out of the box.

How do you even define "whether a keyword was used positively or negatively" in a way that can be evaluated programmatically? To do it properly, you'd have to analyse the text for its actual meaning, which is an AI problem that is not even remotely solved.

I suppose you could solve it approximately by just doing a statistical analysis of whether the keyword appears more often close to positive (like, good, great, wonderful) or negative (bad, hate, crappy, damn) keywords, but even there, negations, sarcasm and complex sentence structures will be problematic.
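To make the approximate approach concrete, here is a minimal sketch in plain Java. It is not taken from any library: the class name KeywordSentiment, the score method, the window parameter and the tiny word lists are all assumptions made up for illustration, and a real lexicon would be far larger. It simply counts positive minus negative words within a few tokens of each occurrence of the keyword.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene or any other library.
class KeywordSentiment {
    // Illustrative word lists taken from the examples above; deliberately tiny.
    private static final Set<String> POSITIVE =
            new HashSet<>(Arrays.asList("like", "good", "great", "wonderful"));
    private static final Set<String> NEGATIVE =
            new HashSet<>(Arrays.asList("bad", "hate", "crappy", "damn"));

    /**
     * Returns (positive hits - negative hits) among the tokens that lie
     * within `window` positions of each occurrence of `keyword`.
     * A result above zero suggests positive use, below zero negative use.
     */
    static int score(String sentence, String keyword, int window) {
        // Crude tokenizer: lower-case, then split on anything that is not a letter or apostrophe.
        String[] tokens = sentence.toLowerCase(Locale.ROOT).split("[^a-z']+");
        String target = keyword.toLowerCase(Locale.ROOT);
        int score = 0;
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(target)) {
                continue;
            }
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.length - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (POSITIVE.contains(tokens[j])) {
                    score++;
                }
                if (NEGATIVE.contains(tokens[j])) {
                    score--;
                }
            }
        }
        return score;
    }
}
```

With a window of 2, KeywordSentiment.score("I like cabbages but not peas", "cabbages", 2) comes out positive because "like" sits next to the keyword. The weaknesses show up immediately, though: the same call with "peas" as the keyword returns 0 because "not" is ignored, and "I do not like cabbages" still scores as positive for "cabbages".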



Answer 3:

Take a look at Mahout Taste, which builds on Lucene but adds a lot of what you need out of the box. (edit) I should add, Mahout Taste is merely related to what you're looking for and not a 100% match.