I need to use WordNet in a Java-based app. I want to:
- search synsets
- find similarity/relatedness between synsets
My app uses RDF graphs and I know there are SPARQL endpoints that serve WordNet, but I guess it's better to keep a local copy of the dataset, as it's not too big.
I've found the following jars:
- General library - JAWS http://lyle.smu.edu/~tspell/jaws/index.html
- General library - JWNL http://sourceforge.net/projects/jwordnet
- Similarity library (Perl) - WordNet::Similarity http://wn-similarity.sourceforge.net/
- Java version of WordNet::Similarity http://www.cogs.susx.ac.uk/users/drh21/ (beta)
What would you recommend for my app?
Is it possible to use a Perl library from a Java app via some bindings?
Thanks! Mulone
I am not sure whether either JAWS or JWNL provides methods to calculate similarity between synsets, but I have tried both for searching synsets and found JAWS easier to use. Specifically, JAWS's simple setup was easier for me to understand than JWNL's file_properties.xml requirement.
JAWS has a method for finding similar synsets, defined on AdjectiveSynset:
public AdjectiveSynset[] getSimilar() throws WordNetException
The Javadoc page has the details: http://lyle.smu.edu/~tspell/jaws/doc/edu/smu/tspell/wordnet/AdjectiveSynset.html
I use JAWS for normal WordNet stuff because it's easy to use. For similarity metrics, though, I use the library located here. You'll also need to download this folder, which contains pre-processed WordNet and corpus data, for it to work. The code can be used like this, assuming you placed that folder inside another folder called "lib" in your project folder:
This will print something like the following, showing the similarity score between each possible combination of synsets represented by the words to be compared:
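To make the "each possible combination of synsets" idea concrete, here is a library-free sketch of the all-pairs loop. It is not the similarity library's own code: the class name, the method scoreAllPairs, the "word#pos#sense" key format and the toy measure are all made up for illustration, and a real metric would replace the placeholder lambda.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.ToDoubleBiFunction;

class SensePairScores {

    // Scores every (sense of word1, sense of word2) pair with the supplied measure.
    // The "word#pos#sense,word#pos#sense" key is a hypothetical label format.
    static TreeMap<String, Double> scoreAllPairs(String word1, int senses1,
                                                 String word2, int senses2,
                                                 String pos,
                                                 ToDoubleBiFunction<Integer, Integer> measure) {
        TreeMap<String, Double> scores = new TreeMap<>();
        for (int i = 1; i <= senses1; i++) {
            for (int j = 1; j <= senses2; j++) {
                String key = word1 + "#" + pos + "#" + i + "," + word2 + "#" + pos + "#" + j;
                scores.put(key, measure.applyAsDouble(i, j));
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Placeholder measure: NOT a real WordNet metric, just something deterministic.
        ToDoubleBiFunction<Integer, Integer> toy = (i, j) -> 1.0 / (i + j);
        // Sense counts (2 and 3) are assumed here; a real program would ask WordNet.
        Map<String, Double> scores = scoreAllPairs("apple", 2, "banana", 3, "n", toy);
        scores.forEach((pair, score) -> System.out.println(pair + "\t" + score));
    }
}
```

With 2 senses for the first word and 3 for the second, the map holds 6 entries, one per sense pair, sorted by key.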
There are also methods that let you specify which sense of either or both words to use, for example:
res(String word1, int senseNum1, String word2, partOfSpeech)
and so on. Unfortunately, the source documentation is not JavaDoc, so you'll need to inspect it manually. The source can be downloaded here. The available algorithms are:
Also, it requires you to have the jar file for MIT's JWI.
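On the Perl question: one way to use the Perl code from Java without any bindings is to run it as a subprocess and parse what it prints. Below is a minimal sketch, assuming Perl is on the PATH and that you have a small wrapper script of your own (called score.pl here, a hypothetical name) that takes two words and prints a line ending in the numeric score. The class and method names are also made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;

class PerlSimilarityBridge {

    // Builds the command line. The script path points at a hypothetical
    // wrapper around WordNet::Similarity that you would write yourself.
    static List<String> buildCommand(String scriptPath, String word1, String word2) {
        return List.of("perl", scriptPath, word1, word2);
    }

    // Takes the last whitespace-separated token of a line as the score,
    // so both "0.25" and "car#n#1 bus#n#1 0.25" parse to 0.25.
    static double parseScore(String line) {
        String[] tokens = line.trim().split("\\s+");
        return Double.parseDouble(tokens[tokens.length - 1]);
    }

    // Runs the script and returns the score found on its first output line.
    static double similarity(String scriptPath, String word1, String word2)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(scriptPath, word1, word2));
        pb.redirectError(ProcessBuilder.Redirect.INHERIT); // keep stderr out of the parsed output
        Process process = pb.start();

        String firstLine = null;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) { // drain stdout so the child never blocks
                if (firstLine == null) {
                    firstLine = line;
                }
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0 || firstLine == null) {
            throw new IOException("Perl script failed (exit code " + exitCode + ")");
        }
        return parseScore(firstLine);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: PerlSimilarityBridge <script.pl> <word1> <word2>");
            return;
        }
        System.out.println(similarity(args[0], args[1], args[2]));
    }
}
```

Starting a Perl interpreter per call is slow, so for many comparisons it would probably be better to have the script read word pairs from stdin and keep one process alive, or to stick with a pure Java library.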