Wordnet Lemmatizer for R

Posted 2020-06-30 05:02

Question:

I would like to use the WordNet lemmatizer to lemmatize the words in a:

> a<-c("He saw a see-saw on a sea shore", "she is feeling cold")
> a
[1] "He saw a see-saw on a sea shore" "she is feeling cold"  

I convert a into a corpus so that I can run pre-processing steps (stopword removal, lemmatization, etc.):

> a <- Corpus(VectorSource(a))

I wanted to do the lemmatization as follows:

> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
> terms <- getIndexTerms("NOUN", 1, filter)
> sapply(terms, getLemma)

but I get this error:

> filter <- getTermFilter("ExactMatchFilter", a, TRUE)
Error in .jnew(paste("com.nexagis.jawbone.filter", type, sep = "."), word,  : 
  java.lang.NoSuchMethodError: <init>

My goal is to lemmatize the whole corpus, not just a single word. How can this be accomplished?

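Answer 1:

Put your code in a loop. You can try something like this:

    lapply(a, function(x) {
        # Build an exact-match filter for the current element
        x.filter <- getTermFilter("ExactMatchFilter", x, TRUE)
        # Fetch at most one matching noun index term
        terms <- getIndexTerms("NOUN", 1, x.filter)
        # Extract the lemma of each term found
        sapply(terms, getLemma)
    })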
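Note that getTermFilter appears to expect a single character string, which is probably why passing the whole corpus raises the NoSuchMethodError. If looping over documents is not enough, a word-level variant might look like the sketch below. It assumes the wordnet package is installed and a WordNet dictionary has been configured. The helper names lookup_noun_lemma and lemmatize_texts are made up for illustration and are not part of the package. The fallback to the original word when nothing is found is also an assumption, and an exact-match lookup only returns dictionary entries, so it will not reduce inflected forms by itself.

```r
# Hypothetical helper: look up one word as a noun in WordNet and return
# its lemma. Falls back to the word itself when no index term is found.
lookup_noun_lemma <- function(word) {
  filter <- wordnet::getTermFilter("ExactMatchFilter", word, TRUE)
  terms <- wordnet::getIndexTerms("NOUN", 1, filter)
  if (length(terms) == 0) return(word)
  wordnet::getLemma(terms[[1]])
}

# Hypothetical helper: split each text on whitespace and apply `lookup`
# to every word, so the lookup always receives one string at a time.
lemmatize_texts <- function(texts, lookup = lookup_noun_lemma) {
  lapply(texts, function(txt) {
    words <- strsplit(tolower(txt), "[[:space:]]+")[[1]]
    words <- words[nzchar(words)]  # drop empty tokens from leading spaces
    vapply(words, lookup, character(1), USE.NAMES = FALSE)
  })
}

a <- c("He saw a see-saw on a sea shore", "she is feeling cold")
# With wordnet and its dictionary available:
# lemmatize_texts(a)
```

This works on the plain character vector, before the Corpus() call. For a tm corpus you would first pull the text out of each document, for example with as.character().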