Installing coreNLP in R

Posted 2019-01-28 12:25

Question:

I'm following the instructions at https://github.com/statsmaths/coreNLP to use coreNLP.

However, loading the package fails with this error:

> library(coreNLP)

Error in get(method, envir = home) : 
lazy-load database '/Users/apple/Library/R/3.2/library/coreNLP/R/coreNLP.rdb' is corrupt
In addition: Warning messages:
 1: In .registerS3method(fin[i, 1], fin[i, 2], fin[i, 3], fin[i, 4],  :
 restarting interrupted promise evaluation
 2: In get(method, envir = home) :
 restarting interrupted promise evaluation
 3: In get(method, envir = home) : internal error -3 in R_decompress1
 Error: package or namespace load failed for ‘coreNLP’

Answer 1:

If you then run into the error java.lang.UnsupportedClassVersionError: edu/stanford/nlp/pipeline/StanfordCoreNLP : Unsupported major.minor version 52.0, the JVM that R is using is older than Java 8 (class file version 52.0 is the Java 8 format).

You need to:

  • install Java 8 (as superuser),
  • change the default JVM the operating system uses to this JVM (* see below),
  • run R CMD javareconf on the command line,
  • set the environment variable LD_LIBRARY_PATH to the directory where libjvm.so is stored,
  • restart R / RStudio, and
  • make sure that a swap file (or swap partition) exists on your machine: run free and check that the output has a line starting with "Swap:" whose total is not zero.
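The swap check in the last step can be scripted. The following is a minimal sketch, assuming the English column labels printed by the procps version of free; has_swap is a hypothetical helper name, not part of any package.

```shell
# Succeeds (exit 0) when the `free` output read from stdin contains a
# "Swap:" line whose total column (second field) is greater than zero.
# Assumption: English labels as printed by procps `free`.
has_swap() {
  awk '/^Swap:/ && $2 + 0 > 0 { ok = 1 } END { exit (ok ? 0 : 1) }'
}

# Usage on a Linux machine:
#   free | has_swap && echo "swap is configured" || echo "no swap found"
```

Reading from stdin rather than calling free inside the function keeps the check easy to try against saved output from another machine.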

I use Ubuntu; my Java 8 libjvm.so is here: /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
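The location of libjvm.so differs between distributions and Java builds, so it may be easier to search for it than to guess. A small sketch, assuming you know the root directory of the Java installation; libjvm_dir is a hypothetical helper name, and it simply reports the first match if an installation ships several copies.

```shell
# Print the directory containing the first libjvm.so found under the
# given root. -H makes find follow the root itself if it is a symlink.
libjvm_dir() {
  found=$(find -H "$1" -name libjvm.so 2>/dev/null | head -n 1)
  [ -n "$found" ] && dirname "$found"
}

# Usage (the root is the installation directory from this answer):
#   libjvm_dir /usr/lib/jvm/java-8-openjdk-amd64
```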

You can set LD_LIBRARY_PATH in your .Rprofile file. Add this line, perhaps at the bottom of the file:

Sys.setenv(LD_LIBRARY_PATH=paste0(Sys.getenv("LD_LIBRARY_PATH"), ":", "/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/" ))
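Pasting onto the existing value unconditionally can add the same directory twice, or leave a leading colon when the variable starts out empty. If you would rather export the variable in the shell before starting R, a sketch like the following avoids both; append_path_once is a hypothetical helper name, and whether you set the variable in the shell or in .Rprofile is your choice.

```shell
# Append a directory to a colon-separated list unless it is already
# present. One trailing slash is dropped so that ".../server/" and
# ".../server" count as the same entry.
append_path_once() {
  list="$1"
  dir="${2%/}"
  case ":$list:" in
    *":$dir:"*|*":$dir/:"*) printf '%s\n' "$list" ;;
    *)
      if [ -z "$list" ]; then
        printf '%s\n' "$dir"
      else
        printf '%s\n' "$list:$dir"
      fi
      ;;
  esac
}

# Usage (the directory is the one from this answer; adjust it for your machine):
#   export LD_LIBRARY_PATH="$(append_path_once "${LD_LIBRARY_PATH:-}" /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server)"
```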

When I do this in R:

R> Sys.getenv("LD_LIBRARY_PATH")
[1] "/usr/local/lib64/R/lib:/usr/local/lib64:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/"
R> library(coreNLP)
R> initCoreNLP()

I get this result:

Searching for resource: config.properties
Adding annotator tokenize
TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
Adding annotator ssplit
Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [1.1 sec].
Adding annotator lemma
Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [5.6 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [2.1 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [3.8 sec].
Initializing JollyDayHoliday for SUTime from classpath: edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
Adding annotator parse
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.6 sec].
Adding annotator dcoref
Adding annotator sentiment

R> example(getSentiment)

gtSntmR> getSentiment(annoEtranger) # first Sentence of L'Etranger by A.Camus
  id sentimentValue sentiment
1  1              1  Negative
2  2              2   Neutral

gtSntmR> getSentiment(annoHp) # first Sentence of Harry Potter V1
  id sentimentValue    sentiment
1  1              4 Verypositive

(*) How to see the default JVM on Linux:

update-alternatives --display java

Result:

java - auto mode
  link currently points to /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

To show all available alternatives, use

update-alternatives --list java

Result (on my machine):

/usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java
/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

To change the default:

sudo update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
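Instead of copying the path by hand, the Java 8 entry can be picked out of the list output. This is only a sketch: pick_java8 is a hypothetical helper, and it assumes the "java-8-" directory naming seen in the listing above.

```shell
# Print the first alternative whose path contains a "java-8-" directory.
# Assumption: the Debian/Ubuntu naming seen in the listing in this answer.
pick_java8() {
  grep '/java-8-' | head -n 1
}

# Usage:
#   sudo update-alternatives --set java "$(update-alternatives --list java | pick_java8)"
```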

Experiment with update-alternatives until the default points to Java 8.
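To confirm that the default is new enough, compare the version that java reports with the class file version from the error message. Class file major version 52 corresponds to Java 8, and for major versions 49 and later the release number is the major version minus 44. The helper names below are hypothetical, and the java found on the PATH is not necessarily the JVM that R loads, so treat this as a rough check only.

```shell
# Java release that introduced a class-file major version (49 and later).
release_for_class_major() {
  echo $(( $1 - 44 ))
}

# Feature release from a version string such as "1.8.0_191" or "11.0.2".
release_from_version() {
  case "$1" in
    1.*) v="${1#1.}"; echo "${v%%.*}" ;;   # old scheme: 1.8.0_191 -> 8
    *)   echo "${1%%[!0-9]*}" ;;           # new scheme: 11.0.2 -> 11
  esac
}

# Succeeds when a JVM with the given version string can load classes
# compiled to the given class-file major version.
jvm_can_load() {
  [ "$(release_from_version "$1")" -ge "$(release_for_class_major "$2")" ]
}

# Usage:
#   ver=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)
#   jvm_can_load "$ver" 52 && echo "Java is new enough" || echo "Java is too old"
```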



Answer 2:

> install.packages('devtools')
> devtools::install_github("statsmaths/coreNLP")
> download.file("http://nlp.stanford.edu/software/stanford-corenlp-full-2015-01-29.zip", '/path/to/save/stanford-corenlp-full-2015-01-29.zip')
> unzip('/path/to/save/stanford-corenlp-full-2015-01-29.zip')

The instructions above, from https://github.com/statsmaths/coreNLP, work; possibly something went wrong while the library was being installed in R.

Re-run these commands to reinstall the coreNLP wrapper:

> install.packages('devtools')
> devtools::install_github("statsmaths/coreNLP")

You should see this if the package is not corrupted:

> devtools::install_github("statsmaths/coreNLP")
Downloading GitHub repo statsmaths/coreNLP@master
Installing coreNLP
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore CMD  \
  INSTALL '/tmp/RtmpFS9LWl/devtools667a3cdbc084/statsmaths-coreNLP-3a667c6'  \
  --library='/home/expert/R/x86_64-pc-linux-gnu-library/3.2' --install-tests 

* installing *source* package ‘coreNLP’ ...
** R
** data
*** moving datasets to lazyload DB
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (coreNLP)
Reloading installed coreNLP

Otherwise, devtools should reinstall the package.