RStudio doesn't load all Python modules via rP

Posted 2019-06-16 21:38

Question:

I get unexpected behaviour when running the same script from Bash and from within RStudio.

Please consider the following. I have a folder "~/rpython" containing two scripts:

# test1.R

library(rPython)

setwd("~/rpython")

python.load("test1.py")

number <- python.get("number")
string <- python.get("string")

print(sqrt(number))
print(string)

and

# test1.py

import random, nltk

number = random.randint(1, 1000)

string = nltk.word_tokenize('home sweet home')

I can call my R script from Bash with Rscript test1.R, which returns as expected

>> Loading required package: RJSONIO
>> [1] 13.0384
>> [1] "home"  "sweet" "home"

and if I call it again, it produces a different random number:

>> Loading required package: RJSONIO
>> [1] 7.211103
>> [1] "home"  "sweet" "home" 

But when I run the very same script (test1.R) from RStudio, things get weird. Here is the output:

# test1.R
> 
> library(rPython)
Loading required package: RJSONIO
> 
> setwd("~/rpython")
> 
> python.load("test1.py")
Error in python.exec(code, get.exception) : No module named nltk
> 
> number <- python.get("number")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'number' is not defined
Error in python.get("number") : Variable not found
> string <- python.get("string")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'string' is not defined
Error in python.get("string") : Variable not found
> 
> print(sqrt(number))
Error in print(sqrt(number)) : object 'number' not found
> print(string)
Error in print(string) : object 'string' not found

For some reason, when I run the script from RStudio, the Python interpreter can't locate the module nltk (the same seems to happen with other pip-installed modules), but it has no problem importing random.
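One way to narrow this down is to ask each Python which interpreter it is and where it searches for modules. The following is a minimal sketch, not part of the original scripts: the file name diagnose.py and the helpers can_import and describe_interpreter are hypothetical, and it sticks to syntax that should work on both Python 2 and Python 3.

```python
# diagnose.py (hypothetical file name, not part of the original question)
import sys


def can_import(name):
    """Return True if this interpreter can import the module called `name`."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


def describe_interpreter(module_names):
    """Collect the interpreter path, its version, its search path and import results."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "sys_path": list(sys.path),
        "modules": dict((name, can_import(name)) for name in module_names),
    }


if __name__ == "__main__":
    info = describe_interpreter(["random", "nltk"])
    print("executable: %s" % info["executable"])
    print("version:    %s" % info["version"])
    for path_entry in info["sys_path"]:
        print("sys.path:   %s" % path_entry)
    for name in sorted(info["modules"]):
        print("import %s: %s" % (name, info["modules"][name]))
```

Running this once from Bash and once through python.load() in RStudio, then comparing the two outputs, should show whether the two environments use different interpreters or different module search paths.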

Answer 1:

I had this problem, too. The issue was that my Bash terminal was calling a different Python than the one RStudio was using. I also learned that if you're only trying to call python.load() from rPython, you're probably better off with system() from base R.

  1. Figure out which python your bash terminal is calling. Go to your bash terminal and run which python. For me (OS X 10.11.5) it was /usr/local/bin/python. Now that we know the full path, we can call it explicitly and prevent R from choosing another version that might be installed in some corner of your machine.
  2. Use system() to call Bash commands from R instead of python.load(), and use the full path to your script. Using your example script name and my example Python path, it would be system('/usr/local/bin/python /path/to/file/test1.py')
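One caveat with the system() route is that python.get() is no longer available, so the script has to hand its values back some other way. A minimal sketch of one option, assuming the Python script prints its results as JSON on standard output. The function build_payload is hypothetical, and str.split() is only a stand-in for nltk.word_tokenize so that the sketch runs without nltk installed.

```python
import json
import random
import sys


def build_payload():
    """Compute the same two values as test1.py and pack them into a dict."""
    number = random.randint(1, 1000)
    # Stand-in for nltk.word_tokenize('home sweet home'); swap the real call
    # back in once the interpreter being called can import nltk.
    tokens = "home sweet home".split()
    return {"number": number, "string": tokens}


if __name__ == "__main__":
    # Emit a single JSON document on stdout for the calling process to capture.
    json.dump(build_payload(), sys.stdout)
    sys.stdout.write("\n")
```

On the R side, system() with intern = TRUE returns the command's output as a character vector, which could then be parsed with fromJSON() from the RJSONIO package that rPython already loads.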

Hope that helps!