I'm unable to make the following code work, though I don't see this error when working strictly in R.
from rpy2.robjects.packages import importr
from rpy2 import robjects
import numpy as np
forecast = importr('forecast')
ts = robjects.r['ts']
y = np.random.randn(50)
X = np.random.randn(50)
y = ts(robjects.FloatVector(y), start=robjects.IntVector((2004, 1)), frequency=12)
X = ts(robjects.FloatVector(X), start=robjects.IntVector((2004, 1)), frequency=12)
forecast.Arima(y, xreg=X, order=robjects.IntVector((1, 0, 0)))
It's especially confusing considering that the following code works fine:
forecast.auto_arima(y, xreg=X)
I see the following traceback no matter what I give for X, using numpy interface or not. Any ideas?
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-20-b781220efb93> in <module>()
13 X = ts(robjects.FloatVector(X), start=robjects.IntVector((2004, 1)), frequency=12)
14
---> 15 forecast.Arima(y, xreg=X, order=robjects.IntVector((1, 0, 0)))
/home/skipper/.local/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
84 v = kwargs.pop(k)
85 kwargs[r_k] = v
---> 86 return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
/home/skipper/.local/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
33 for k, v in kwargs.iteritems():
34 new_kwargs[k] = conversion.py2ri(v)
---> 35 res = super(Function, self).__call__(*new_args, **new_kwargs)
36 res = conversion.ri2py(res)
37 return res
RRuntimeError: Error in `colnames<-`(`*tmp*`, value = if (ncol(xreg) == 1) nmxreg else paste(nmxreg, :
length of 'dimnames' [2] not equal to array extent
Edit:
The problem is that the following lines of code do not evaluate to a column name, which seems to be the expectation on the R side.
sub = robjects.r['substitute']
deparse = robjects.r['deparse']
deparse(sub(X))
I don't know R well enough to say what this code is expected to do, but I can't find an rpy2 object that passes this check by returning something of length 1. This really looks like a bug to me.
R> length(deparse(substitute((rep(.2, 1000)))))
[1] 1
But in rpy2:
[~/]
[94]: robjects.r.length(robjects.r.deparse(robjects.r.substitute(robjects.r('rep(.2, 1000)'))))
[94]:
<IntVector - Python:0x7ce1560 / R:0x80adc28>
[ 78]
There is a way to simply pass your variables to R without substitutions and return the results back to Python. You can find a simple example here: https://stackoverflow.com/a/55900840/5350311 . I think it makes clearer what you are passing to R and what you get back in return, especially if you are working with for loops and a large number of variables.
This is one manifestation (see this other related issue for example) of the same underlying issue: R expressions are evaluated lazily and can be manipulated within R, and this leads to idioms that do not translate well. In Python, expressions are evaluated immediately, and one has to move to the AST to manipulate code.
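To make the Python half of that contrast concrete, here is a minimal sketch using only the standard library. The function name describe and the sample expression are made up for illustration, and ast.unparse assumes Python 3.9 or later.

```python
import ast


def describe(value):
    # By the time this function runs, the argument expression has already
    # been evaluated; only the resulting value is visible here.
    return repr(value)


# The callee receives the three-element list, not the text "[0.2] * 3".
seen_by_callee = describe([0.2] * 3)

# To work with the expression itself, the source has to be parsed into an AST.
tree = ast.parse("[0.2] * 3", mode="eval")
expression_text = ast.unparse(tree)  # requires Python 3.9+

print(seen_by_callee)
print(expression_text)
```

This is roughly why R's substitute() has no direct counterpart when called from Python: the Python side has already reduced the argument to a value before R ever sees it.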
An answer to the second part of your question. In R,
substitute(rep(.2, 1000))
passes the unevaluated expression rep(.2, 1000) to substitute(). Doing the same in rpy2 with the expression given as a Python string is passing a string; the R equivalent would be calling substitute() on a quoted character string.
The following gets you close to R's deparse(substitute()):
Currently, one way to work around this is to bind R objects to R symbols (preferably in a dedicated environment rather than in GlobalEnv), and use the symbols in an R call written as a string:
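A minimal sketch of that idea follows. It is not the answer's original code: the helper names build_r_call and eval_with_symbols are hypothetical, the symbol name ord is made up, and the rpy2 part assumes that robjects.Environment, robjects.r["parse"] and robjects.r["eval"] behave as in recent rpy2 releases. It has not been checked against a specific rpy2 or forecast version.

```python
def build_r_call(func, first, **named):
    """Build R call text from symbol *names* (strings), not from values."""
    parts = [first] + ["{} = {}".format(k, v) for k, v in named.items()]
    return "{}({})".format(func, ", ".join(parts))


def eval_with_symbols(call_text, bindings):
    """Bind R objects to symbols in a dedicated environment, then evaluate."""
    # Imported lazily so build_r_call can be used without rpy2 installed.
    from rpy2 import robjects

    env = robjects.Environment()  # dedicated environment, not GlobalEnv
    for name, value in bindings.items():
        env[name] = value
    r_parse = robjects.r["parse"]
    r_eval = robjects.r["eval"]
    # R only sees the symbols, so deparse(substitute(xreg)) yields a short name.
    return r_eval(r_parse(text=call_text), envir=env)


call_text = build_r_call("forecast::Arima", "y", xreg="X", order="ord")
print(call_text)

# Hypothetical usage with the y and X objects from the question:
# fit = eval_with_symbols(
#     call_text,
#     {"y": y, "X": X, "ord": robjects.IntVector((1, 0, 0))},
# )
```

Because the call text refers to X by name, the R function deparses a symbol rather than the full contents of the vector.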
This is not something I am happy about as a solution, but I have never found the time to work on a better solution.
edit: With rpy2-2.4.0 it becomes possible to use R symbols and do the following:
This is not yet the most intuitive interface. Maybe something using a context manager would be better.