I am trying to pass a numpy array to the GAMLSS package in R.
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
numpy2ri.activate()
r = robjects.r
r.library("gamlss")
r.library("gamlss.mx")
L = r['data.frame'](np.array(np.random.normal(size=1000),
                             dtype=[('x', float), ('y', float), ('z', float)]))
r.gamlssMX(robjects.Formula('z~1'), data=L)
Running this returns
Error in y0 - f0 : non-conformable arrays
Yet I can pass the data frame to the linear model R function.
lm = r.lm(robjects.Formula('x~y'), data=L)
print(r.summary(lm.rx()))
I already have a lot of Python code that reads a binary file, but I would like to use the R package on the data, hence the need for rpy2.
-- EDIT --
As an example in R:
x <- data.frame(z=c(rnorm(1000), rnorm(1000, mean=4)))
gamlssMX(z~1, K=1, data=x)
This looks like a bug. If I use the now-deprecated pandas.rpy.common.convert_to_r_dataframe, it works fine, but the currently preferred method, pandas2ri.pandas2ri, raises an error:
import numpy as np
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
import pandas.rpy.common as com  # deprecated pandas-side converter
robjects.reval("library('gamlss')")
robjects.reval("library('gamlss.mx')")
R = pd.DataFrame({'x': np.random.random(2000)})
# Convert the same pandas DataFrame with both converters
A1 = pandas2ri.pandas2ri(R)
A2 = com.convert_to_r_dataframe(R)
# Expose both results in the R global environment
robjects.r.assign('B1', A1)
robjects.r.assign('B2', A2)
robjects.reval("m <- gamlssMX(x~1, K=1, data=B1)")  # won't work
robjects.reval("m <- gamlssMX(x~1, K=1, data=B2)")  # works fine
The two cases differ in only one line: com.convert_to_r_dataframe versus pandas2ri.pandas2ri. It looks like the current version has a bug.
The newer pandas2ri.pandas2ri method produces a column of class rpy2.robjects.vectors.Array, while the older com.convert_to_r_dataframe produces a column of class rpy2.robjects.vectors.FloatVector.
In [3]:
robjects.r.B1
Out[3]:
<DataFrame - Python:0x10e868a28 / R:0x10f425238>
[Array]
x: <class 'rpy2.robjects.vectors.Array'>
<Array - Python:0x10e868b48 / R:0x10f425400>
[0.051728, 0.149642, 0.884797, ..., 0.485063, 0.733193, 0.134963]
In [4]:
robjects.r.B2
Out[4]:
<DataFrame - Python:0x10e868cf8 / R:0x110e1b918>
[FloatVector]
x: <class 'rpy2.robjects.vectors.FloatVector'>
<FloatVector - Python:0x10e868e18 / R:0x10f442400>
[0.051728, 0.149642, 0.884797, ..., 0.485063, 0.733193, 0.134963]
It looks like gamlss raises an error when the data column is an Array instead of a FloatVector.
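If the Array column really is the cause, one possible workaround is to skip the automatic converter and build the R data frame from explicit FloatVector columns. The following is only a sketch under that assumption and has not been checked against gamlss.mx. The helper names columns_as_float_lists and to_r_dataframe are made up for illustration; robjects.FloatVector and robjects.DataFrame are the regular rpy2 constructors.

```python
import numpy as np


def columns_as_float_lists(columns):
    """Flatten every column to a plain 1-D list of Python floats.

    `columns` is anything with .items() yielding (name, values) pairs,
    e.g. a dict of arrays or a pandas DataFrame.
    """
    return {
        name: np.asarray(values, dtype=float).ravel().tolist()
        for name, values in columns.items()
    }


def to_r_dataframe(columns):
    """Build an R data.frame whose columns are FloatVector, not Array.

    rpy2 is imported lazily so the helper above can be used (and tested)
    on a machine without R installed.
    """
    import rpy2.robjects as robjects

    flat = columns_as_float_lists(columns)
    return robjects.DataFrame(
        {name: robjects.FloatVector(vals) for name, vals in flat.items()}
    )
```

With the data from the example above, this would be used as robjects.r.assign('B3', to_r_dataframe(R)) followed by gamlssMX(x~1, K=1, data=B3) on the R side, where B3 is just an illustrative variable name. Printing robjects.r.B3 should then report FloatVector for column x, the same as B2.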