-->

Running R's aov() mixed effects model from Pyt

2019-06-05 08:13发布

问题:

First to see if rpy2 was working properly I ran a simple model (stats.lm):

import pandas as pd
from rpy2 import robjects as ro
from rpy2.robjects import pandas2ri
pandas2ri.activate()
from rpy2.robjects.packages import importr
stats = importr('stats')
R = ro.r
df = pd.DataFrame(data={'subject':['1','2','3','4','5','1','2','3','4','5'],'group':['1','1','1','2','2','1','1','1','2','2'],'session':['1','1','1','1','1','2','2','2','2','2'],'covar':['1', '2', '0', '2', '1', '1', '2', '0', '2', '1'],'result':[-6.77,6.11,5.67,-7.679,-0.0930,0.948,2.99,6.93,6.30,9.98]})

rdf=pandas2ri.py2ri(df)
result=stats.lm('result ~ group * session + covar',data=rdf)
print(R.summary(result).rx2('coefficients'))

It was working fine:

                 Estimate Std. Error    t value  Pr(>|t|)
(Intercept)      5.323667   4.458438  1.1940654 0.2984217
group2          -3.729167   5.227982 -0.7133090 0.5150618
session2         1.952667   4.458438  0.4379710 0.6840198
covar1          -5.937500   5.107783 -1.1624418 0.3096835
covar2          -5.023500   5.107783 -0.9834992 0.3810438
group2:session2 10.073333   7.049410  1.4289612 0.2262206

I also checked if my mixed effects model was working properly in R:

df <- read.table(header=T, con <- textConnection('
covar group  result session subject
1     1  -6.770       1       1
2     1   6.110       1       2
0     1   5.670       1       3
2     2  -7.679       1       4
1     2  -0.093       1       5
1     1   0.948       2       1
2     1   2.990       2       2
0     1   6.930       2       3
2     2   6.300       2       4
1     2   9.980       2       5'))
close(con)

mixed <- aov(result ~ group*session + covar + Error(as.factor(subject)/session),data=df)
summary(mixed)

again this seemed to work too:

Error: as.factor(subject)
          Df Sum Sq Mean Sq F value Pr(>F)
group      1   0.65    0.65   0.012  0.924
covar      1  16.68   16.68   0.301  0.638
Residuals  2 110.76   55.38               

Error: as.factor(subject):session
              Df Sum Sq Mean Sq F value Pr(>F)  
session        1  89.46   89.46   8.002 0.0663 .
group:session  1  60.88   60.88   5.446 0.1018  
Residuals      3  33.54   11.18                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Q: why doesn't the mixed effects model work here?

result2=stats.aov('result ~ group*session + covar + Error(as.factor(subject)/session)',data=rdf)
print(R.summary(result2).rx2('coefficients'))

This is the error message:

//anaconda/lib/python2.7/site-packages/rpy2/rinterface/__init__.py:185: RRuntimeWarning: Error: $ operator is invalid for atomic vectors

  warnings.warn(x, RRuntimeWarning)
---------------------------------------------------------------------------
RRuntimeError                             Traceback (most recent call last)
<ipython-input-2-aab76c72fbf3> in <module>()
----> 1 result2=stats.aov('result ~ group*session + covar + Error(as.factor(subject)/session)',data=rdf)
      2 print(R.summary(result2).rx2('coefficients'))

//anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
    176                 v = kwargs.pop(k)
    177                 kwargs[r_k] = v
--> 178         return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
    179
    180 pattern_link = re.compile(r'\\link\{(.+?)\}')

//anaconda/lib/python2.7/site-packages/rpy2/robjects/functions.pyc in __call__(self, *args, **kwargs)
    104         for k, v in kwargs.items():
    105             new_kwargs[k] = conversion.py2ri(v)
--> 106         res = super(Function, self).__call__(*new_args, **new_kwargs)
    107         res = conversion.ri2ro(res)
    108         return res

RRuntimeError: Error: $ operator is invalid for atomic vectors

I used the following posts as guidance:

rpy2 - Minimal example of rpy2 regression using pandas data frame

mixed ANOVA in R - https://stats.stackexchange.com/questions/45264/why-does-a-mixed-design-using-rs-aov-need-the-between-subject-factors-specific

回答1:

[Voting up just because you have a nice small and self-contained example.]

The R equivalent of what you are doing with rpy2 is the following (and returns the same error)

> mixed <- aov("result ~ group*session + covar + Error(as.factor(subject)/session)",data=df)
Error: $ operator is invalid for atomic vectors

Formula objects are different than strings.

> class(y ~ x)
[1] "formula"
> class("y ~ x")
[1] "character"

rpy2 has a constructor to build R formulae from Python strings:

from rpy2.robjects import Formula
fml = Formula("y ~ x")

Pass this to aov() instead of the string.