OLS Regression with groupby

I want to run an OLS regression using pandas and a groupby.

I am trying the following code:

import pandas as pd
from pandas.stats.api import ols

df=pd.read_csv(r'F:\File.csv')
result=df.groupby(['FID']).apply(lambda x: ols(y=df[x['MEAN']], x=df[x['Accum_Prcp'],x['Accum_HDD']]))
print result

but this returns:

File "C:\Users\spotter\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1150, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])

    KeyError: '[ 0.84978328  0.72115778  0.53965104  0.52955655  0.73372541  0.64617074\n  0.60040938  0.7147218   0.65533535  0.57980322  0.57382068  0.56543435\n  0.70740831  0.9245337   0.54859569  0.6789395   0.7086157   0.3835853\n  0.54924104  0.80813778  0.83758118  0.22673391  0.26594087  0.63650468\n  0.89889911  0.38324657  0.30235986  0.62922678  0.55219822  0.55950705\n  0.71137557  0.53631811  0.70158798  0.87116361  0.93751381  0.91125518\n  0.80020908  0.75301262  0.82391046  0.77483673  0.63069573  0.44954455\n  0.83578862  0.56338649  0.64236039  0.93270243  0.93077291  0.83847668\n  0.8268959   0.85400317  0.74319769  0.94803537  0.97484929  0.45366017\n  0.80823694  0.82028051  0.63960395  0.63015722  0.73132888  0.55570184\n  0.83265402  0.75009687  0.58207032  0.92064804  0.91058008  0.86726397\n  0.89204098  0.95573514  0.75704367  0.80786363  0.87448548  0.7553715\n  0.88965962  0.82828493  0.82423891  0.81034742  0.90104876  0.78875473\n  0.97369268] not in index'

Is there something incorrect in my syntax?

Without a groupby, the call would be something like this:

result = ols(y=df['MEAN'], x=df[['Accum_HDD','Accum_Prcp']])

and that works correctly.

My dataframe looks something like this:

FID  Image_Date   MEAN  Accum_Prcp   Accum_HDD
1     19920506     2.0   500.0        1000.0
1     19930506     1.7   450.0        1050.0
2     19920506     2.7   456.0        992.0
2     19930506     1.9   376.0        800.0

Tags: python pandas statsmodels

1 answer

贪生不怕死

#2 · 2019-05-23 02:51

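Inside the lambda, x is already the sub-DataFrame for one group, so df[x['MEAN']] passes the MEAN values to df as column labels, and that is what raises the KeyError. Index the group itself instead, for example by looping over the groups:

grps = df.groupby(['FID'])
results = {}
for fid, grp in grps:
    # fit one regression per FID, using only that group's rows
    results[fid] = ols(y=grp.loc[:, 'MEAN'], x=grp.loc[:, ['Accum_Prcp', 'Accum_HDD']])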
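
The loop above relies on pandas.stats.api.ols, which was removed from later pandas releases. Below is a minimal sketch of the same per-group fit without it, using numpy.linalg.lstsq. The data is synthetic and only reuses the column names from the question; fit_group is a hypothetical helper name, not part of pandas or statsmodels. If standard errors or a full regression summary are needed, the OLS class in statsmodels is probably the better fit.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; column names follow the question.
rng = np.random.default_rng(0)
n_per_group = 20
frames = []
for fid in (1, 2):
    prcp = rng.uniform(300, 600, n_per_group)
    hdd = rng.uniform(800, 1100, n_per_group)
    mean = 0.5 + 0.002 * prcp + 0.001 * hdd  # exact linear relation, no noise
    frames.append(pd.DataFrame(
        {"FID": fid, "MEAN": mean, "Accum_Prcp": prcp, "Accum_HDD": hdd}
    ))
df = pd.concat(frames, ignore_index=True)


def fit_group(grp):
    """Least-squares fit of MEAN on Accum_Prcp and Accum_HDD for one group."""
    # Design matrix with an explicit intercept column.
    X = np.column_stack([np.ones(len(grp)), grp["Accum_Prcp"], grp["Accum_HDD"]])
    coef, *_ = np.linalg.lstsq(X, grp["MEAN"].to_numpy(), rcond=None)
    return pd.Series(coef, index=["intercept", "Accum_Prcp", "Accum_HDD"])


# One row of coefficients per FID; each fit sees only its own group's rows.
coefs = pd.DataFrame({fid: fit_group(grp) for fid, grp in df.groupby("FID")}).T
coefs.index.name = "FID"
print(coefs)
```

Each group needs at least as many rows as fitted coefficients (three here) for the fit to be determined, so the two-row groups in the sample table would not be enough on their own.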
