Predicting values using an OLS model with statsmodels

Posted 2019-04-23 14:31

I built a model using OLS (multiple linear regression). I split my data into training and test sets (half each), and I would now like to predict values for the second half of the labels.

model = OLS(labels[:half], data[:half])
predictions = model.predict(data[half:])

The problem is that I get an error:

File "/usr/local/lib/python2.7/dist-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/regression/linear_model.py", line 281, in predict
    return np.dot(exog, params)
ValueError: matrices are not aligned

I have the following array shapes:

data.shape: (426, 215)
labels.shape: (426,)

If I transpose the input to model.predict, I do get a result, but with a shape of (426, 213), so I suppose it's wrong as well (I expect a single vector of 213 numbers as label predictions):

model.predict(data[half:].T)

Any idea how to get it to work?

1 answer
【Aperson】
#2 · 2019-04-23 15:08

For statsmodels >= 0.4, if I remember correctly, model.predict does not know about the fitted parameters and requires them to be passed in the call; see http://statsmodels.sourceforge.net/stable/generated/statsmodels.regression.linear_model.OLS.predict.html
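To illustrate the point about the model-level predict, here is a minimal sketch that passes the coefficients explicitly. The arrays are synthetic stand-ins with assumed shapes (40 rows, 3 columns), not the data from the question, and the names rng and params are introduced only for this example.

```python
import numpy as np
from statsmodels.regression.linear_model import OLS

# Synthetic stand-ins for the question's arrays (assumed shapes, not the real data).
rng = np.random.default_rng(0)
data = rng.normal(size=(40, 3))
labels = data @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)
half = len(data) // 2

model = OLS(labels[:half], data[:half])
params = model.fit().params  # estimated coefficients, one per column of data

# Model-level predict: the coefficients come first, then the new design matrix.
predictions = model.predict(params, data[half:])
print(predictions.shape)  # one prediction per held-out row
```

Calling model.predict(data[half:]) instead hands the test data over as if it were the coefficient vector, which is consistent with the "matrices are not aligned" error in the question.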

What should work in your case is to fit the model and then use the predict method of the results instance.

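from statsmodels.regression.linear_model import OLS

model = OLS(labels[:half], data[:half])
results = model.fit()  # estimate the coefficients on the first half
predictions = results.predict(data[half:])  # predict for the second half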

or shorter

results = OLS(labels[:half], data[:half]).fit()
predictions = results.predict(data[half:])

See also http://statsmodels.sourceforge.net/stable/generated/statsmodels.regression.linear_model.RegressionResults.predict.html (the docstring is currently missing).

Note: this has been changed in the development version (in a backwards-compatible way), so that predict can take advantage of "formula" information: http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.RegressionResults.predict.html
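For the formula-based route mentioned in the note, a rough sketch with a recent statsmodels and pandas might look like the following. The column names x1, x2 and y and the data itself are made up for illustration; the exact return type of predict can differ between statsmodels versions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data frame with two predictors and one response.
rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2.0 * df["x1"] - 1.0 * df["x2"] + rng.normal(scale=0.1, size=100)

train, test = df.iloc[:50], df.iloc[50:]

# The formula adds an intercept and records how to build the design matrix.
results = smf.ols("y ~ x1 + x2", data=train).fit()

# predict applies the same formula to the new rows, so raw columns are enough.
predictions = results.predict(test[["x1", "x2"]])
print(len(predictions))
```

Here predict rebuilds the design matrix from the stored formula, so the new data only needs to contain the predictor columns named in it.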
