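I have many samples (y_i, (a_i, b_i, c_i)), where y is presumed to vary as a polynomial in a, b, c up to a certain degree. For example, for a given set of data and degree 2, I might produce the model
y = a^2 + 2ab - 3cb + c^2 + .5ac
This can be done using least squares and is a slight extension of numpy's polyfit routine. Is there a standard implementation somewhere in the Python ecosystem?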
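To make the "slight extension of polyfit" idea concrete, here is a minimal sketch: build one design-matrix column per monomial up to the requested degree, then solve the linear least-squares problem with numpy.linalg.lstsq. The helper name polyfit_nd and the synthetic data are made up for illustration; this is not an existing NumPy routine.

```python
import itertools
import numpy as np

def polyfit_nd(X, y, degree):
    """Least-squares polynomial fit in several variables (hypothetical helper).

    X has shape (n_samples, n_vars). Returns (terms, coeffs); each term is a
    tuple of variable indices, e.g. (0, 1) means x0*x1 and () is the constant.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    terms = [t for d in range(degree + 1)
             for t in itertools.combinations_with_replacement(range(n_vars), d)]
    columns = []
    for t in terms:
        col = np.ones(n_samples)      # the empty product is the constant term
        for j in t:
            col = col * X[:, j]       # multiply in each variable of the monomial
        columns.append(col)
    A = np.column_stack(columns)      # design matrix, one column per monomial
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return terms, coeffs

# Synthetic, noise-free data generated from the model in the question.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 3))
a, b, c = X.T
y = a**2 + 2*a*b - 3*c*b + c**2 + 0.5*a*c

terms, coeffs = polyfit_nd(X, y, degree=2)
for t, coef in zip(terms, coeffs):
    print(t, round(float(coef), 6))
```

Here variable indices 0, 1, 2 stand for a, b, c, so the term (1, 2) carries the coefficient of bc.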
Building on an example posted here:
import numpy as np
from sklearn import linear_model
from sklearn.preprocessing import PolynomialFeatures

# X is the independent variable (bivariate in this case)
X = np.array([[0.44, 0.68], [0.99, 0.23]])
# vector is the dependent data
vector = [109.85, 155.72]
# predict is an independent variable for which we'd like to predict the value
# (2-D, one row per sample, as scikit-learn expects)
predict = np.array([[0.49, 0.18]])
# generate a model of polynomial features
poly = PolynomialFeatures(degree=2)
# transform the x data for proper fitting
# (for a single variable x it returns [1, x, x**2])
X_ = poly.fit_transform(X)
# transform the prediction point with the already fitted transformer
predict_ = poly.transform(predict)
# here we can remove polynomial orders we don't want
# for instance I'm removing the `x` component
# (column 1; column 0 is the constant term)
X_ = np.delete(X_, 1, axis=1)
predict_ = np.delete(predict_, 1, axis=1)
# generate the regression object
clf = linear_model.LinearRegression()
# perform the actual regression
clf.fit(X_, vector)
print("X_ = ", X_)
print("predict_ = ", predict_)
print("Prediction = ", clf.predict(predict_))
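Output as originally posted (the columns shown are x0, x1, x0^2, x0*x1, x1^2, which corresponds to dropping the constant column at index 0 rather than index 1):
X_ = [[ 0.44 0.68 0.1936 0.2992 0.4624]
 [ 0.99 0.23 0.9801 0.2277 0.0529]]
predict_ = [[ 0.49 0.18 0.2401 0.0882 0.0324]]
Prediction = [ 126.84247142]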
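To check which column index belongs to which term before deleting anything, one option is to look at the fitted transformer's powers_ attribute, which holds one row of exponents per output column. A small sketch, assuming scikit-learn is installed; the names keep and X_reduced and the choice of dropped term are only illustrative:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[0.44, 0.68], [0.99, 0.23]])
poly = PolynomialFeatures(degree=2)
X_ = poly.fit_transform(X)

# One row per output column, one exponent per input variable:
# [0, 0] is the constant, [1, 0] is x0, [0, 1] is x1, [2, 0] is x0**2, ...
print(poly.powers_)

# Keep every column except the first-degree term in x0 (exponents [1, 0]).
keep = ~np.all(poly.powers_ == [1, 0], axis=1)
X_reduced = X_[:, keep]
print(X_reduced.shape)
```

Selecting columns by exponent pattern rather than by a hard-coded index keeps the code readable if the degree or the number of variables changes.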
polyfit does work, but there are better least-squares minimizers out there. I would suggest kmpfit, available in the Kapteyn package.
from kapteyn import kmpfit

def model(p, v, x, w):
    a, b, c, d, e, f, g, h, i, j = p  # coefficients of the polynomial
    return (a*v**2 + b*x**2 + c*w**2 + d*v*x + e*v*w + f*x*w
            + g*v + h*x + i*w + j)

def residuals(p, data):  # function needed by the fit routine
    v, x, w, z = data  # the values for v, x, w and the measured hypersurface z
    # this should be (z - model(p, v, x, w)) / err if
    # there are error bars on the measured z values
    return z - model(p, v, x, w)  # returns an array of residuals

# initial guess at parameters. Avoid using 0.0 as initial guess
par0 = [1.0] * 10
# create a fitting object. data should be in the form that the functions
# above are looking for, i.e. a tuple of four equal-length arrays (v, x, w, z)
fitobj = kmpfit.Fitter(residuals=residuals, data=data)
# call the fitter
fitobj.fit(params0=par0)
The success of these things is closely dependent on the starting values for the fit, so choose them carefully if possible. With so many free parameters it could be a challenge to get a solution.
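The same residual-function pattern can be tried without kmpfit. The sketch below swaps in scipy.optimize.least_squares (a different minimizer, not kmpfit) and uses made-up synthetic data with ten coefficients: three squares, three cross terms, three linear terms and a constant. Since this particular model is linear in its coefficients, the starting point should matter less here than it would for a model that is nonlinear in its parameters.

```python
import numpy as np
from scipy.optimize import least_squares

def model(p, v, x, w):
    a, b, c, d, e, f, g, h, i, j = p
    return (a*v**2 + b*x**2 + c*w**2 + d*v*x + e*v*w + f*x*w
            + g*v + h*x + i*w + j)

def residuals(p, v, x, w, z):
    # Divide by the measurement errors here if z has error bars.
    return z - model(p, v, x, w)

# Synthetic, noise-free data; p_true is an arbitrary illustrative choice.
rng = np.random.default_rng(0)
v, x, w = rng.uniform(-1.0, 1.0, size=(3, 200))
p_true = np.array([1.0, 0.0, 1.0, 2.0, 0.5, -3.0, 0.0, 0.0, 0.0, 0.0])
z = model(p_true, v, x, w)

par0 = np.ones(10)  # initial guess for all ten coefficients
result = least_squares(residuals, par0, args=(v, x, w, z))
print(np.round(result.x, 3))
```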
sklearn has a nice example using their Pipeline here. Here's the core of their example:
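import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
# degrees, i, X and y come from the surrounding sklearn example
polynomial_features = PolynomialFeatures(degree=degrees[i], include_bias=False)
linear_regression = LinearRegression()
pipeline = Pipeline([("polynomial_features", polynomial_features),
                     ("linear_regression", linear_regression)])
pipeline.fit(X[:, np.newaxis], y)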
You don't need to transform your data yourself; just pass it into the Pipeline.
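For the three-variable case in the question, a pipeline along the same lines might look like the sketch below. The data are synthetic (generated from the model in the question, without noise) and X_new is an arbitrary query point; both are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 3))  # columns play the role of a, b, c
a, b, c = X.T
y = a**2 + 2*a*b - 3*c*b + c**2 + 0.5*a*c

pipeline = Pipeline([
    ("polynomial_features", PolynomialFeatures(degree=2, include_bias=False)),
    ("linear_regression", LinearRegression()),
])
pipeline.fit(X, y)  # X is already 2-D, so no np.newaxis is needed

X_new = np.array([[0.1, 0.2, 0.3]])
prediction = pipeline.predict(X_new)
coefficients = pipeline.named_steps["linear_regression"].coef_
print(prediction)
print(coefficients)  # order: a, b, c, a^2, ab, ac, b^2, bc, c^2
```

The fitted coefficients stay accessible through named_steps, so the pipeline does not hide the model that was asked for.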