The basic equation of the squared-exponential (RBF) kernel is as follows:

k(x, x') = sigma^2 * exp(-||x - x'||^2 / (2 * l^2))
Here l is the length scale and sigma^2 is the variance parameter. The length scale controls how quickly two points stop looking similar as the distance between x and x' grows (it effectively rescales that distance), so it governs how smooth the function is. The variance parameter scales the overall amplitude of the function values.
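For concreteness, a minimal NumPy sketch of this kernel (the function and parameter names are my own):

```python
import numpy as np

def squared_exponential(x1, x2, length_scale=1.0, sigma=1.0):
    """k(x, x') = sigma^2 * exp(-||x - x'||^2 / (2 * length_scale^2))"""
    sq_dist = np.sum((np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)) ** 2)
    return sigma ** 2 * np.exp(-sq_dist / (2.0 * length_scale ** 2))

# A larger length_scale makes distant points look more similar,
# while sigma scales the overall variance of the function values.
print(squared_exponential([0.0, 0.0], [1.0, 1.0], length_scale=2.0, sigma=1.5))
```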
I want to optimize/train these parameters (l and sigma) with my training data. The training data have the following form:
X: 2-D Cartesian coordinates as the input data
y: radio signal strength (RSS) of a Wi-Fi device measured at those 2-D coordinates as the observed output
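Concretely, the data would look something like this (the numbers are made up):

```python
import numpy as np

# Hypothetical measurement locations (x, y coordinates) and the RSS observed at each one.
X = np.array([[0.0, 0.0],
              [1.5, 2.0],
              [3.0, 0.5]])           # shape (n_samples, 2)
y = np.array([-45.0, -60.0, -52.0])  # shape (n_samples,), e.g. RSS in dBm
```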
According to sklearn, the GaussianProcessRegressor class is defined as:
class sklearn.gaussian_process.GaussianProcessRegressor(kernel=None, alpha=1e-10, optimizer='fmin_l_bfgs_b', n_restarts_optimizer=0, normalize_y=False, copy_X_train=True, random_state=None)
Here, the optimizer is a string or callable, with the L-BFGS-B algorithm as the default optimization algorithm ("fmin_l_bfgs_b"). The optimizer can either be one of the internally supported optimizers for optimizing the kernel's parameters, specified by a string, or an externally defined optimizer passed as a callable. Furthermore, the only internally supported optimizer in scikit-learn is fmin_l_bfgs_b
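According to the scikit-learn documentation, a callable optimizer is expected to have roughly the following signature (the body here is only a placeholder):

```python
def optimizer(obj_func, initial_theta, bounds):
    # obj_func:      the objective function to be minimized; scikit-learn passes it in,
    #                and it takes the hyperparameters theta (plus an optional
    #                eval_gradient flag) as arguments
    # initial_theta: the starting values of theta
    # bounds:        the bounds on the values of theta
    ...
    # Return the best theta found and the corresponding objective value.
    return theta_opt, func_min
```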
. However, I learned that the scipy package has many more optimizers. Since I wanted to use the trust-region-reflective algorithm to optimize the hyperparameters, I tried to implement it as follows:
import numpy as np
from scipy.optimize import least_squares
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

def fun_rosenbrock(Xvariable):
    return np.array([10*(Xvariable[1]-Xvariable[0]**2), (1-Xvariable[0])])

Xvariable = [1.0, 1.0]
kernel = C(1.0, (1e-5, 1e5)) * RBF(1, (1e-1, 1e3))
trust_region_method = least_squares(fun_rosenbrock, [10, 20, 30, 40, 50], bounds=[0, 100], method='trf')
gp = GaussianProcessRegressor(kernel=kernel, optimizer=trust_region_method, alpha=1.2, n_restarts_optimizer=10)
gp.fit(X, y)
Since I couldn't figure out what the parameter 'fun' actually is in my case, I resorted to using the Rosenbrock function from this example (the example is at the bottom of the page). I get the following error in the console.
Is my approach of using the scipy package to optimize the kernel parameters correct? How can I print the optimized values of the parameters? What is the parameter 'fun' in scipy.optimize.least_squares in my case?
Thank you!
There are three primary problems here:
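1. The optimizer argument must be a string or a callable, but the code above passes in the result of an already-executed least_squares call (an OptimizeResult object), not a function that scikit-learn can call.
2. A callable optimizer has to follow the signature scikit-learn expects: it receives obj_func, initial_theta and bounds, and must return the optimized theta together with the corresponding objective value.
3. The objective function ('fun' in scipy's terminology) is not something you define yourself here; scikit-learn constructs it internally from the (negative) log-marginal likelihood of the Gaussian process, so the Rosenbrock function does not belong in this problem.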
As a partially working example, ignoring the kernel definition to emphasize the optimizer:
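Here is a sketch of what that could look like, using scipy.optimize.minimize with the 'trust-constr' method as a stand-in (the trust-region-reflective solver in least_squares is aimed at least-squares problems, while the GP objective is a scalar); the data below is synthetic:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor

def trust_region_optimizer(obj_func, initial_theta, bounds):
    # scikit-learn supplies obj_func (the negative log-marginal likelihood),
    # the initial hyperparameters and their (log-space) bounds.
    result = minimize(
        lambda theta: obj_func(theta, eval_gradient=False),
        initial_theta,
        method="trust-constr",
        bounds=bounds,
    )
    # Return the optimized hyperparameters and the achieved objective value.
    return result.x, result.fun

# Synthetic stand-in for the 2-D coordinates and the RSS observations.
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(30, 2))
y = np.sin(X[:, 0]) + np.cos(X[:, 1])

gp = GaussianProcessRegressor(optimizer=trust_region_optimizer, alpha=1.2,
                              n_restarts_optimizer=10)
gp.fit(X, y)
print(gp.kernel_)  # the kernel with the optimized hyperparameters
```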
The scipy optimizers return a results object (an OptimizeResult); using the minimization of the Rosenbrock test function as an example:
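A minimal sketch (the variable name result is mine):

```python
from scipy.optimize import minimize, rosen

result = minimize(rosen, x0=[1.3, 0.7, 0.8, 1.9, 1.2], method="trust-constr")
print(result)  # prints the OptimizeResult; its fields include 'x' and 'fun'
```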
As shown above, the optimized values can be accessed using:
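```python
result.x    # array of the optimized parameter values
```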
and the resulting value of the function to be minimized:
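```python
result.fun  # value of the minimized function at result.x
```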
which is what the 'fun' parameter represents. However, now that the optimizer is working internally, you will need to access the resulting function value from scikit-learn:
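Assuming the fitted estimator is called gp as above, the attributes below hold the optimized kernel and the corresponding objective value:

```python
print(gp.kernel_)                         # the kernel with the optimized hyperparameters (l and sigma)
print(gp.log_marginal_likelihood_value_)  # log-marginal likelihood at those hyperparameters
# The optimizer minimizes the *negative* log-marginal likelihood, so the 'fun'
# value returned by the custom optimizer equals -gp.log_marginal_likelihood_value_.
```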