Class GaussianProcessRegression<T>
- Type Parameters:
T - the data type of model input objects.
- All Implemented Interfaces:
Serializable, ToDoubleFunction<T>, Regression<T>
A Gaussian process can be used as a prior probability distribution over functions in Bayesian inference. Given any set of N points in the desired domain of your functions, take a multivariate Gaussian whose covariance matrix parameter is the Gram matrix of N points with some desired kernel, and sample from that Gaussian. Inference of continuous values with a Gaussian process prior is known as Gaussian process regression.
The fitting is performed in the reproducing kernel Hilbert space with the "kernel trick". The loss function is squared-error. This also arises as the kriging estimate of a Gaussian random field in spatial statistics.
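The fitting step described above can be sketched in plain Java. This is a minimal illustration, not this class's implementation: the class name `ExactGpSketch`, the one-dimensional inputs, and the Gaussian kernel with width `sigma` are all assumptions made for the example. It computes the weights w = (K + noise * I)^-1 y through a Cholesky factorization and predicts with a weighted sum of kernel evaluations.

```java
// Hypothetical sketch of exact GP regression on 1-D inputs; not the library's code.
final class ExactGpSketch {
    /** Gaussian (RBF) kernel; the kernel choice is an assumption of this sketch. */
    static double kernel(double a, double b, double sigma) {
        double d = a - b;
        return Math.exp(-d * d / (2 * sigma * sigma));
    }

    /** Cholesky factorization A = L L^T of a symmetric positive definite matrix. */
    static double[][] cholesky(double[][] a) {
        int n = a.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = a[i][j];
                for (int k = 0; k < j; k++) sum -= l[i][k] * l[j][k];
                if (i == j) {
                    if (sum <= 0) throw new IllegalArgumentException("Matrix is not positive definite");
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    /** Solves L L^T w = b by forward substitution followed by backward substitution. */
    static double[] solve(double[][] l, double[] b) {
        int n = b.length;
        double[] z = new double[n];
        for (int i = 0; i < n; i++) {
            double sum = b[i];
            for (int k = 0; k < i; k++) sum -= l[i][k] * z[k];
            z[i] = sum / l[i][i];
        }
        double[] w = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double sum = z[i];
            for (int k = i + 1; k < n; k++) sum -= l[k][i] * w[k];
            w[i] = sum / l[i][i];
        }
        return w;
    }

    /** Computes the weights w = (K + noise * I)^-1 y, where K is the Gram matrix. */
    static double[] fit(double[] x, double[] y, double sigma, double noise) {
        int n = x.length;
        double[][] k = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) k[i][j] = kernel(x[i], x[j], sigma);
            k[i][i] += noise; // the noise variance also acts as a regularizer
        }
        return solve(cholesky(k), y);
    }

    /** Predictive mean at a query point: the sum of w_i * k(x_i, query). */
    static double predict(double[] x, double[] w, double sigma, double query) {
        double f = 0.0;
        for (int i = 0; i < x.length; i++) f += w[i] * kernel(x[i], query, sigma);
        return f;
    }
}
```

Both the n-by-n Gram matrix and its factorization appear explicitly here, which is where the cost of the exact method comes from.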
A significant problem with Gaussian process prediction is that it typically scales as O(n^3). For large problems (e.g. n > 10,000), both storing the Gram matrix and solving the associated linear systems are prohibitive on modern workstations. A wide range of proposals has been made to deal with this problem. A popular approach is the reduced-rank approximation of the Gram matrix, known as the Nystrom approximation. Subset of Regressors (SR) is another popular approach that uses an active set of training samples of size m selected from the training set of size n > m. We assume that it is impossible to search for the optimal subset of size m due to combinatorics. The samples in the active set could be selected randomly, but in general we might expect better performance if the samples are selected greedily w.r.t. some criterion. More recently, researchers have proposed relaxing the constraint that the inducing variables must be a subset of training/test cases, turning the discrete selection problem into one of continuous optimization.
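A minimal sketch of the Subset of Regressors idea follows, assuming one-dimensional inputs and a Gaussian kernel. The class `SubsetOfRegressorsSketch` and its methods are hypothetical names for illustration and do not reproduce this class's implementation. The weights solve the m-by-m system (K_mn K_nm + noise * K_mm) w = K_mn y, so only the m inducing points take part in prediction.

```java
// Hypothetical sketch of Subset of Regressors on 1-D inputs; not the library's code.
final class SubsetOfRegressorsSketch {
    /** Gaussian (RBF) kernel; the kernel choice is an assumption of this sketch. */
    static double kernel(double a, double b, double sigma) {
        double d = a - b;
        return Math.exp(-d * d / (2 * sigma * sigma));
    }

    /** Solves A w = b by Gaussian elimination with partial pivoting; A and b are overwritten. */
    static double[] solve(double[][] a, double[] b) {
        int m = b.length;
        for (int col = 0; col < m; col++) {
            int pivot = col;
            for (int r = col + 1; r < m; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmpRow = a[col]; a[col] = a[pivot]; a[pivot] = tmpRow;
            double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
            for (int r = col + 1; r < m; r++) {
                double factor = a[r][col] / a[col][col]; // assumes a nonsingular system
                b[r] -= factor * b[col];
                for (int c = col; c < m; c++) a[r][c] -= factor * a[col][c];
            }
        }
        double[] w = new double[m];
        for (int i = m - 1; i >= 0; i--) {
            double sum = b[i];
            for (int c = i + 1; c < m; c++) sum -= a[i][c] * w[c];
            w[i] = sum / a[i][i];
        }
        return w;
    }

    /** Weights of the m inducing points t, given n training samples (x, y). */
    static double[] fit(double[] x, double[] y, double[] t, double sigma, double noise) {
        int n = x.length, m = t.length;
        double[][] kmn = new double[m][n]; // K_mn[i][j] = k(t_i, x_j)
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) kmn[i][j] = kernel(t[i], x[j], sigma);
        }
        double[][] a = new double[m][m];
        double[] b = new double[m];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < m; j++) {
                double sum = noise * kernel(t[i], t[j], sigma); // noise * K_mm
                for (int l = 0; l < n; l++) sum += kmn[i][l] * kmn[j][l]; // K_mn K_nm
                a[i][j] = sum;
            }
            for (int l = 0; l < n; l++) b[i] += kmn[i][l] * y[l]; // K_mn y
        }
        return solve(a, b);
    }

    /** Predictive mean: the sum of w_j * k(t_j, query) over the inducing points. */
    static double predict(double[] t, double[] w, double sigma, double query) {
        double f = 0.0;
        for (int j = 0; j < t.length; j++) f += w[j] * kernel(t[j], query, sigma);
        return f;
    }
}
```

Forming the system costs O(n m^2) and solving it costs O(m^3), compared with O(n^3) for the exact model. When the inducing points equal the training points, the weights coincide with those of the exact model.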
Experimental evidence suggests that for large m the SR and Nystrom methods have similar performance, but for small m the Nystrom method can be quite poor. The Nystrom method can also produce embarrassing results, such as a negative approximated predictive variance. For these reasons we do not recommend the Nystrom method over the SR method.
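For reference, the two approximations can be written side by side in the notation of Rasmussen and Williams. This is a summary of the textbook formulas, not a statement about this class's internals. Here K_mm is the Gram matrix of the m inducing points, K_nm = K_mn^T is the cross-kernel matrix between training and inducing points, k_m(x_*) and k_n(x_*) are the kernel vectors of a query point against the inducing and training points, and sigma_n^2 is the noise variance.

```latex
% Subset of Regressors
\mu_{\mathrm{SR}}(x_*) = k_m(x_*)^\top \left(K_{mn} K_{nm} + \sigma_n^2 K_{mm}\right)^{-1} K_{mn}\, \mathbf{y}
\qquad
\mathbb{V}_{\mathrm{SR}}(x_*) = \sigma_n^2\, k_m(x_*)^\top \left(K_{mn} K_{nm} + \sigma_n^2 K_{mm}\right)^{-1} k_m(x_*)

% Nystrom: only the training Gram matrix is replaced by \tilde{K}
\tilde{K} = K_{nm} K_{mm}^{-1} K_{mn}
\qquad
\mu_{\mathrm{N}}(x_*) = k_n(x_*)^\top \left(\tilde{K} + \sigma_n^2 I\right)^{-1} \mathbf{y}
\qquad
\mathbb{V}_{\mathrm{N}}(x_*) = k(x_*, x_*) - k_n(x_*)^\top \left(\tilde{K} + \sigma_n^2 I\right)^{-1} k_n(x_*)
```

The SR variance is a quadratic form in a positive definite matrix and so cannot be negative. The Nystrom variance mixes the exact kernel values k(x_*, x_*) and k_n(x_*) with the approximate matrix, so the subtraction is not guaranteed to stay non-negative.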
References
- Carl Edward Rasmussen and Chris Williams. Gaussian Processes for Machine Learning, 2006.
- Joaquin Quinonero-Candela, Carl Edward Rasmussen, Christopher K. I. Williams. Approximation Methods for Gaussian Process Regression. 2007.
- T. Poggio and F. Girosi. Networks for approximation and learning. Proc. IEEE 78(9):1484-1487, 1990.
- Kai Zhang and James T. Kwok. Clustered Nystrom Method for Large Scale Manifold Learning and Dimension Reduction. IEEE Transactions on Neural Networks, 2010.
-
Nested Class Summary
- class: The joint prediction of multiple data points.
Nested classes/interfaces inherited from interface smile.regression.Regression
Regression.Trainer<T, M extends Regression<T>>
-
Field Summary
- final MercerKernel<T> kernel: The covariance/kernel function.
- final double L: The log marginal likelihood, which may not be available (NaN) when the model is fit with approximate methods.
- final double mean: The mean of the response variable.
- final double noise: The variance of noise.
- final T[] regressors: The regressors.
- final double sd: The standard deviation of the response variable.
- final double[] w: The linear weights.
-
Constructor Summary
- GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, double[] weight, double noise): Constructor.
- GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, double[] weight, double noise, double mean, double sd): Constructor.
- GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, double[] weight, double noise, double mean, double sd, Matrix.Cholesky cholesky, double L): Constructor.
-
Method Summary
- static GaussianProcessRegression<double[]> fit(double[][] x, double[] y, Properties params): Fits a regular Gaussian process model.
- static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, MercerKernel<T> kernel, double noise): Fits a regular Gaussian process model.
- static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, MercerKernel<T> kernel, double noise, boolean normalize, double tol, int maxIter): Fits a regular Gaussian process model.
- static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, MercerKernel<T> kernel, Properties params): Fits a regular Gaussian process model.
- static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise): Fits an approximate Gaussian process model by the method of subset of regressors.
- static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise, boolean normalize): Fits an approximate Gaussian process model by the method of subset of regressors.
- static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, Properties params): Fits an approximate Gaussian process model by the method of subset of regressors.
- static <T> GaussianProcessRegression<T> nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise): Fits an approximate Gaussian process model with Nystrom approximation of kernel matrix.
- static <T> GaussianProcessRegression<T> nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise, boolean normalize): Fits an approximate Gaussian process model with Nystrom approximation of kernel matrix.
- static <T> GaussianProcessRegression<T> nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, Properties params): Fits an approximate Gaussian process model with Nystrom approximation of kernel matrix.
- double predict(x): Predicts the dependent variable of an instance.
- double predict(x, estimation): Predicts the mean and standard deviation of an instance.
- query(samples): Evaluates the Gaussian Process at some query points.
- toString()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface smile.regression.Regression
applyAsDouble, online, predict, predict, predict, update, update, update
-
Field Details
-
kernel
The covariance/kernel function.
-
regressors
The regressors.
-
w
public final double[] w
The linear weights.
-
mean
public final double mean
The mean of the response variable.
-
sd
public final double sd
The standard deviation of the response variable.
-
noise
public final double noise
The variance of noise.
-
L
public final double L
The log marginal likelihood, which may not be available (NaN) when the model is fit with approximate methods.
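For orientation, the textbook log marginal likelihood of an exact Gaussian process regression model (Rasmussen and Williams, 2006) is shown below. Whether this field is computed on the normalized response or includes the constant term is an assumption not confirmed by this page. K is the Gram matrix of the n training samples, sigma_n^2 is the noise variance, and C denotes the Cholesky factor of K + sigma_n^2 I (written C here to avoid confusion with the field L).

```latex
\log p(\mathbf{y} \mid X)
  = -\tfrac{1}{2}\, \mathbf{y}^\top \left(K + \sigma_n^2 I\right)^{-1} \mathbf{y}
    - \tfrac{1}{2} \log \left| K + \sigma_n^2 I \right|
    - \tfrac{n}{2} \log 2\pi,
\qquad
\log \left| K + \sigma_n^2 I \right| = 2 \sum_{i=1}^{n} \log C_{ii}
```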
-
-
Constructor Details
-
GaussianProcessRegression
public GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, double[] weight, double noise)
Constructor.
- Parameters:
kernel - Kernel function.
regressors - The regressors.
weight - The weights of regressors.
noise - The variance of noise.
-
GaussianProcessRegression
public GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, double[] weight, double noise, double mean, double sd)
Constructor.
- Parameters:
kernel - Kernel function.
regressors - The regressors.
weight - The weights of regressors.
noise - The variance of noise.
mean - The mean of the response variable.
sd - The standard deviation of the response variable.
-
GaussianProcessRegression
public GaussianProcessRegression(MercerKernel<T> kernel, T[] regressors, double[] weight, double noise, double mean, double sd, Matrix.Cholesky cholesky, double L)
Constructor.
- Parameters:
kernel - Kernel function.
regressors - The regressors.
weight - The weights of regressors.
noise - The variance of noise.
mean - The mean of the response variable.
sd - The standard deviation of the response variable.
cholesky - The Cholesky decomposition of kernel matrix.
L - The log marginal likelihood.
-
-
Method Details
-
predict
Description copied from interface: Regression
Predicts the dependent variable of an instance.
- Specified by:
predict in interface Regression<T>
- Parameters:
x - an instance.
- Returns:
the predicted value of dependent variable.
-
predict
Predicts the mean and standard deviation of an instance.
- Parameters:
x - an instance.
estimation - an output array of the estimated mean and standard deviation.
- Returns:
the estimated mean value.
-
query
Evaluates the Gaussian Process at some query points.
- Parameters:
samples - query points.
- Returns:
The mean, standard deviation and covariances of GP at query points.
-
toString
-
fit
Fits a regular Gaussian process model.
- Parameters:
x - the training dataset.
y - the response variable.
params - the hyper-parameters.
- Returns:
the model.
-
fit
public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, MercerKernel<T> kernel, Properties params)
Fits a regular Gaussian process model.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
kernel - the Mercer kernel.
params - the hyper-parameters.
- Returns:
the model.
-
fit
public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, MercerKernel<T> kernel, double noise)
Fits a regular Gaussian process model.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
kernel - the Mercer kernel.
noise - the noise variance, which also works as a regularization parameter.
- Returns:
the model.
-
fit
public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, MercerKernel<T> kernel, double noise, boolean normalize, double tol, int maxIter)
Fits a regular Gaussian process model.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
kernel - the Mercer kernel.
noise - the noise variance, which also works as a regularization parameter.
normalize - whether to normalize the response variable.
tol - the stopping tolerance for hyperparameter optimization (HPO).
maxIter - the maximum number of iterations for HPO. No HPO if maxIter <= 0.
- Returns:
the model.
-
fit
public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, Properties params)
Fits an approximate Gaussian process model by the method of subset of regressors.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
t - the inducing inputs, which are pre-selected samples acting as the active set of regressors. In the simplest case, these can be chosen randomly from the training set or as the centers of k-means clustering.
kernel - the Mercer kernel.
params - the hyper-parameters.
- Returns:
the model.
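The simplest way to obtain the inducing inputs t mentioned above is to draw m training samples at random without replacement. The helper below is a hypothetical sketch, not part of this class: the name `InducingPointSketch.randomSubset`, the `double[][]` input type, and the explicit seed are assumptions made to keep the example self-contained.

```java
import java.util.Random;

// Hypothetical helper; not part of the library.
final class InducingPointSketch {
    /** Returns m distinct rows of x chosen uniformly at random (partial Fisher-Yates shuffle). */
    static double[][] randomSubset(double[][] x, int m, long seed) {
        int n = x.length;
        if (m < 0 || m > n) {
            throw new IllegalArgumentException("m must be between 0 and the number of samples");
        }
        int[] index = new int[n];
        for (int i = 0; i < n; i++) index[i] = i;

        Random rng = new Random(seed);
        double[][] t = new double[m][];
        for (int i = 0; i < m; i++) {
            // Pick a not-yet-chosen position in [i, n) and swap it into slot i.
            int j = i + rng.nextInt(n - i);
            int swap = index[i];
            index[i] = index[j];
            index[j] = swap;
            t[i] = x[index[i]]; // rows are shared with x, not copied
        }
        return t;
    }
}
```

The returned array could then be passed as the `t` argument of the approximate `fit` or `nystrom` methods. Centers of k-means clustering are the other option named in the parameter description.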
-
fit
public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise)
Fits an approximate Gaussian process model by the method of subset of regressors.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
t - the inducing inputs, which are pre-selected samples acting as the active set of regressors. In the simplest case, these can be chosen randomly from the training set or as the centers of k-means clustering.
kernel - the Mercer kernel.
noise - the noise variance, which also works as a regularization parameter.
- Returns:
the model.
-
fit
public static <T> GaussianProcessRegression<T> fit(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise, boolean normalize)
Fits an approximate Gaussian process model by the method of subset of regressors.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
t - the inducing inputs, which are pre-selected samples acting as the active set of regressors. In the simplest case, these can be chosen randomly from the training set or as the centers of k-means clustering.
kernel - the Mercer kernel.
noise - the noise variance, which also works as a regularization parameter.
normalize - the option to normalize the response variable.
- Returns:
the model.
-
nystrom
public static <T> GaussianProcessRegression<T> nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, Properties params)
Fits an approximate Gaussian process model with Nystrom approximation of kernel matrix.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
t - the inducing inputs, which are pre-selected for the Nystrom approximation.
kernel - the Mercer kernel.
params - the hyper-parameters.
- Returns:
the model.
-
nystrom
public static <T> GaussianProcessRegression<T> nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise)
Fits an approximate Gaussian process model with Nystrom approximation of kernel matrix.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
t - the inducing inputs, which are pre-selected for the Nystrom approximation.
kernel - the Mercer kernel.
noise - the noise variance, which also works as a regularization parameter.
- Returns:
the model.
-
nystrom
public static <T> GaussianProcessRegression<T> nystrom(T[] x, double[] y, T[] t, MercerKernel<T> kernel, double noise, boolean normalize)
Fits an approximate Gaussian process model with Nystrom approximation of kernel matrix.
- Type Parameters:
T - the data type of samples.
- Parameters:
x - the training dataset.
y - the response variable.
t - the inducing inputs, which are pre-selected for the Nystrom approximation.
kernel - the Mercer kernel.
noise - the noise variance, which also works as a regularization parameter.
normalize - the option to normalize the response variable.
- Returns:
the model.
-