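I am trying to create a composite covariance function to model my data. Specifically, I want a kernel that is a weighted combination of @covSEard and @covRQard. For example, I want to give @covSEard a weight of 30% and @covRQard a weight of 70%, i.e. 0.3*@covSEard + 0.7*@covRQard.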
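Written out as a formula, the kernel I am after would look roughly like this. The symbols k_SE, k_RQ and w are my own shorthand (not GPML names) for the @covSEard kernel, the @covRQard kernel and the fixed weight:

```latex
k(x, z) \;=\; w \, k_{\mathrm{SE}}(x, z) \;+\; (1 - w) \, k_{\mathrm{RQ}}(x, z),
\qquad w = 0.3 \ \text{(held fixed, not learned)}
```

The point is that w should stay at its chosen value while the remaining hyperparameters of both kernels are optimized.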
I tried the following options.
Option 1 - Using the @covSum and @covProd functions
input_dimensions = 4; % number of input dimensions
seed = 1234; % seed for reproducibility
rng(seed);
X = rand(100, input_dimensions); % sample input data generation
y = rand(100, 1); % sample output data generation
addpath(genpath(PATH_TO_GPML_TOOLBOX)); % adding gpml toolbox to the path
meanFunc = {@meanZero}; % using a zero mean
meanHyp = {[]}; % zero mean has no hyper parameters
kernel_weight = 0.3; % weight of covSEard kernel, 0.7 will be the weight for covRQard kernel
% defining the covariance function as a weighted sum of covSEard & covRQard
covFunc = {'covSum', {{'covProd', {'covConst','covSEard'}}, {'covProd', {'covConst','covRQard'}}}};
% variables to define the hyperparameters for the above kernel
sf=2; L=rand(input_dimensions,1); al=2;
% Covariance function for a constant function. The covariance function is parameterized as:
% k(x,z) = sf^2
%
% The scalar hyperparameter is:
% hyp = [ log(sf) ]
covHyp = {log([sqrt(kernel_weight); L; sf; sqrt(1-kernel_weight); L; sf; al])};
likFunc = {@likGauss}; % Using a gaussian likelihood
likHyp = {-1}; % Likelihood hyper parameter initialization
infMethod= @infGaussLik; % Exact inference for a Gaussian likelihood
iters = -300; % Negative length: minimize performs at most 300 function evaluations
% Defining the hyper parameter struct
hyp.lik = cell2mat(likHyp(1));
hyp.cov = cell2mat(covHyp(1));
hyp.mean = cell2mat(meanHyp(1));
% Defining mean, covariance and likelihood functions
mF = meanFunc{1,1}; cF=covFunc; lF=likFunc{1,1};
hyp2vfe = minimize(hyp, @gp, iters, infMethod, mF, cF, lF, X, y); % Optimization of hyperparameters / Training
[nll, ~] = gp(hyp2vfe, infMethod, mF, cF, lF, X, y); % Negative Log Likelihood calculation
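Here I tried to use the @covConst kernel, which has only a signal variance hyperparameter, and I forced that hyperparameter to equal the weight (in this case 0.3 for @covSEard and 0.7 for @covRQard).
However, when I optimize the hyperparameters of the kernel above, even the weights (which are in fact the hyperparameters of the @covConst kernels) get modified.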
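As far as I understand it, the kernel built in Option 1 can be written as follows. The symbols c_1, c_2, \sigma_SE and \Lambda are my own notation for the two @covConst amplitudes, the @covSEard signal standard deviation and the diagonal matrix of ARD length scales:

```latex
k(x, z) \;=\; c_1^2 \, k_{\mathrm{SE}}(x, z) \;+\; c_2^2 \, k_{\mathrm{RQ}}(x, z),
\qquad c_1^2 = 0.3, \quad c_2^2 = 0.7 \ \text{at initialization}

c_1^2 \, k_{\mathrm{SE}}(x, z)
  \;=\; (c_1 \sigma_{\mathrm{SE}})^2
        \exp\!\left( -\tfrac{1}{2} (x - z)^{\top} \Lambda^{-2} (x - z) \right)
```

Since log(c_1) and log(c_2) are ordinary entries of hyp.cov, minimize updates them like any other hyperparameter. The second line also suggests that only the product c_1 * sigma_SE enters the kernel, so the weight and the signal variance of @covSEard cannot be told apart from the data alone.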
Option 2 - Using the @covSum function and repeating each kernel n times according to its weight
For example, to give @covSEard and @covRQard weights in the ratio 1:2, I would do the following.
Replace the following lines in the code above:
covFunc = {'covSum', {{'covProd', {'covConst','covSEard'}}, {'covProd', {'covConst','covRQard'}}}};
with
covFunc = {'covSum', {'covSEard', 'covRQard', 'covRQard'}};
and
covHyp = {log([sqrt(kernel_weight); L; sf; sqrt(1-kernel_weight); L; sf; al])};
with
covHyp = {log([L; sf; L; sf; al; L; sf; al])};
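However, in this case the number of hyperparameters grows linearly with the number of repeated kernels, and I am not sure whether this is the right way to do it.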
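To make my doubt concrete, this is how I read the kernel that Option 2 produces. The symbols \theta_1, \theta_2 and \theta_3 are my own notation for the separate hyperparameter blocks of the three summands:

```latex
k(x, z) \;=\; k_{\mathrm{SE}}(x, z; \theta_1)
        \;+\; k_{\mathrm{RQ}}(x, z; \theta_2)
        \;+\; k_{\mathrm{RQ}}(x, z; \theta_3)
```

This equals k_SE + 2 k_RQ only as long as \theta_2 = \theta_3, which holds at initialization but is not enforced during optimization, because each copy of @covRQard has its own hyperparameters.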
I would like to know the correct way to create this kind of covariance function in GPML. Any suggestions would be appreciated.