[Regression Prediction: ELM] Wind Power Regression Prediction Based on a Genetic Algorithm Optimized Extreme Learning Machine (with MATLAB Code)

1 Introduction

Wind power forecasting provides an important basis for power grid planning, and research on forecasting methods matters for allowing the grid to absorb more wind power while operating safely and stably. To address the problem that the predictions of an extreme learning machine (ELM) regression model are sensitive to its input parameters, a genetic optimization algorithm is applied to the ELM, yielding a wind power forecasting method based on a genetically optimized extreme learning machine (GA-ELM). The method first preprocesses numerical weather prediction (NWP) data and constructs the training sample set; it then builds an ELM model and uses the genetic algorithm to optimize the ELM's input weights and hidden-layer biases, thereby establishing a wind power forecasting model based on NWP and GA-ELM. Experiments on NWP data from three wind farms of different installed capacities in East China show that the method achieves high prediction accuracy with good stability, providing a scientific and effective reference for wind farm power forecasting and for the safety and reliability of wind power grid integration.
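
To make the optimization loop concrete, below is a minimal, self-contained MATLAB sketch of the GA-ELM idea on synthetic data. It is an illustration only, not the blog's actual pipeline: the synthetic data, the GA settings (population size, crossover and mutation rates), and the helper functions act and decode are all assumptions.

% --- Minimal GA-ELM sketch (illustrative only) ---
% The GA evolves the ELM input weights W and hidden biases b; the ELM
% output weights beta follow in closed form; fitness = validation RMSE.
rng(1);
N = 200; d = 4; L = 20;                            % samples, inputs, hidden nodes
X = rand(N,d); y = sum(sin(2*pi*X),2) + 0.05*randn(N,1);  % synthetic data
Xtr = X(1:150,:);  ytr = y(1:150);                 % training split
Xv  = X(151:end,:); yv = y(151:end);               % validation split

nVar = L*d + L;                                    % genes: all of W plus b
pop  = 2*rand(30,nVar) - 1;                        % real-coded population in [-1,1]
for gen = 1:50
    fit = zeros(size(pop,1),1);                    % validation RMSE per individual
    for i = 1:size(pop,1)
        [W,b]  = decode(pop(i,:),d,L);
        beta   = pinv(act(Xtr,W,b))*ytr;           % closed-form ELM output weights
        fit(i) = sqrt(mean((act(Xv,W,b)*beta - yv).^2));
    end
    [~,idx] = sort(fit); pop = pop(idx,:);         % sort best-first (elitism)
    old = pop;                                     % parent pool for this generation
    for i = 2:size(pop,1)                          % keep #1, breed the rest
        p = old(randi(15),:); q = old(randi(15),:);% parents from the best half
        a = rand;
        c = a*p + (1-a)*q;                         % arithmetic crossover
        m = rand(1,nVar) < 0.05;                   % mutation mask
        c(m) = c(m) + 0.1*randn(1,sum(m));         % Gaussian mutation
        pop(i,:) = c;
    end
end
[W,b] = decode(pop(1,:),d,L);                      % best chromosome found
beta  = pinv(act(Xtr,W,b))*ytr;
rmse  = sqrt(mean((act(Xv,W,b)*beta - yv).^2))     % report validation RMSE

function H = act(X,W,b)                            % sigmoid hidden-layer output
H = 1./(1 + exp(-(X*W + b)));                      % b expands along rows (R2016b+)
end
function [W,b] = decode(chrom,d,L)                 % chromosome -> W (d x L), b (1 x L)
W = reshape(chrom(1:d*L),d,L); b = chrom(d*L+1:end);
end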

The genetic algorithm (GA) is an efficient global optimization search method grounded in natural selection and genetics: it combines the survival-of-the-fittest rule of biological evolution with a mechanism for randomly exchanging chromosome information within a population. The algorithm encodes the problem parameters as chromosomes and then iteratively applies selection, crossover, and mutation to exchange information among the chromosomes in the population, so that the population evolves generation by generation toward ever better regions of the search space until it reaches the optimum. In a GA, a chromosome corresponds to a piece of data or an array, usually represented as a structured string; each position on the string holds the value of a gene. A string of genes constitutes a chromosome, also called a genotype individual; a collection of individuals forms a population, whose size is the population size; and the degree to which each individual is adapted to its environment is its fitness.
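
As a concrete illustration of these operators, the sketch below runs a single generation of roulette-wheel selection, single-point crossover, and bit-flip mutation on a binary population; the toy fitness function and all parameter values are assumptions for demonstration only.

% One GA generation on binary chromosomes (illustrative sketch).
rng(0);
Nind = 8; Lind = 10;                      % population size, chromosome length
Chrom = rand(Nind,Lind) > 0.5;            % random binary population
fitness = sum(Chrom,2);                   % toy fitness: number of ones

% roulette-wheel selection: pick parents with probability ~ fitness
p = cumsum(fitness)/sum(fitness);
parents = zeros(Nind,Lind);
for i = 1:Nind
    parents(i,:) = Chrom(find(rand <= p, 1), :);
end

% single-point crossover between consecutive parent pairs
for i = 1:2:Nind-1
    cut = randi(Lind-1);                  % crossover point
    tmp                    = parents(i,  cut+1:end);
    parents(i,  cut+1:end) = parents(i+1,cut+1:end);
    parents(i+1,cut+1:end) = tmp;
end

% bit-flip mutation with a small per-gene probability
mask  = rand(Nind,Lind) < 0.01;
Chrom = xor(parents, mask);               % next generation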

2 Simulation Code

% BS2RV.m - Binary string to real vector
%
% This function decodes binary chromosomes into vectors of reals. The
% chromosomes are seen as the concatenation of binary strings of given
% length, and decoded into real numbers in a specified interval using
% either standard binary or Gray decoding.
%
% Syntax:       Phen = bs2rv(Chrom,FieldD)
%
% Input parameters:
%
%               Chrom    - Matrix containing the chromosomes of the current
%                          population. Each line corresponds to one
%                          individual's concatenated binary string
%                          representation. Leftmost bits are MSb and
%                          rightmost are LSb.
%
%               FieldD   - Matrix describing the length and how to decode
%                          each substring in the chromosome. It has the
%                          following structure:
%
%                          [len;    (num)
%                           lb;     (num)
%                           ub;     (num)
%                           code;   (0=binary     | 1=gray)
%                           scale;  (0=arithmetic | 1=logarithmic)
%                           lbin;   (0=excluded   | 1=included)
%                           ubin];  (0=excluded   | 1=included)
%
%               where
%                len   - row vector containing the length of
%                    each substring in Chrom. sum(len)
%                    should equal the individual length.
%                lb,
%                ub    - Lower and upper bounds for each
%                    variable. 
%                code  - binary row vector indicating how each
%                    substring is to be decoded.
%                scale - binary row vector indicating where to
%                    use arithmetic and/or logarithmic
%                    scaling.
%                lbin,
%                ubin  - binary row vectors indicating whether
%                    or not to include each bound in the
%                    representation range
%
% Output parameter:
%
%               Phen     - Real matrix containing the population phenotypes.

%
% Author: Carlos Fonseca,     Updated: Andrew Chipperfield
% Date: 08/06/93,        Date: 26-Jan-94

function Phen = bs2rv(Chrom,FieldD)

% Identify the population size (Nind)
%      and the chromosome length (Lind)
[Nind,Lind] = size(Chrom);

% Identify the number of decision variables (Nvar)
[seven,Nvar] = size(FieldD);

if seven ~= 7
    error('FieldD must have 7 rows.');
end

% Get substring properties
len = FieldD(1,:);
lb = FieldD(2,:);
ub = FieldD(3,:);
code = ~(~FieldD(4,:));
scale = ~(~FieldD(5,:));
lin = ~(~FieldD(6,:));
uin = ~(~FieldD(7,:));

% Check substring properties for consistency
if sum(len) ~= Lind,
    error('Data in FieldD must agree with chromosome length');
end

if ~all(lb(scale).*ub(scale)>0)
    error('Log-scaled variables must not include 0 in their range');
end

% Decode chromosomes
Phen = zeros(Nind,Nvar);

lf = cumsum(len);
li = cumsum([1 len]);
Prec = .5 .^ len;

logsgn = sign(lb(scale));
lb(scale) = log( abs(lb(scale)) );
ub(scale) = log( abs(ub(scale)) );
delta = ub - lb;

num = (~lin) .* Prec;
den = (lin + uin - 1) .* Prec;

for i = 1:Nvar,
    idx = li(i):lf(i);
    if code(i) % Gray decoding
        Chrom(:,idx)=rem(cumsum(Chrom(:,idx)')',2);
    end
    Phen(:,i) = Chrom(:,idx) * [ (.5).^(1:len(i))' ];
    Phen(:,i) = lb(i) + delta(i) * (Phen(:,i) + num(i)) ./ (1 - den(i));
end

expand = ones(Nind,1);
if any(scale)
    Phen(:,scale) = logsgn(expand,:) .* exp(Phen(:,scale));
end
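
A hypothetical usage example, consistent with the FieldD description above: decode a random population of four chromosomes into two real variables, each coded with 8 bits on [-5, 5] using standard binary code, arithmetic scaling, and both bounds included.

% Illustrative call of bs2rv (values are assumptions):
Chrom  = round(rand(4,16));             % 4 individuals, 2 x 8 bits each
FieldD = [ 8  8;                        % len:   bits per substring
          -5 -5;                        % lb:    lower bounds
           5  5;                        % ub:    upper bounds
           0  0;                        % code:  0 = standard binary
           0  0;                        % scale: 0 = arithmetic
           1  1;                        % lbin:  include lower bound
           1  1];                       % ubin:  include upper bound
Phen = bs2rv(Chrom, FieldD)             % 4 x 2 matrix of real phenotypes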

function [inputs,ordered,costs,sig2n,model] = bay_lssvmARD(model,type,btype,nb);
% Bayesian Automatic Relevance Determination of the inputs of an LS-SVM


% >> dimensions = bay_lssvmARD({X,Y,type,gam,sig2})
% >> dimensions = bay_lssvmARD(model)

% For a given problem, one can determine the most relevant inputs
% for the LS-SVM within the Bayesian evidence framework. To do so,
% one assigns a different weighting parameter to each dimension in
% the kernel and optimizes these on the third level of
% inference. Depending on the kernel used, one can then remove the
% inputs corresponding to the larger or smaller kernel
% parameters. This routine only works with the 'RBF_kernel' with a
% sig2 per input. In each step, the input with the largest optimal
% sig2 is removed (backward selection). For every step, the
% generalization performance is approximated by the cost associated
% with the third level of Bayesian inference.

% The ARD performs backward selection of the inputs, keeping in
% each step the sig2s that correspond to a minimal cost
% criterion. This criterion can be minimized in a 'continuous' or a
% 'discrete' fashion. The former re-optimizes a continuously
% varying kernel parameter in each step; the latter decides which
% input to remove in each step via binary variables for each
% component (this can only be applied to rather low-dimensional
% inputs, as the number of possible combinations grows
% exponentially with the number of inputs). When working with the
% 'RBF_kernel', the kernel parameter is rescaled appropriately
% after an input variable has been removed.

% The computation of the Bayesian cost criterion can be based on
% the singular value decomposition 'svd' of the full kernel matrix,
% or on an approximation of its eigenvalues and eigenvectors via
% 'eigs' or 'eign' based on 'nb' data points.

% Full syntax

%     1. Using the functional interface:

% >> [dimensions, ordered, costs, sig2s] =  bay_lssvmARD({X,Y,type,gam,sig2,kernel,preprocess})
% >> [dimensions, ordered, costs, sig2s] =  bay_lssvmARD({X,Y,type,gam,sig2,kernel,preprocess}, method)
% >> [dimensions, ordered, costs, sig2s] =  bay_lssvmARD({X,Y,type,gam,sig2,kernel,preprocess}, method, type)
% >> [dimensions, ordered, costs, sig2s] =  bay_lssvmARD({X,Y,type,gam,sig2,kernel,preprocess}, method, type, nb)

%       Outputs    
%         dimensions : r x 1 vector of the relevant inputs
%         ordered(*) : d x 1 vector with inputs in decreasing order of relevance
%         costs(*)   : Costs associated with third level of inference in every selection step
%         sig2s(*)   : Optimal kernel parameters in each selection step
%       Inputs    
%         X          : N x d matrix with the inputs of the training data
%         Y          : N x 1 vector with the outputs of the training data
%         type       : 'function estimation' ('f') or 'classifier' ('c')
%         gam        : Regularization parameter
%         sig2       : Kernel parameter (bandwidth in the case of the 'RBF_kernel')
%         kernel(*)  : Kernel type (by default 'RBF_kernel')
%         preprocess(*) : 'preprocess'(*) or 'original'
%         method(*)  : 'discrete'(*) or 'continuous'
%         type(*)    : 'svd'(*), 'eig', 'eigs', 'eign'
%         nb(*)      : Number of eigenvalues/eigenvectors used in the eigenvalue decomposition approximation

%     2. Using the object oriented interface:

% >> [dimensions, ordered, costs, sig2s, model] = bay_lssvmARD(model)
% >> [dimensions, ordered, costs, sig2s, model] = bay_lssvmARD(model, method)
% >> [dimensions, ordered, costs, sig2s, model] = bay_lssvmARD(model, method, type)
% >> [dimensions, ordered, costs, sig2s, model] = bay_lssvmARD(model, method, type, nb)

%       Outputs    
%         dimensions : r x 1 vector of the relevant inputs
%         ordered(*) : d x 1 vector with inputs in decreasing order of relevance
%         costs(*)   : Costs associated with third level of inference in every selection step
%         sig2s(*)   : Optimal kernel parameters in each selection step
%         model(*)   : Object oriented representation of the LS-SVM model trained only on the relevant inputs
%       Inputs    
%         model      : Object oriented representation of the LS-SVM model
%         method(*)  : 'discrete'(*) or 'continuous'
%         type(*)    : 'svd'(*), 'eig', 'eigs', 'eign'
%         nb(*)      : Number of eigenvalues/eigenvectors used in the eigenvalue decomposition approximation

% See also:
%   bay_lssvm, bay_optimize, bay_modoutClass, bay_errorbar


% Copyright (c) 2002,  KULeuven-ESAT-SCD, License & help @ http://www.esat.kuleuven.ac.be/sista/lssvmlab


warning off
eval('type;','type=''discrete'';');
eval('btype;','btype=''svd'';');

if ~(type(1)=='d' | type(1)=='c'),
  error('type needs to be ''continuous'' or ''discrete''...');
end

  
if ~(strcmpi(btype,'svd') | strcmpi(btype,'eig') | strcmpi(btype,'eigs') | strcmpi(btype,'eign')),
  error('Eigenvalue decomposition via ''svd'', ''eig'', ''eigs'' or ''eign''.');
end


% initiate model
%
if ~isstruct(model), 
  model = initlssvm(model{:}); 
end

'OPTIMIZING GAMMA AND KERNEL PARAMETERS WITH BAYESIAN FRAMEWORK OVER ALL INPUTS...'

%model = changelssvm(model, 'kernel_type', 'RBF_kernel');
eval('[model,kernel_pars,bay] = bay_optimize(model,3,btype,nb);',...
     '[model,kernel_pars,bay] = bay_optimize(model,3,btype);');

costs(1) = bay.costL3;

%
% init parameters
%
eval('nb;','nb=inf;');
xdim = model.x_dim;
all = 1:xdim;
reject = zeros(xdim,1);


%
% continuous
%
if type(1)=='c',

  if length(model.kernel_pars)~=model.x_dim,
    model = changelssvm(model,'kernel_pars',model.kernel_pars(1)*ones(1,model.x_dim));
  end

  sig2n = zeros(xdim-1,xdim);
  [Xtrain,Ytrain] = postlssvm(model,model.xtrain(model.selector,:),model.ytrain(model.selector,:));
  for d=1:xdim-1,
    ['testing for ' num2str(xdim-d+1) ' inputs']
    [modelf,sig2n(d,1:(xdim-d+1)),bay] = ...
    bay_optimize({Xtrain(:,all), Ytrain,model.type,model.gam, model.kernel_pars(:,all),model.kernel_type, model.preprocess},...
             3,btype,nb)
    costs(d+1,:) = bay.costL3;
    [m,reject(d)] = max(sig2n(d,:));
    all = setdiff(all,reject(d)); all=reshape(all,1,length(all));
    
    ['SELECTED INPUT(S) (''continuous''): [' num2str(all) ']']
  end
  reject(xdim) = all;
  
  
%
% discrete 
%
elseif type(1)=='d',   

  
  if length(model.kernel_pars)>1,
    error('only 1 kernel parameter supported for the moment, use ''fmin'' instead;');
  end
  [Xtrain,Ytrain] = postlssvm(model,model.xtrain(model.selector,:), ...
                  model.ytrain(model.selector,:));

  
  %
  % cost for all
  % 
  [c3,bay] = bay_lssvm({Xtrain, Ytrain,...
            model.type,model.gam, model.kernel_pars,...
            model.kernel_type, model.preprocess}, 3,btype,nb);    
  costs(1,:) = bay.costL3;
    
  
  %
  % iteration
  %
  for d=1:xdim-1,

    ['testing for ' num2str(xdim-d+1) ' inputs']
    
    % rescaling of kernel parameters
    if strcmp(model.kernel_type,'RBF_kernel'), 
      % RBF
      model.kernel_pars = (model.x_dim-d)./model.x_dim.*model.kernel_pars;
    else
      % else
      model = bay_optimize({Xtrain(:,all), Ytrain,...
            model.type,model.gam, model.kernel_pars,model.kernel_type, model.preprocess},3,btype,nb);      
    end

    % which input to remove?
    minc3 = inf;
    for a = 1:length(all),
      [c3,bayf] = bay_lssvm({Xtrain(:,all([1:a-1 a+1:end])), Ytrain,...
              model.type,model.gam, model.kernel_pars,...
              model.kernel_type, model.preprocess}, 3,btype,nb);
      if c3<minc3, 
    minc3=c3;reject(d)=all(a);bay=bayf; 
      end
    end
    
    % remove input d...
    all = setdiff(all,reject(d));all=reshape(all,1,length(all));
    costs(d+1) = bay.costL3;
    %save ARD_ff
    
    ['SELECTED INPUT(S) (''discrete''): [' num2str(all) ']']

  end
  reject(xdim) = all;
  
end


%
% select best reduction (costL2 lowest)
%
[mcL2,best] = min(costs);
ordered = reject(end:-1:1);
inputs = ordered(1:xdim-(best-1));
eval('mkp = model.kernel_pars(:,inputs);','mkp = model.kernel_pars;');
model = initlssvm(Xtrain(:,inputs),Ytrain,model.type,model.gam, mkp, model.kernel_type, model.preprocess);

warning on
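
Following the functional-interface syntax documented above, a typical call might look as follows; the data X, Y and the hyperparameters gam and sig2 are placeholders, not values from the blog.

% Illustrative call of bay_lssvmARD (placeholder data and hyperparameters):
X = randn(100,5); Y = sum(X(:,1:2),2) + 0.1*randn(100,1);
gam = 10; sig2 = 0.5;
[dimensions, ordered, costs, sig2s] = ...
    bay_lssvmARD({X, Y, 'function estimation', gam, sig2, 'RBF_kernel'}, 'discrete', 'svd');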

function model = changelssvm(model,option, value)
% Change a field of the object oriented representation of the LS-SVM
%
%
% The different options of the fields are given in following table:

%     1. General options representing the kind of model:

%           type: 'classifier' ,'function estimation' 
% implementation: 'CMEX' ,'CFILE' ,'MATLAB' 
%         status: Status of this model ('trained'  or 'changed' )
%          alpha: Support values of the trained LS-SVM model
%              b: Bias term of the trained LS-SVM model
%       duration: Number of seconds the training lasts
%         latent: Returning latent variables ('no' ,'yes' ) 
%       x_delays: Number of delays of eXogeneous variables (by default 0 )
%       y_delays: Number of delays of responses (by default 0 )
%          steps: Number of steps to predict (by default 1 )
%            gam: Regularisation parameter
%    kernel_type: Kernel function
%    kernel_pars: Extra parameters of the kernel function

%
%     2. Fields used to specify the used training data:

%    x_dim: Dimension of input space
%    y_dim: Dimension of responses
%  nb_data: Number of training data
%   xtrain: (preprocessed) inputs of training data
%   ytrain: (preprocessed,coded) outputs of training data
% selector: Indexes of training data effectively used during training

%
%     3. Options used in the Conjugate Gradient (CG) algorithm:

%     cga_max_itr: Maximum number of iterations in CG
%         cga_eps: Stopcriterium of CG, largest allowed error
%    cga_fi_bound: Stopcriterium of CG, smallest allowed improvement
%        cga_show: Show the results of the CG algorithm (1 or 0)
% cga_startvalues: Starting values of the CG algorithm

%
%     4. Fields with the information for pre- and post-processing (only given if appropriate):

%  preprocess: 'preprocess'  or 'original' 
%     schemed: Status of the preprocessing 
%               ('coded' ,'original'  or 'schemed' )
% pre_xscheme: Scheme used for preprocessing the input data
% pre_yscheme: Scheme used for preprocessing the output data
%   pre_xmean: Mean of the input data
%    pre_xstd: Standard deviation of the input data
%   pre_ymean: Mean of the responses
%    pre_ystd: Standard deviation of the reponses

%
%     5. The specifications of the used encoding (only given if appropriate):

%          code: Status of the coding 
%                 ('original' ,'changed'  or 'encoded')
%      codetype: Used function for constructing the encoding 
%                  for multiclass classification (by default 'none')
% codetype_args: Arguments of the codetype function
%  codedist_fct: Function used to calculate to which class a
%                coded result belongs
% codedist_args: Arguments of the codedist function
%     codebook2: Codebook of the new coding
%     codebook1: Codebook of the original coding

% Full syntax

% >> model = changelssvm(model, field, value)

%       Outputs    
%         model(*) : Obtained object oriented representation of the LS-SVM model
%       Inputs    
%         model    : Original object oriented representation of the LS-SVM model
%         field    : Field of the model one wants to change (e.g. 'preprocess')
%         value    : New value of the field of the model one wants to change

% See also:
%   trainlssvm, initlssvm, simlssvm, plotlssvm.

% Copyright (c) 2010,  KULeuven-ESAT-SCD, License & help @ http://www.esat.kuleuven.ac.be/sista/lssvmlab


%
% alias sigma^2
%
if (strcmpi(option,'sig2')) option = 'kernel_pars'; end

%
% selector -> nb_data
% nb_data -> selector
%
if strcmp(option,'selector'),
  model.nb_data = length(value);
end
if strcmp(option,'nb_data'),
  model.selector = 1:value;
end

%
% xtrain
%
if strcmp(option,'xtrain'), 
  [nb,model.x_dim] = size(value);
  model.nb_data = nb;%min(nb,model.nb_data);
  model.selector = 1:model.nb_data;
  if length(model.gam)>model.y_dim & length(model.gam)~=size(value,1),
    warning('Discarding different gamma''s...');
    model.gam = max(model.gam); 
  end
    eval('value=prelssvm(model,value);',...
       'warning(''new training input data does not conform with the used preprocessing'');');
end


%
% ytrain
%
if strcmp(option,'ytrain'),
  if size(value,2)~=size(model.ytrain,2),
    model.y_dim = size(value,2);
  end
  eval('value = codelssvm(model,[],value);',...
       'warning(''new training output data does not conform with the used encoding;'');');
  eval('[ff,value] = prelssvm(model,[],value);',...
       'warning(''new training output data does not conform with the used preprocessing;'');');
  [nb,model.y_dim] = size(value);
  model.nb_data = min(nb,model.nb_data);
  model.selector = 1:model.nb_data;
end

%
% switch between preprocessing - original data
% model.prestatus = {'changed','ok'}
%
if (strcmpi(option,'preprocess')) & model.preprocess(1)~=value(1),
  model.prestatus = 'changed'; 
end    
      
      


%
% change coding
%
if strcmpi(option,'codetype') | strcmpi(option,'codebook2') | ...
      strcmpi(option, 'codeargs') | strcmpi(option, 'codedistfct'),
  model.code = 'changed';
elseif  strcmpi(option,'codebook1'),
  warning('change original format of the classifier; the toolbox will be unable to return results in the original format');
end


%
% final change
%
eval(['old_value = model.' lower(option) ';'],'old_value=[];');
eval(['model.' lower(option) '=value;']);

if (isempty(value) | isempty(old_value)),
  different = 1;
else
  eval('different = any(old_value~=value);','different=1;');
end

if different & ~strcmpi(option,'implementation'),
  model.status = 'changed';
end
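
A hypothetical usage example based on the syntax above; the data and hyperparameter values are placeholders.

% Illustrative use of changelssvm: change the regularisation parameter of an
% existing LS-SVM model object and retrain.
X = randn(50,3); Y = sum(X,2) + 0.1*randn(50,1);
model = initlssvm(X, Y, 'function estimation', 10, 0.5, 'RBF_kernel');
model = changelssvm(model, 'gam', 100);    % model.status becomes 'changed'
model = trainlssvm(model);                 % retrain with the new gamma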


 

function [A,B,C,D,E] = bay_lssvm(model,level,type, nb, bay)
% Compute the posterior cost for the 3 levels in Bayesian inference

% >> cost = bay_lssvm({X,Y,type,gam,sig2}, level, type)
% >> cost = bay_lssvm(model              , level, type)

% Description
% Estimate the posterior probabilities of model (hyper-) parameters
% on the different inference levels:
%     - First level: In the first level one optimizes the support values alpha 's and the bias b.
%     - Second level: In the second level one optimizes the regularization parameter gam.
%     - Third level: In the third level one optimizes the kernel
%                    parameter. In the case of the common 'RBF_kernel' the kernel
%                    parameter is the bandwidth sig2. 
%
% By taking the negative logarithm of the posterior and neglecting all constants, one
% obtains the corresponding cost. Computation is only feasible for
% one dimensional output regression and binary classification
% problems. Each level has its own input and output syntax.

%
% Full syntax

%     1. Outputs on the first level
%
% >> [costL1,Ed,Ew,bay] = bay_lssvm({X,Y,type,gam,sig2,kernel,preprocess}, 1)
% >> [costL1,Ed,Ew,bay] = bay_lssvm(model, 1)

%       costL1 : Cost proportional to the posterior
%       Ed(*)  : Cost of the fitting error term
%       Ew(*)  : Cost of the regularization parameter
%       bay(*) : Object oriented representation of the results of the Bayesian inference

%     2. Outputs on the second level

% >> [costL2,DcostL2, optimal_cost, bay] = bay_lssvm({X,Y,type,gam,sig2,kernel,preprocess}, 2)
% >> [costL2,DcostL2, optimal_cost, bay] = bay_lssvm(model, 2)

%       costL2     : Cost proportional to the posterior on the second level
%       DcostL2(*) : Derivative of the cost
%       optimal_cost(*) : Optimality of the regularization parameter (optimal = 0)
%       bay(*)     : Object oriented representation of the results of the Bayesian inference

%     3. Outputs on the third level

% >> [costL3,bay] = bay_lssvm({X,Y,type,gam,sig2,kernel,preprocess}, 3)
% >> [costL3,bay] = bay_lssvm(model, 3)

%       costL3 : Cost proportional to the posterior on the third level
%       bay(*) : Object oriented representation of the results of the Bayesian inference

%     4. Inputs using the functional interface

% >> bay_lssvm({X,Y,type,gam,sig2,kernel,preprocess}, level)
% >> bay_lssvm({X,Y,type,gam,sig2,kernel,preprocess}, level, type)
% >> bay_lssvm({X,Y,type,gam,sig2,kernel,preprocess}, level, type, nb)

%         X            : N x d matrix with the inputs of the training data
%         Y            : N x 1 vector with the outputs of the training data
%         type         : 'function estimation' ('f') or 'classifier' ('c')
%         gam          : Regularization parameter
%         sig2         : Kernel parameter (bandwidth in the case of the 'RBF_kernel')
%         kernel(*)    : Kernel type (by default 'RBF_kernel')
%         preprocess(*) : 'preprocess'(*) or 'original'
%         level        : 1, 2, 3
%         type(*)      : 'svd'(*), 'eig', 'eigs', 'eign'
%         nb(*)        : Number of eigenvalues/eigenvectors used in the eigenvalue decomposition approximation

%     5. Inputs using the object oriented interface

% >> bay_lssvm(model, level, type, nb)

%         model    : Object oriented representation of the LS-SVM model
%         level    : 1, 2, 3
%         type(*)  : 'svd'(*), 'eig', 'eigs', 'eign'
%         nb(*)    : Number of eigenvalues/eigenvectors used in the eigenvalue decomposition approximation

%
% See also:
%   bay_lssvmARD, bay_optimize, bay_modoutClass, bay_errorbar


% Copyright (c) 2002,  KULeuven-ESAT-SCD, License & help @ http://www.esat.kuleuven.ac.be/sista/lssvmlab

%
% initialize and, if necessary, preprocess
%
if ~isstruct(model), model = initlssvm(model{:}); end
model = prelssvm(model);
if model.y_dim>1,
  error(['Bayesian framework restricted to 1 dimensional regression' ...
     ' and binary classification tasks']);
end


%
% train with the matlab routines
%model = adaptlssvm(model,'implementation','MATLAB');

eval('nb;','nb=ceil(sqrt(model.nb_data));');


if ~(level==1 | level==2 | level==3),
  error('level must be 1, 2 or 3.');
end

%
% delegate functions
%
if level==1,

  eval('type;','type=''train'';');
  %[cost, ED, EW, bay, model] = lssvm_bayL1(model, type);
  eval('[A,B,C,D,E] = lssvm_bayL1(model,type,nb,bay);','[A,B,C,D,E] = lssvm_bayL1(model,type,nb);');
  
elseif level==2,
  
  % default type
  eval('type;','type=''svd'';');
  %[costL2, DcostL2, optimal, bay, model] = lssvm_bayL2(model, type);
  
  eval('[A,B,C,D,E] = lssvm_bayL2(model,type,nb,bay);',...
       '[A,B,C,D,E] = lssvm_bayL2(model,type,nb);')
  

elseif level==3,

  % default type
  eval('type;','type=''svd'';');
  %[cost, bay, model] = lssvm_bayL3(model, bay);
  [A,B,C] = lssvm_bayL3(model,type,nb);

end


%
%
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  FIRST LEVEL                   %
%                                %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
function [cost, Ed, Ew, bay, model] = lssvm_bayL1(model, type, nb, bay)
%
% [Ed, Ew, cost,model] = lssvm_bayL1(model)
% [bay,model] = lssvm_bayL1(model)
%
% type = 'retrain', 'train', 'svd'
%
%

if ~(strcmpi(type,'train') | strcmpi(type,'retrain') | strcmpi(type,'eig') | strcmpi(type,'eigs')| strcmpi(type,'svd')| strcmpi(type,'eign')),
  error('type should be ''train'', ''retrain'', ''svd'', ''eig'', ''eigs'' or ''eign''.');
end
%type(1)=='t'
%type(1)=='n'

N = model.nb_data;


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% compute Ed, Ew and costL1 based on the training solution %
% TvG, Financial Timeseries Prediction using LS-SVM, 27-28 %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

if (type(1)=='t'), % train 
  % find solution of ls-svm
  model = trainlssvm(model);
  % prior %
  if model.type(1) == 'f',
    Ew = .5*sum(model.alpha.*  (model.ytrain(1:model.nb_data,:) - model.alpha./model.gam - model.b));
  elseif model.type(1) == 'c',
    Ew = .5*sum(model.alpha.*model.ytrain(1:model.nb_data,:).*  ...
        ((1-model.alpha./model.gam)./model.ytrain(1:model.nb_data,:) - model.b));
  end

  % likelihood
  Ed = .5.*sum((model.alpha./model.gam).^2);  

  % posterior
  cost = Ew+model.gam*Ed;

  

  
  
  
    
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% compute Ed, Ew and costL1 based on SVD or Nystrom %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
else
 
  if nargin<4,
    [bay.eigvals, bay.scores, ~, omega_r] = kpca(model.xtrain(model.selector,1:model.x_dim), ...
                                                  model.kernel_type, model.kernel_pars, [],type,nb,'original');
    
    bay.eigvals = bay.eigvals.*(N-1);
    bay.tol = 1000*eps;
    bay.Peff = find(bay.eigvals>bay.tol);
    bay.Neff = length(bay.Peff);
    bay.eigvals = bay.eigvals(bay.Peff);
    bay.scores = bay.scores(:,bay.Peff);  
    %Zc = eye(N)-ones(model.nb_data)/model.nb_data; 
    
    
    %disp('rescaling the scores');
    for i=1:bay.Neff,
      bay.Rscores(:,i) = bay.scores(:,i)./sqrt(bay.scores(:,i)'*bay.eigvals(i)*bay.scores(:,i));
    end  
  end
  Y = model.ytrain(model.selector,1:model.y_dim);  
  
  %%% Ew %%%%
  % (TvG: 4.75 - 5.73)) 
  YTM = (Y'-mean(Y))*bay.scores;
  Ew = .5*(YTM*diag(bay.eigvals)*diag((bay.eigvals+1./model.gam).^-2))*YTM';
 

  
  %%% cost %%%
  YTM = (Y'-mean(Y));
  %if model.type(1) == 'c', % 'classification'  (TvG: 5.74)
  %  cost = .5*YTM*[diag(bay.eigvals); zeros(model.nb_data-bay.Neff,bay.Neff)]*diag((bay.eigvals+1./model.gam).^-1)*bay.scores'*YTM';
  %elseif model.type(1) == 'f', % 'function estimation' % (TvG: 4.76)  
                   % + correction for zero eigenvalues
    cost = .5*(YTM*model.gam*YTM')-.5*YTM*bay.scores*diag((1+1./(model.gam.*bay.eigvals)).^-1*model.gam)*bay.scores'*YTM';   
  %end
  
  %%% Ed %%%
  Ed = (cost-Ew)/model.gam;

end

bay.costL1 = cost;
bay.Ew = Ew;
bay.Ed = Ed;
bay.mu = (N-1)/(2*bay.costL1);
bay.zeta = model.gam*bay.mu;

  


% SECOND LEVEL
%
%
function [costL2, DcostL2, optimal, bay, model] = lssvm_bayL2(model,type,nb,bay)
%
%
%

if ~(strcmpi(type,'eig') | strcmpi(type,'eigs')| strcmpi(type,'svd')| strcmpi(type,'eign')),
  error('The used type needs to be ''svd'', ''eig'', ''eigs'' or ''eign''.')
end

  N = model.nb_data;
  % Bayesian inference, level 1

  
  eval('[cost, Ed, Ew, bay, model] = bay_lssvm(model,1,type,nb,bay); ',...
       '[cost, Ed, Ew, bay, model] = bay_lssvm(model,1,type,nb);');  
  
  all_eigvals = zeros(N,1); all_eigvals(bay.Peff) = bay.eigvals; 

  % Number of effective parameters
  bay.Geff = 1 + sum(model.gam.*all_eigvals ./(1+model.gam.*all_eigvals));
  bay.mu = .5*(bay.Geff-1)/(bay.Ew);
  bay.zeta = .5*(N-bay.Geff)/bay.Ed;
  % ideally: bay.zeta = model.gam*bay.mu;
  
  % log posterior (TvG: 4.73 - 5.71)
  costL2 = sum(log(all_eigvals+1./model.gam)) + (N-1).*log(bay.Ew+model.gam*bay.Ed);

  % gradient (TvG: 4.74 - 5.72)   
  DcostL2 = -sum(1./(all_eigvals.*(model.gam.^2)+model.gam)) ...
        + (N-1)*(bay.Ed/(bay.Ew+model.gam*bay.Ed));

  % end condition fulfilled if optimal == 0;
  optimal = model.gam  - (N-bay.Geff)/(bay.Geff-1) * bay.Ew/bay.Ed;           
   
  % update structure bay
  bay.optimal = optimal;
  bay.costL2 = costL2;
  bay.DcostL2 = DcostL2;
  
  
  
% THIRD LEVEL
%
%
function [costL3, bay, model] = lssvm_bayL3(model,type,nb)
%
% costL3 = lssvm_bayL3(model, type)

if ~(strcmpi(type,'svd') | strcmpi(type,'eigs') | strcmpi(type,'eign')), 
  error('The used type needs to be ''svd'', ''eigs'' or ''eign''.')
end


% lower inference levels;
[model,~, bay] = bay_optimize(model,2,type,nb);

% test Neff << N
N = model.nb_data;
if sqrt(N)>bay.Neff,
  %model.kernel_pars
  %model.gam
  warning on;
  warning(['Number of degrees of freedom not tiny with respect to' ...
       ' the number of datapoints. The approximation is not very good.']);
  warning off
end


% construct all eigenvalues
all_eigvals = zeros(N,1); 
all_eigvals(bay.Peff) = bay.eigvals; 

% L3 cost function
%costL3 = sqrt(bay.mu^bay.Neff*bay.zeta^(N-1)./((bay.Geff-1)*(N-bay.Geff)*prod(bay.mu+bay.zeta.*all_eigvals)));
%costL3 = .5*bay.costL2 - log(sqrt(2/(bay.Geff-1))) - log(sqrt(2/(N-bay.Geff)))
costL3 = -(bay.Neff*log(bay.mu) + (N-1)*log(bay.zeta)...
     - log(bay.Geff-1) -log(N-bay.Geff) - sum(log(bay.mu+bay.zeta.*all_eigvals)));
bay.costL3 = costL3;
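
A hypothetical level-3 call following the syntax documented above; the data and hyperparameters are placeholders.

% Illustrative call of bay_lssvm: posterior cost on the third level of
% inference for an RBF LS-SVM (placeholder data and hyperparameters).
X = randn(100,4); Y = sin(X(:,1)) + 0.1*randn(100,1);
[costL3, bay] = bay_lssvm({X, Y, 'function estimation', 10, 0.3, 'RBF_kernel'}, 3);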
  

3 Results

4 References

[1] Liu Zhennan, Du Yao, Han Xingye, He Pengfei, Zhou Zhengmo, Zeng Tianshan. Drought prediction based on a genetic-algorithm-optimized extreme learning machine model: a case study of the Yunnan-Guizhou Plateau [J]. Yangtze River, 2020, 51(8): 6.

[2] Zhu Changsheng, Zhao Kuipeng. Wind power prediction with a kernel extreme learning machine optimized by an improved grey wolf algorithm [J]. Computer Applications and Software, 2022, 39(5): 8.

[3] Yang Xiyun, Guan Wenyuan, Liu Yuqi, et al. Wind power interval prediction method based on a particle-swarm-optimized kernel extreme learning machine model [J]. Proceedings of the CSEE, 2015, 35(S1): 146-153.

[4] Ji Wei, Liu Yong, Zhen Jiaqi, et al. Soil moisture prediction based on an extreme learning machine optimized by random-weight particle swarm optimization [J]. Journal of Xinjiang University (Natural Science Edition), 2020, 37(2): 7.

About the author: experienced in MATLAB simulation for intelligent optimization algorithms, neural network prediction, signal processing, cellular automata, image processing, path planning, UAVs, and related fields; questions about the MATLAB code are welcome via private message.

Parts of the theory are cited from online sources; please contact the author for removal in case of infringement.
