function [ d ] = hcompare_KL( h1,h2 )
%This routine evaluates the Kullback-Leibler (KL) distance between histograms. 
%             Input:      h1, h2 - histograms
%             Output:    d – the distance between the histograms.
%             Method:    KL is defined as: KL(h1,h2) = sum_i h1(i) * log( h1(i) / h2(i) )
%             Note, KL is not symmetric, so compute both sides.
%             Take care not to divide by zero or log zero: disregard entries of the sum for which H2(i) == 0.

temp = sum(h1 .* log(h1 ./ h2));
temp( isinf(temp) ) = 0; % this resolves where h1(i) == 0
d1 = sum(temp);

temp = sum(h2 .* log(h2 ./ h1)); % other direction of the comparison, since it's not symmetric
temp( isinf(temp) ) = 0;
d2 = sum(temp);

d = d1 + d2;

end

My problem is that whenever h1(i) or h2(i) == 0, I get inf, which is expected. But for the KL distance I am supposed to return 0 whenever h1 or h2 == 0. How can I do that without using a loop?

3 Answers


To avoid problems whenever a count is 0, I suggest you create an index that marks the "good" data points:

%# you may want to do some input testing, such as whether h1 and h2 are
%# of the same size

%# create an index of the "good" data points
goodIdx = h1>0 & h2>0; %# bin counts <0 are not good, either

%# sum each direction over the good bins only
d1 = sum(h1(goodIdx) .* log(h1(goodIdx) ./ h2(goodIdx)));
d2 = sum(h2(goodIdx) .* log(h2(goodIdx) ./ h1(goodIdx)));

%# the distance is a scalar; if no bin is "good", both sums run over
%# an empty set and d is zero
d = d1 + d2;
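
For reference, the masking idea can be tried end to end on a small example. This is only a sketch: the histogram values are made up, and the normalization step is an assumption (skip it if your histograms already sum to 1).

```matlab
% Made-up example histograms; each has one empty bin.
h1 = [4 0 3 3];
h2 = [2 5 0 3];

% Normalize to probabilities (assumption: raw counts were passed in).
p = h1 / sum(h1);
q = h2 / sum(h2);

% Keep only the bins where both histograms are strictly positive.
good = p > 0 & q > 0;

% One KL term per direction, summed over the good bins only.
d1 = sum(p(good) .* log(p(good) ./ q(good)));
d2 = sum(q(good) .* log(q(good) ./ p(good)));

% Symmetric distance: finite even though both inputs contain zeros.
d = d1 + d2;
```

Because the mask is applied element-wise before summing, no inf or NaN ever enters the sum, which is what the isinf test after sum() in the question cannot guarantee.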
answered 2012-11-14T15:32:08.027

I see some errors in your implementation. Please replace log with log2.
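
As a side note, the choice of base only changes the unit of the result (nats for log, bits for log2); the two values differ by the constant factor log(2). A small check, using made-up histograms with no empty bins:

```matlab
% Made-up normalized histograms with no empty bins.
p = [0.5 0.25 0.25];
q = [0.25 0.25 0.5];

kl_nats = sum(p .* log(p ./ q));    % natural log: result in nats
kl_bits = sum(p .* log2(p ./ q));   % base-2 log: result in bits

% The two results differ only by the constant factor log(2).
ratio = kl_nats / kl_bits;
```

Neither base changes how empty bins behave, so the zero handling still has to be done separately.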

answered 2014-05-20T10:22:12.373

Try using

 d=sum(h1.*log2(h1+eps)-h1.*log2(h2+eps))

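Note that KL(h1,h2) is not the same as KL(h2,h1). In your case you want KL(h1,h2), right? I think your implementation is wrong: it is not the distance between h1 and h2. The KL distance between h1 and h2 is defined as

KL(h1,h2) = sum(h1 .* log(h1 ./ h2)) = sum(h1 .* log(h1) - h1 .* log(h2)).

So the correct implementation should be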

 d=sum(h1.*log2(h1+eps)-h1.*log2(h2+eps)) %KL(h1,h2)

or

 d=sum(h2.*log2(h2+eps)-h2.*log2(h1+eps)) %KL(h2,h1)
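
One thing to be aware of with the eps approach: it does not drop empty bins, it replaces log2(0) with log2(eps), which is -52. A bin that is empty in h1 still contributes 0, but a bin that is empty only in h2 contributes a large finite penalty instead of being ignored. A quick sketch with made-up values:

```matlab
% Made-up normalized histograms; h2 is empty in the second bin,
% h1 is empty in the third.
h1 = [0.5 0.5 0];
h2 = [0.5 0   0.5];

% Per-bin terms of KL(h1,h2) with eps added inside the logarithm.
terms = h1 .* log2(h1 + eps) - h1 .* log2(h2 + eps);

% terms(1) is 0 (the bins match) and terms(3) is 0 (h1 is empty there),
% but terms(2) is about 25.5 because log2(eps) = -52.
d = sum(terms);
```

Whether that penalty is acceptable depends on the application; if empty bins should contribute nothing at all, an explicit mask on h1 > 0 & h2 > 0 does that instead.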
answered 2014-08-01T09:40:14.957