I am using the following function to calculate the overlap of two bivariate normal distributions:
function [ oa ] = bivariate_overlap_integral(mu_x1,mu_y1,mu_x2,mu_y2)
% Bivariate normal PDF with unit variances and zero correlation.
% x may be a vector (dblquad calls the integrand with vector x and scalar y).
bpdf_vec1=@(x,y,mu_x,mu_y)(exp(-((x-mu_x).^2)./2.-((y-mu_y)^2)/2)./(2*pi));
% Overlap of the two distributions at the point (x,y): the smaller of the two PDFs
overlap_point = @(x,y) min(bpdf_vec1(x,y,mu_x1,mu_y1),bpdf_vec1(x,y,mu_x2,mu_y2));
% Total overlap: integrate the pointwise overlap over the square [-100,100] x [-100,100]
oa=dblquad(overlap_point,-100,100,-100,100);
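As you can see, this involves taking a double integral (x: -100 to 100, y: -100 to 100; ideally -inf to inf, but this is sufficient for now) of the function overlap_point, which is the minimum of the two PDFs, each given by bpdf_vec1, evaluated at the point (x, y).
Now, the PDF is never 0, so I would expect that the larger the integration region, the larger the result, with the difference obviously becoming negligible beyond some point. However, the result sometimes appears to increase when I shrink the interval. For example: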
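To make the quantity explicit, here is the integral the function is meant to approximate, written out from the definition of bpdf_vec1 (unit variances, no correlation). The symbols O, f_1 and f_2 are my own shorthand and do not appear in the code:

```latex
% O   : overlap of the two distributions (shorthand, not a name used in the code)
% f_i : the bivariate normal PDF computed by bpdf_vec1 for the i-th pair of means
\[
  f_i(x,y) = \frac{1}{2\pi}
             \exp\!\left(-\frac{(x-\mu_{x_i})^2}{2} - \frac{(y-\mu_{y_i})^2}{2}\right),
  \qquad i = 1, 2,
\]
\[
  O = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
      \min\bigl(f_1(x,y),\, f_2(x,y)\bigr)\, dx\, dy .
\]
```

The calls to dblquad replace the infinite limits with a finite square, which should change the value only negligibly once the square covers both means by several standard deviations.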
>> mu_x1=0;mu_y1=0;mu_x2=5;mu_y2=0;
>> bpdf_vec1=@(x,y,mu_x,mu_y)(exp(-((x-mu_x).^2)./2.-((y-mu_y)^2)/2)./(2*pi));
>> overlap_point = @(x,y) min(bpdf_vec1(x,y,mu_x1,mu_y1),bpdf_vec1(x,y,mu_x2,mu_y2));
>> dblquad(overlap_point,-10,10,-10,10)
ans =
0.0124
>> dblquad(overlap_point,-100,100,-100,100)
ans =
1.4976e-005 -----> strange, as theoretically cannot be smaller than the first answer
>> dblquad(overlap_point,-3,3,-3,3)
ans =
0.0110 -----> makes sense that the result is less than the first answer as the
interval is decreased
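As a reference for which of these numbers to trust, the overlap can be worked out by hand for this particular case. This is only a sketch, and it relies on the unit variances and zero correlation hard-coded in bpdf_vec1; d and \Phi (the standard normal CDF) are notation I am introducing here:

```latex
% Means (0,0) and (5,0): the two PDFs are equal on the line x = 2.5.
% For x < 2.5 the smaller PDF is f_2, for x > 2.5 it is f_1.
% The y-integral of each PDF is 1, leaving one-dimensional normal tails in x.
\[
  O = \int_{-\infty}^{2.5} \varphi(x-5)\, dx + \int_{2.5}^{\infty} \varphi(x)\, dx
    = \Phi(-2.5) + \bigl(1 - \Phi(2.5)\bigr)
    = 2\,\Phi(-2.5) \approx 0.0124 ,
\]
% phi is the standard normal density. In general, with d the distance between the means:
\[
  O = 2\,\Phi\!\left(-\tfrac{d}{2}\right)
    = \operatorname{erfc}\!\left(\frac{d}{2\sqrt{2}}\right).
\]
```

So the value 0.0124 from the [-10,10] square agrees with the hand calculation, and the 1.4976e-005 from the [-100,100] square is the one that looks wrong.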
Here we can check that the overlap is (close to) 0 at the corner points of the interval.
>> overlap_point (100,100)
ans =
0
>> overlap_point (-100,100)
ans =
0
>> overlap_point (-100,-100)
ans =
0
>> overlap_point (100,-100)
ans =
0
Could this be related to the implementation of dblquad, or have I made a mistake somewhere? I am using MATLAB R2011a.
Thanks