Chi-squared distance for histograms: vectorized vs. for loops

Dear all, I was looking at the chi-squared distance between histograms, in the widely used implementation written by Peter Kovesi. Here it is:
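function [D] = chisq_forloop(X,Y)
%%%supposedly it's possible to implement this without a loop!
% X is m-by-bins, Y is n-by-bins; D(i,j) is the distance between X(i,:) and Y(j,:)
m = size(X,1); n = size(Y,1);
mOnes = ones(1,m); D = zeros(m,n);
for i=1:n
    % replicate row i of Y m times so it can be compared with every row of X
    yi = Y(i,:); yiRep = yi( mOnes, : );
    s = yiRep + X; d = yiRep - X;
    % eps in the denominator guards against division by zero for empty bins
    D(:,i) = sum( d.^2 ./ (s+eps), 2 );
end
D = D/2;
end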
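Written out as a formula, the loop above appears to compute the following, where b is the number of bins and ε stands for MATLAB's eps (the symbols b and ε are introduced here for illustration and do not appear in the original code):

```latex
D(i,j) = \frac{1}{2} \sum_{k=1}^{b} \frac{\left(X_{ik} - Y_{jk}\right)^{2}}{X_{ik} + Y_{jk} + \varepsilon},
\qquad i = 1,\dots,m, \quad j = 1,\dots,n
```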
So I went ahead and implemented it without a for loop. Here it is:
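function [D] = chisq_vec(X,Y)
% Every row of X is paired with every row of Y by stacking them into m*n rows.
% There are m blocks of n rows each; block i holds X(i,:) repeated n times,
% paired with Y(1,:) ... Y(n,:). For instance block 2 goes from row n+1 to 2n.
% So D(i,j) comes from row (i-1)*n + j, the jth row of block i.
m = size(X,1);
n = size(Y,1);
Xrep = kron(X,ones(n,1));   % each row of X repeated n times in a row
Yrep = repmat(Y,m,1);       % all of Y stacked m times
chi = sum(((Xrep - Yrep).^2)./(Xrep + Yrep + eps),2)/2;
D = vec2mat(chi,n);         % fill row-wise, n entries per row, giving m-by-n
end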
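To make the row pairing in chisq_vec concrete, here is a small sketch with toy one-bin inputs (the values and the variable names pairs, v and D_idx are made up for illustration). It also shows reshape followed by a transpose as a possible stand-in for vec2mat, which I believe gives the same layout whenever the vector length is a multiple of n:

```matlab
X = [1; 2];                  % m = 2 histograms, one bin each (toy values)
Y = [10; 20; 30];            % n = 3 histograms
m = size(X,1); n = size(Y,1);

Xrep = kron(X, ones(n,1));   % [1;1;1;2;2;2]: each row of X repeated n times
Yrep = repmat(Y, m, 1);      % [10;20;30;10;20;30]: Y stacked m times
pairs = [Xrep Yrep];         % row (i-1)*n + j holds the pair (X(i), Y(j))

v = (1:m*n).';               % stand-in for chi: just the stacked row indices
D_idx = reshape(v, n, m).';  % [1 2 3; 4 5 6], so D_idx(i,j) = (i-1)*n + j
```

The index matrix D_idx shows which stacked row ends up at each position (i,j) of the m-by-n result.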
To my surprise, the vectorized version is slower. Most of the computation time seems to be spent in the line chi = sum(((Xrep - Yrep).^2)./(Xrep + Yrep + eps),2)/2; The timing code is here:
clc
clear
bins = 30;
nb_x = 100;
nb_y = 99;
X = rand(nb_x,bins)+eps;
Y = rand(nb_y,bins)+eps;
D_loop = chisq_forloop(X,Y);
D_vec = chisq_vec(X,Y);
isequal(D_loop,D_vec)
time_vec = zeros(20,1);
time_loop = zeros(20,1);
f_loop = @() chisq_forloop(X,Y);
f_vec = @() chisq_vec(X,Y);
for i = 1:20
    time_vec(i) = timeit(f_vec);
    time_loop(i) = timeit(f_loop);
end
figure
hold on
plot(1:20,time_vec,'b');
plot(1:20,time_loop,'r');
The vectorized version comes out much slower (about 4x)! Is this normal, and does anyone have ideas to salvage it? Perhaps with bsxfun or GPU computing? I have no experience with either.
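For reference, a bsxfun-based variant might look like the sketch below. The function name chisq_bsx is hypothetical, and the sketch has not been benchmarked, so whether it is any faster than the loop is an open question. It moves the bins into the third dimension so that bsxfun can expand X and Y against each other without building Xrep and Yrep explicitly:

```matlab
function D = chisq_bsx(X, Y)
% CHISQ_BSX  Chi-squared distance between rows of X (m-by-b) and Y (n-by-b).
% Hypothetical name; an untimed sketch, not a tuned implementation.
Xp = permute(X, [1 3 2]);        % m-by-1-by-b
Yp = permute(Y, [3 1 2]);        % 1-by-n-by-b
d = bsxfun(@minus, Xp, Yp);      % m-by-n-by-b differences X(i,k) - Y(j,k)
s = bsxfun(@plus,  Xp, Yp);      % m-by-n-by-b sums X(i,k) + Y(j,k)
D = sum(d.^2 ./ (s + eps), 3) / 2;   % sum over bins, giving m-by-n
end
```

Note that the intermediate arrays d and s still hold m*n*b elements each, the same amount of data as Xrep and Yrep in chisq_vec, so memory traffic may remain the bottleneck.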
