Massive time required for pdist
Sebastian Stumpf on 4 Oct 2021
Commented: Sebastian Stumpf on 6 Oct 2021
Hello,
I am using the MATLAB function pdist to calculate the distance between two points. However, I noticed that the function takes a lot of time, even though it uses all four cores. I built this example to demonstrate the massive time consumption. If I calculate the distance between two points with my own code, it is much faster. The example calculates the distance between every point in X and every point in Y, with a thousand points in each set (one million distances in total).
clear
close all
clc
tic
j=1;
X = rand(1000,2);
Y = rand(1000,2);
fprintf('Time for array creation: ');
toc
tic
for i = 1:1:size(Y,1) 
    for k = 1:1:size(X,1)
        A(j,1) = sqrt((Y(i,1)-X(k,1))^2 + (Y(i,2)-X(k,2))^2);
        j = j+1;
    end
end
fprintf('Time for own distance calculation: ');
toc
j = 1;
tic
for i = 1:1:size(Y,1) 
    for k = 1:1:size(X,1)
        P = [Y(i,1),Y(i,2);X(k,1),X(k,2)];
        B(j,1) = pdist(P,'euclidean');
        j = j+1;
    end
end
fprintf('Time for distance calculation using Matlab function pdist: ');
toc
Output:
Time for array creation: Elapsed time is 0.000386 seconds.
Time for own distance calculation: Elapsed time is 0.251026 seconds.
Time for distance calculation using Matlab function pdist: Elapsed time is 10.776532 seconds.
You can clearly see that the loop calling the MATLAB function pdist takes over 10 seconds longer than my own calculation.
My question is: Why? What else is this function doing?
Would be nice to know.
Thank you very much
Kind regards,
Sebastian
Accepted Answer
Chunru on 4 Oct 2021
Edited: Chunru on 4 Oct 2021
%tic
X = rand(1000,2);
Y = rand(1000,2);
% fprintf('Time for array creation: ');
%toc
%% Version 1
tic
j=1;
for i = 1:1:size(Y,1) 
    for k = 1:1:size(X,1)
        A(j,1) =sqrt((Y(i,1)-X(k,1))^2 + (Y(i,2)-X(k,2))^2);
        j = j+1;
    end
end
size(A)
t = toc;
fprintf('Time for own distance calculation: %.6f\n', t);
%% Version 1.1
% Pre-allocate A 
tic
j=1;
A = inf(size(X,1)*size(Y,1), 1);
for i = 1:1:size(Y,1) 
    for k = 1:1:size(X,1)
        A(j,1) =sqrt((Y(i,1)-X(k,1))^2 + (Y(i,2)-X(k,2))^2);
        j = j+1;
    end
end
size(A)
t = toc;
fprintf('Time for own distance calculation with preallocation:  %.6f\n', t);
%% Version 2
tic
j=1;
for i = 1:1:size(Y,1) 
    for k = 1:1:size(X,1)
        P = [Y(i,1),Y(i,2);X(k,1),X(k,2)];    
        B(j,1) = pdist(P,'euclidean');  % one pair
        j = j+1;
    end
end
size(B)
t = toc;
fprintf('Time for distance calculation using Matlab function pdist:  %.6f\n', t);
%% Version 2.1
% Pre-allocate B beforehand
tic
j=1;
B = inf(size(X,1)*size(Y,1), 1);
for i = 1:1:size(Y,1) 
    for k = 1:1:size(X,1)
        P = [Y(i,1),Y(i,2);X(k,1),X(k,2)];    
        B(j,1) = pdist(P,'euclidean');
        j = j+1;
    end
end
size(B)
t = toc;
fprintf('Time for distance calculation using Matlab function pdist with preallocation:  %.6f\n', t);
%% Version 3
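% pdist of many points: a single call computes every pairwise distance among
% the 2000 rows of [X; Y], in the order x2-x1, x3-x1, ..., x1000-x1,
% y1-x1, ..., y1000-x1;  x3-x2, ..., x1000-x2, y1-x2, ..., y1000-x2; etc.
% (this includes the x-x and y-y pairs, not only the x-y pairs)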
% doc pdist
tic
p = pdist([X; Y]);       % 1-by-1999000 row vector of pairwise distances
size(p)
t = toc;
fprintf('Time for distance calculation using Matlab function pdist (many points):  %.6f\n', t);
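The versions above suggest that the time goes into calling pdist one million times rather than into the distance arithmetic itself: each call presumably repeats its input checking and option parsing for a single pair of points, while Version 3 pays that cost once. As a rough sketch (not part of the original answer), the same X-to-Y distances as the loop versions can be obtained without a per-pair call. The variable names D, D2, C and A_vec are illustrative, the implicit expansion in the first variant assumes R2016b or later, and pdist2 and squareform assume the Statistics and Machine Learning Toolbox that already provides pdist.

```matlab
% Sketch: all X-to-Y distances without calling a function once per pair.
rng(0);                          % fixed seed so the run is repeatable
X = rand(1000,2);
Y = rand(1000,2);
nX = size(X,1);

% Variant a: implicit expansion (column minus row gives an nX-by-nY grid).
% D(k,i) is the distance between X(k,:) and Y(i,:).
dx = X(:,1) - Y(:,1).';
dy = X(:,2) - Y(:,2).';
D  = sqrt(dx.^2 + dy.^2);

% Variant b: pdist2 computes distances between two separate point sets.
D2 = pdist2(X, Y);               % D2(k,i): X(k,:) versus Y(i,:)

% Variant c: reuse the single pdist call from Version 3 and cut out the
% cross block; squareform turns the distance vector into a square matrix.
p = pdist([X; Y]);
S = squareform(p);               % 2000-by-2000, symmetric, zero diagonal
C = S(1:nX, nX+1:end);           % rows index X, columns index Y

% Column-major unrolling reproduces the ordering of the loop versions:
% entry (i-1)*nX + k holds the distance between Y(i,:) and X(k,:).
A_vec = D(:);

% Spot check against one direct evaluation of the loop formula.
i = 7; k = 42;
d_ref = sqrt((Y(i,1)-X(k,1))^2 + (Y(i,2)-X(k,2))^2);
assert(abs(A_vec((i-1)*nX + k) - d_ref) < 1e-12);

% The three variants should agree up to rounding.
assert(max(abs(D(:) - D2(:))) < 1e-10);
assert(max(abs(D(:) - C(:)))  < 1e-10);
```

Variant c computes roughly twice as many distances as needed, because it also covers the x-x and y-y pairs, so variants a or b are the closer match when only the X-to-Y distances are wanted.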
