Continuous search for the closest rows in a matrix

Hello,
Consider a 100x10 matrix and a random row of it, call it "row1".
I want to calculate the Euclidean distance between "row1" and each of the other rows, and find the row closest to "row1", call it "row2".
Then I want to find the row closest to "row2", call it "row3" (with "row1" excluded from the matrix), and so on.
I use "pdist2" for the Euclidean distance.
Could you please help me?
Thank you.
Best,
Pavlos

4 Comments

So what have you tried so far? Why isn't pdist2() and a for-loop working?
I use pdist2(), but any other recommendation is welcome. I am new to MATLAB; I tried a for-loop without success.
Any help is appreciated. Thank you.
Pavlos
Please post what you have done so we can help you accordingly.
Here is the code:
%extract a random row from the matrix M 100x10
T = randperm(100);
row1 = M(T(1:1),:);
%remove the row from the matrix
M(T(1),:)=[];
%calculate the distances between M and row1
d1=pdist2(M,row1);
%find the closest row to row1 and the corresponding distance between them
[min_dis ind_min_dis]=min(d1);
row2=T(ind_min_dis,:);
%remove row2 from the matrix
M(ind_min_dis,:)=[];
%continue the same process with a loop
for k=1:length(M)
d(k)=pdist2(M,row(k));
[min_dis(k) ind_min_dis(k)]=min(d(k));
row(k)=M(ind_min_dis(k),:);
M(ind_min_dis(k),:)=[];
end
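For reference, here is a minimal sketch of how the loop above could be made to run. It is an illustration, not the asker's final code: the names `remaining`, `current` and `order` are made up, a random `M` is generated as a stand-in, and the distance is computed with `sqrt(sum(...))` instead of `pdist2` so that no toolbox is needed (`pdist2(M(remaining,:), M(current,:))` should give the same column of distances).

```matlab
% Stand-in data: the thread assumes a 100x10 matrix M already exists
M = rand(100,10);
nRows = size(M,1);

% Track original row numbers instead of deleting rows from M,
% so that positions returned by min() can be mapped back to M
remaining = 1:nRows;
order = zeros(nRows,1);      % order(k) is the k-th row visited

% Pick a random starting row ("row1") and remove it from the candidates
current = remaining(randi(nRows));
order(1) = current;
remaining(remaining == current) = [];

for k = 2:nRows
    % Euclidean distance from the current row to every remaining row
    d = sqrt(sum(bsxfun(@minus, M(remaining,:), M(current,:)).^2, 2));
    [~, pos] = min(d);       % position within "remaining", not within M
    current = remaining(pos);
    order(k) = current;
    remaining(pos) = [];     % exclude the row that was just chosen
end
```

Keeping a list of remaining row numbers avoids the index shift that happens when rows are deleted from `M` inside the loop.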


 Accepted Answer

Matt J on 15 Jan 2013
Edited: Matt J on 15 Jan 2013
Here's what I use. There are others like it on the FEX, some of them fancier.
function Graph=interdists(A,B)
%Finds the graph of distances between point coordinates
%
% (1) Graph=interdists(A,B)
%
% in:
%
% A: matrix whose columns are coordinates of points, for example
% [[x1;y1;z1], [x2;y2;z2] ,..., [xM;yM;zM]]
% but the columns may be points in a space of any dimension, not just 3D.
%
% B: A second matrix whose columns are coordinates of points in the same
% Euclidean space. Default B=A.
%
%
% out:
%
% Graph: The MxN matrix of separation distances in l2 norm between the coordinates.
% Namely, Graph(i,j) will be the distance between A(:,i) and B(:,j).
%
%
% (2) interdists(A,'noself') is the same as interdists(A), except the output
% diagonals will be NaN instead of zero. Hence, for example, operations
% like min(interdists(A,'noself')) will ignore self-distances.
%
% See also getgraph
noself=false;
if nargin<2
B=A;
elseif ischar(B)&&strcmpi(B,'noself')
noself=true;
B=A;
end
N=size(A,1);
B=reshape(B,N,1,[]);
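% l2norm is a helper that is not shown in this post; the line below assumes
% it computes the Euclidean norm along dimension 1
Graph=sqrt(sum(bsxfun(@minus, A, B).^2,1));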
Graph=squeeze(Graph);
if noself
n=length(Graph);
Graph(linspace(1,n^2,n))=nan;
end

7 Comments

Matt thank you very much.
My Matrix is 100x10 and I would like to calculate the distances between the rows, so I suppose the output will be a 100x100 matrix.
Your code returns a 10x10 matrix.
Pavlos
Transpose your matrix so that it is in the form expected by the code and described in the code's help doc.
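The size difference comes only from which dimension indexes the points. A small sketch of this, using the same `bsxfun` pattern as the function above on a stand-in random matrix; `distcols` is a hypothetical inline helper introduced here for illustration, not part of the answer:

```matlab
M = rand(100,10);            % stand-in for the 100x10 data, one point per row

% Distances between the columns of X: X is d-by-n, the reshape gives a
% d-by-1-by-n copy, so the difference is d-by-n-by-n before the norm
distcols = @(X) squeeze(sqrt(sum(bsxfun(@minus, X, reshape(X, size(X,1), 1, [])).^2, 1)));

D_cols = distcols(M);        % treats the 10 columns as points
D_rows = distcols(M.');      % treats the 100 rows as points

disp(size(D_cols))
disp(size(D_rows))
```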
Yes Matt that worked.
Ok, now I can calculate the distances between the rows.
As I described earlier, the next step is to find the rows that are the closest.
Suppose that the 100x100 matrix N contains the distances between the rows of the 100x10 M.
Then with
N(~N)=nan;
[min_value min_ind]=min(N);
I can find the indexes of the rows.
This is the tricky part.
Suppose the closest row to the 1st is the 45th.
Which row is the closest to the 45th, excluding the 1st?
It is really an iteration: you pick a row and find the closest one. Then you find the closest to that one, and so on.
Any ideas?
Thank you.
Pavlos
If you do,
N(bsxfun(@le,(1:100).',1:100))=nan;
[~,i]=min(N,[],1);
then i(j) will be the closest to j excluding k<=j.
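A small sketch of what this mask does, using a made-up 4x4 distance matrix in place of the 100x100 one. Note that it restricts each column j to rows k > j; on its own it does not follow the chain row1, row2, row3, ... described in the question.

```matlab
% Made-up symmetric distance matrix for 4 points (illustration only)
N = [0 5 2 9;
     5 0 4 1;
     2 4 0 7;
     9 1 7 0];
n = size(N,1);

% Blank out every entry whose row index is <= its column index,
% i.e. the diagonal and everything above it
N(bsxfun(@le, (1:n).', 1:n)) = nan;

% Column-wise minimum; min() skips NaN entries
[~, i] = min(N, [], 1);
disp(i(1:n-1))
```

The last column is entirely NaN after masking, so its index carries no information and is left out of the display.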
Hello again,
I get the
Error: Unbalanced or unexpected parenthesis or bracket.
My mistake.
It works.
Thank you very much.
Please accept an answer if it helped you.


More Answers (1)

If you have the Statistics Toolbox installed, use PDIST (along with SQUAREFORM) instead of PDIST2. It is designed to find all interpoint distances for a single matrix very efficiently. Your entire problem could then be reduced to this:
M = randn(100,10);
L = size(M,1);
D = squareform(pdist(M));
D(1:L+1:L^2) = nan;
startrow = 1; % Starting Row
row = [startrow; zeros(L-1,1)];
for n = 2:L
oldrow = startrow;
[val,startrow] = min(D(:,startrow));
D(oldrow,:) = nan;
row(n) = startrow;
end
row
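To check the loop by hand, here is a sketch of the same idea on five made-up 1-D points. The distance matrix is built with `bsxfun` so that the example runs without the Statistics Toolbox (`squareform(pdist(M))` should return the same matrix); the variable names `current` and `previous` are illustrative.

```matlab
% Made-up 1-D points (one per row) so the chain can be verified by hand
M = [0; 10; 1; 13; 4];
L = size(M,1);

% Full L-by-L Euclidean distance matrix between the rows of M
D = sqrt(sum(bsxfun(@minus, reshape(M, L, 1, []), reshape(M, 1, L, [])).^2, 3));
D(1:L+1:L^2) = nan;          % ignore self-distances

current = 1;                 % starting row
row = [current; zeros(L-1,1)];
for n = 2:L
    previous = current;
    [~, current] = min(D(:,current));
    D(previous,:) = nan;     % a visited row can no longer be selected
    row(n) = current;
end
disp(row.')                  % 1 3 5 2 4
```

Starting from the point 0, the nearest unvisited points are 1, then 4, then 10, then 13, which corresponds to rows 1, 3, 5, 2, 4.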

Asked:

on 15 Jan 2013
