How to remove dependent rows in a matrix?

Let A be an m by n matrix whose rows are linearly dependent. I want to remove rows from A such that the rank does not decrease. How can I find such rows of A?

 Accepted Answer

function [Xsub,idx]=licols(X,tol)
%Extract a linearly independent set of columns of a given matrix X
%
% [Xsub,idx]=licols(X)
%
%in:
%
% X: The given input matrix
% tol: A rank estimation tolerance. Default=1e-10
%
%out:
%
% Xsub: The extracted columns of X
% idx: The indices (into X) of the extracted columns
if ~nnz(X) %X has no non-zeros and hence no independent columns
Xsub=[]; idx=[];
return
end
if nargin<2, tol=1e-10; end
[Q, R, E] = qr(X,0);
if ~isvector(R)
diagr = abs(diag(R));
else
diagr = R(1);
end
%Rank estimation
r = find(diagr >= tol*diagr(1), 1, 'last'); %rank estimation
idx=sort(E(1:r));
Xsub=X(:,idx);

11 Comments

Thanks a lot. It works fine.
Matt, I'm a little unclear as to how the rank estimation tolerance is working there. Could you explain it please? It looks like it works differently to the estimation tolerance in the rank function.
Hi Harry,
When you call QR with the syntax shown, diagr = abs( diag( R ) ) will be sorted with descending elements and the number of trailing zeros it shows gives you the nullity of X. To decide whether an element is small enough to be considered a trailing zero, we do
r = find(diagr >= tol*diagr(1), 1, 'last');
which compares all of diagr's elements to its maximum element diagr(1). Note that this threshold selection method is adaptive to the magnitude of elements in X.
The rank() function is a bit naive, in that it doesn't scale its threshold tolerance relative to the magnitude of the X(i,j). That's the reason for not using it here.
Now that I look at it, though, there is a small bug
diagr = R(1);
should really be
diagr = abs(R(1));
Hi Matt, Thanks a lot for that. Sure, the rank function doesn't scale. But also it applies the tolerance to a vector of singular values calculated using svd rather than to the leading diagonal of the R-matrix. Can you explain the relationship between the two?
I don't claim there is a relationship, other than that both the SVD and QR decompositions of a matrix give information about its rank, and that both work by rewriting the columns of the matrix as a linear combination of orthogonal basis vectors.
You could do something similar to what I've done via QR with SVD instead, but with a more robust thresholding rule than what rank() currently does, e.g.,
s=svd(X);
rank=nnz(s>=max(s)*tol);
SVD seems to be more computationally demanding than QR, however,
>> X=rand(1000); tic;[q,r,e]=qr(X,0);toc; tic;s=svd(X);toc
Elapsed time is 0.275871 seconds.
Elapsed time is 0.380738 seconds.
Hi Matt J
I have a 398*225 matrix and it has rank 225. I used upper function to remove some raw without decreasing rank . but lincols function returns a 398*160 matrix that has rank 160.
How can I fix it ?
thanks
Matt J
Matt J on 27 Jun 2014
Edited: Matt J on 27 Jun 2014
Hi mohsen,
I recommend that you attach your matrix in a .mat file, so that it can be studied.
I am really grateful for your help
mohsen,
When I do plot(svd(docoeff)), I get the plot below, which to my eye says that the rank of your matrix is well below 200.
Nevertheless, if you do consider such tiny singular values significant, you can set the rank threshold lower
>> X=licols(docoeff,1e-15); whos X
Name Size Bytes Class Attributes
X 398x225 716400 double
I calculate rank with Matlab rank() function. it says the rank is 225
I must decrease raws from 398 to 261 without decreasing rnak,and you said the licols function removes raws ,but it removes columns.
Matlab's rank() function is not to be trusted blindly (as you can see from my previous plot). If nothing else, rank is subjectively dependent on the tolerance parameter that you use, just like I showed you that licols is. You chose to use the default tolerance, but a different choice would give you a different result, e.g.,
>> rank(docoeff,2)
ans =
203
I did not say that licols() removes rows. The help text clearly says that it removes columns. However, you can certainly use it to remove rows by transposing:
>> X=licols(docoeff.').'; whos X
Name Size Bytes Class Attributes
X 160x225 288000 double

Sign in to comment.

More Answers (2)

Another, very straightforward, approach is to include them one by one and observe the changes in rank … (I agree that this is not so elegant!).
N = size(A,1) ; % number of rows
IncludeTF = false(N,1) ; % by default, exclude all rows, except ...
IncludeTF(1) = true ; % first row which can always be included
R0 = rank(A) ; % the original rank
for k = 2:N, % loop over all rows
B = A(IncludeTF,:) ; % select the currently included rows of A
IncludeTF(k) = rank(B) < R0 ; % include in B when the rank is less
end
isequal(rank(B), R0) % check!

1 Comment

I have the same question with gf matrix.Does this code work for gf(galois field) matrix?

Sign in to comment.

This is a shorter version of Jos solution if you needed:
R1=1;
for I=1:size(A,1)
R2=rank(A(1:I,:));
if R2~=R1; disp(I); end
R1=R2+1;
end

1 Comment

It displays the rows with linear dependany to other rows.

Sign in to comment.

Categories

Asked:

on 5 Oct 2012

Commented:

on 24 Aug 2019

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!