Optimized code for loop, If-statement for large dataset

I want to delete certain rows based on a condition. My data is a double matrix (790127×24). Using Run and Time, I estimate that the whole script would need about 25 hours, which is huge. Is there any way of optimizing the script?
TIA...
n=0;
for i = 1 : length(d_A)
    if any(isnan(d_A(i-n, 6))) ...
            && any(isnan(d_A(i-n, 7))) ...
            && any(isnan(d_A(i-n, 8))) ...
            && any(isnan(d_A(i-n, 9))) ...
            && any(isnan(d_A(i-n,10))) ...
            && any(isnan(d_A(i-n,11))) ...
            && any(isnan(d_A(i-n,12))) ...
            && any(isnan(d_A(i-n,13))) ...
            && any(isnan(d_A(i-n,14))) ...
            && any(isnan(d_A(i-n,15))) ...
            && any(isnan(d_A(i-n,16))) ...
            && any(isnan(d_A(i-n,17))) ...
            && any(isnan(d_A(i-n,18))) ...
            && any(isnan(d_A(i-n,19)))
        d_A(i-n,:) = [];
        n=n+1;
    end
end

Accepted Answer

per isakson on 25 May 2019
Edited: per isakson on 26 May 2019
Try this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
n=0;
for i = 1 : length(d_A)
    if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
        d_A(i-n,:) = [];
        n=n+1;
    end
end
and this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : length(d_A)
    if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
        % d_A(i-n,:) = [];
        is_to_be_deleted(i-n) = true;
        n=n+1;
    end
end
d_A( is_to_be_deleted, : ) = [];
Caveat: not tested
In response to comment:
Now it's possible to factor out the for-loop. Try this
%% Sample data
A = rand( [8,4] );
ixcol = [2,3];
A([3,5],ixcol) = nan;                 % rows 3 and 5 are all-NaN in ixcol
A( randperm( numel(A), 9 ) ) = nan;   % scatter nine more NaNs at random
%%
[ A3, ix_deleted3 ] = cssm_3( A, ixcol );
[ A4, ix_deleted4 ] = cssm_4( A, ixcol );
ix_deleted3 == ix_deleted4 %#ok<NOPTS,EQEFF>

% Loop version: flag the rows first, delete them in one operation
function [ A, ix_deleted ] = cssm_3( A, ixcol )
    is_to_be_deleted = false( size(A,1), 1 );
    for jj = 1 : size(A,1)   % number of rows; length(A) would return the largest dimension
        if all( isnan( A( jj, ixcol )))
            is_to_be_deleted(jj) = true;
        end
    end
    A( is_to_be_deleted, : ) = [];
    ix_deleted = find( is_to_be_deleted );
end

% Vectorized version: all(...,2) tests each row across the columns in ixcol
function [ A, ix_deleted ] = cssm_4( A, ixcol )
    is_to_be_deleted = all( isnan( A( :, ixcol ) ), 2 );
    A( is_to_be_deleted, : ) = [];
    ix_deleted = find( is_to_be_deleted );
end
It outputs
>> cssm
ans =
2×1 logical array
1
1
The vectorized version, cssm_4, might not improve performance significantly, but in my opinion it makes cleaner code.
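As a minimal sketch, the cssm_4 idea can be applied directly to a matrix shaped like the one in the question. The small d_A below is made-up sample data standing in for the real 790127×24 matrix; only the column range 6 to 19 is taken from the question.

```matlab
% Made-up sample data: 6 rows, 24 columns (stand-in for the real d_A)
d_A = ones(6, 24);
ixcol = 6:19;                 % columns tested in the question
d_A([2, 5], ixcol) = nan;     % rows 2 and 5: NaN in every tested column
d_A(4, 6) = nan;              % row 4: NaN in only one tested column, so it must be kept

% Logical column vector: true where all tested columns of a row are NaN
is_to_be_deleted = all(isnan(d_A(:, ixcol)), 2);

% Remove all flagged rows in a single operation
d_A(is_to_be_deleted, :) = [];

disp(size(d_A))               % the remaining matrix is 4-by-24
```

Rows with NaN in only some of the tested columns are kept, which matches the && chain in the question.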
  2 Comments
Jhon Gray on 25 May 2019
Edited: Jhon Gray on 25 May 2019
Wow. The second one is super fast. But there's a small problem: there is no need for i-n in this case, so the code should look like this. Thanks for helping. Take love.
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : 1000 %length(d_A)
    if all( isnan( d_A( i, ixcol ))) % i-n is a scalar
        % d_A(i-n,:) = [];
        is_to_be_deleted(i) = true;
        %n=n+1;
    end
end
d_A( is_to_be_deleted, : ) = [];
per isakson on 26 May 2019
Edited: per isakson on 26 May 2019
I surmised that there was a problem and added the last line in bold.
Removing one row at a time is as bad for performance as adding one row at a time. In both cases the matrix is rewritten in memory on each operation.
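A rough way to see the difference is to time both approaches on the same data. The sketch below uses made-up random data (5000×24, with roughly 30 % of the rows all-NaN in columns 6 to 19); the sizes and the NaN fraction are assumptions for illustration only, and the timings will depend on the machine and MATLAB release.

```matlab
% Made-up test data: 5000-by-24, about 30 % of rows all-NaN in ixcol
rng(0);
A = rand(5000, 24);
ixcol = 6:19;
A(rand(size(A, 1), 1) < 0.3, ixcol) = nan;

% Variant 1: delete row by row (the matrix is rewritten on every deletion)
B1 = A;
tic
n = 0;
for i = 1 : size(A, 1)
    if all(isnan(B1(i-n, ixcol)))   % i-n is the current position of original row i
        B1(i-n, :) = [];
        n = n + 1;
    end
end
t_loop = toc;

% Variant 2: build a logical mask and delete all flagged rows at once
B2 = A;
tic
B2(all(isnan(B2(:, ixcol)), 2), :) = [];
t_mask = toc;

fprintf('row-by-row: %.3f s, single deletion: %.3f s\n', t_loop, t_mask);
```

Both variants should leave the same rows; only the number of times the matrix is rewritten differs.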

Release

R2018b
