How to handle large matrices
I have a general problem when working with large matrices. My script takes around 2 weeks to run, which is far too long. I am sure the solution is quite simple for someone more experienced with MATLAB programming.
Problem with 2 matrices:
pc_raw = 23617489x12 double
fw_raw = 35631184x5 double
Now, for each value in pc_raw(:,1), I want to find the corresponding value in fw_raw(:,1) and copy columns 1:5 of that fw_raw row into columns 8:12 of the pc_raw row.
Example:
pc_raw = 1 1 3 3 4 5 6 7 8 9 10
fw_raw = 1 1 1 2 2 3 3 4 4 5 6 7 8 8 9 10
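To make the example concrete, here is a minimal MATLAB sketch that builds two toy arrays from the key values above. The names pc_toy, fw_toy and expected_fw_row, and the payload written into columns 2:5 of fw_toy, are assumptions made purely for illustration. The pairing rule (the k-th occurrence of a key in pc_raw takes the k-th occurrence of the same key in fw_raw) is inferred from the script that follows, not stated explicitly.

```matlab
% Toy versions of the two arrays (names and payload values are illustrative).
pc_keys = [1 1 3 3 4 5 6 7 8 9 10]';
fw_keys = [1 1 1 2 2 3 3 4 4 5 6 7 8 8 9 10]';

pc_toy = NaN(numel(pc_keys), 12);      % columns 8:12 start out as NaN
pc_toy(:,1) = pc_keys;

% Columns 2:5 of fw_toy encode the row number so matches can be traced back.
fw_toy = [fw_keys, (1:numel(fw_keys))' * [10 20 30 40]];

% Assumed pairing: the k-th occurrence of a key in pc_toy takes the k-th
% occurrence of the same key in fw_toy; surplus fw rows are skipped.
expected_fw_row = [1 2 6 7 8 10 11 12 13 15 16]';

% Every expected pair must agree on the key value.
assert(isequal(fw_keys(expected_fw_row), pc_keys));
```

With these keys, the third row with key 1, both rows with key 2, and the second rows with keys 4 and 8 in fw_toy are never used.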
My script so far:
len_pc_raw = size(pc_raw,1);   % assumed definition (number of rows); not shown in the original post
for k=1:len_pc_raw
    pc_timestep = pc_raw(k,1);
    NaN_test = isnan(pc_raw(k,8));
    if NaN_test == 1
        % full scan of both arrays for every unfilled row
        pc_temp = find(pc_raw(:,1)==pc_timestep);
        fw_temp = find(fw_raw(:,1)==pc_timestep);
        empty_test = isempty(fw_temp);
        if empty_test == 0
            len_pc_temp = length(pc_temp);
            len_fw_temp = length(fw_temp);
            % pair the j-th pc row with the j-th fw row, up to the shorter list
            if len_pc_temp <= len_fw_temp
                for j=1:len_pc_temp
                    pc_echo = pc_temp(j); fw_echo = fw_temp(j);
                    pc_raw(pc_echo,8:12) = fw_raw(fw_echo,1:5);
                end
            else
                for j=1:len_fw_temp
                    pc_echo = pc_temp(j); fw_echo = fw_temp(j);
                    pc_raw(pc_echo,8:12) = fw_raw(fw_echo,1:5);
                end
            end
        end
    end
end
Any idea how to speed up the calculation time? I would be very grateful for any comments!
7 Comments
Oleg Komarov
on 16 Feb 2012
Your loop is not consistent. What if pc_temp has more entries than fw_temp?
Reik
on 16 Feb 2012
James Tursa
on 16 Feb 2012
Are pc_raw(:,1) and fw_raw(:,1) always pre-sorted in ascending order as you have shown?
Reik
on 16 Feb 2012
James Tursa
on 16 Feb 2012
That will help a *lot*.
James Tursa
on 16 Feb 2012
What do you want to have happen in the following cases:
1) Value in pc_raw(:,1) is not found in fw_raw(:,1)
2) Value in pc_raw(:,1) has multiple matches in fw_raw(:,1)
3) Multiple same values in pc_raw(:,1)
Once I know these answers I can post m-code and/or a mex routine that should work pretty fast.
Reik
on 17 Feb 2012
Accepted Answer
More Answers (1)
James Tursa
on 17 Feb 2012
Here is some m-code for what I think you want:
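m1 = size(pc_raw,1);
m2 = size(fw_raw,1);
k2 = 1;                                       % current row in fw_raw
for k1=1:m1
    if( isnan(pc_raw(k1,8)) )
        % advance through fw_raw until its key catches up with the pc_raw key
        while( pc_raw(k1,1) > fw_raw(k2,1) )
            k2 = k2 + 1;
            if( k2 > m2 )
                break;
            end
        end
        if( k2 > m2 )
            break;                            % fw_raw exhausted, nothing left to copy
        end
        if( pc_raw(k1,1) == fw_raw(k2,1) )
            pc_raw(k1,8:12) = fw_raw(k2,1:5);
            k2 = k2 + 1;                      % each fw_raw row is used at most once
            if( k2 > m2 )
                break;
            end
        end
    end
end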
It relies heavily on the fact that the first column of each array is pre-sorted in ascending order as you stated.
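Because the loop silently assumes sorted keys, it may be worth checking that up front and trying the logic on a small case first. The following is only a sketch: the toy arrays pc_toy and fw_toy and their payload values are made up for illustration, with keys taken from the example in the question. The loop follows the m-code above, with the bounds check folded into the while condition.

```matlab
% Toy data: keys from the question's example, payload values are illustrative.
pc_toy = NaN(11, 12);
pc_toy(:,1) = [1 1 3 3 4 5 6 7 8 9 10]';
fw_keys = [1 1 1 2 2 3 3 4 4 5 6 7 8 8 9 10]';
fw_toy = [fw_keys, (1:16)' * [10 20 30 40]];    % column 2 = 10 * row number

% The single-pass merge is only valid if both key columns are ascending.
assert(issorted(pc_toy(:,1)) && issorted(fw_toy(:,1)), 'key columns must be sorted');

m1 = size(pc_toy,1);
m2 = size(fw_toy,1);
k2 = 1;
for k1 = 1:m1
    if isnan(pc_toy(k1,8))
        % Skip fw rows whose key is smaller than the current pc key.
        while k2 <= m2 && pc_toy(k1,1) > fw_toy(k2,1)
            k2 = k2 + 1;
        end
        if k2 > m2
            break;                               % fw_toy exhausted
        end
        if pc_toy(k1,1) == fw_toy(k2,1)
            pc_toy(k1,8:12) = fw_toy(k2,1:5);
            k2 = k2 + 1;                         % each fw row is used at most once
        end
    end
end

matched_fw_row = pc_toy(:,9) / 10;               % recover the fw row from the payload
disp(matched_fw_row');
```

For these toy keys the pc rows are paired with fw rows 1, 2, 6, 7, 8, 10, 11, 12, 13, 15 and 16, so each pass over fw_toy only ever moves forward. That is where the speed-up over repeated find calls comes from.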
And here is a mex implementation (CAUTION: No argument checking)
EDIT: 17-Feb-2012 changed location of mxUnshareArray
/* findrep(pc_raw,fw_raw); */
#include "mex.h"
#ifndef MWSIZE_MAX
#define mwSize int
#endif
/* mxUnshareArray is declared by hand here */
void mxUnshareArray(mxArray *);
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize m1, m2, k1, k2;
    double *Apr1, *Apr8, *Apr9, *Apr10, *Apr11, *Apr12;
    double *Bpr1, *Bpr2, *Bpr3, *Bpr4, *Bpr5;

    mxUnshareArray(prhs[0]);
    m1 = mxGetM(prhs[0]);            /* rows of pc_raw */
    m2 = mxGetM(prhs[1]);            /* rows of fw_raw */
    Apr1 = mxGetPr(prhs[0]);
    Bpr1 = mxGetPr(prhs[1]);
    /* column-major storage: column j starts m*(j-1) elements after column 1 */
    Apr8 = Apr1 + m1*7;
    Apr9 = Apr8 + m1;
    Apr10 = Apr9 + m1;
    Apr11 = Apr10 + m1;
    Apr12 = Apr11 + m1;
    Bpr2 = Bpr1 + m2;
    Bpr3 = Bpr2 + m2;
    Bpr4 = Bpr3 + m2;
    Bpr5 = Bpr4 + m2;
    k2 = 0;                          /* current row in fw_raw (0-based) */
    for( k1=0; k1<m1; k1++ ) {
        if( mxIsNaN(Apr8[k1]) ) {
            /* advance through fw_raw until its key catches up */
            while( Apr1[k1] > Bpr1[k2] ) {
                if( ++k2 == m2 ) {
                    return;          /* fw_raw exhausted */
                }
            }
            if( Apr1[k1] == Bpr1[k2] ) {
                /* copy fw_raw(k2,1:5) into pc_raw(k1,8:12) in place */
                Apr8[k1] = Bpr1[k2];
                Apr9[k1] = Bpr2[k2];
                Apr10[k1] = Bpr3[k2];
                Apr11[k1] = Bpr4[k2];
                Apr12[k1] = Bpr5[k2];
                if( ++k2 == m2 ) {
                    return;
                }
            }
        }
    }
}
To use the mex code, put it in a file on the MATLAB path, e.g. findrep.c, and then do this:
mex findrep.c
Call it without any output arguments, e.g.,
findrep(pc_raw,fw_raw);
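After the call it can help to see how many rows were left unfilled. A small sketch, assuming pc_raw and fw_raw are already in the workspace and that NaN in column 8 still marks an unfilled row, as in the original script. The variable n_unfilled is a made-up name, and the guard on exist is only there so the snippet does nothing if findrep has not been compiled.

```matlab
% Run the MEX routine only if it has been compiled
% (exist returns 3 when the name resolves to a MEX-file).
if exist('findrep', 'file') == 3
    findrep(pc_raw, fw_raw);                  % modifies pc_raw in place, no outputs
    n_unfilled = nnz(isnan(pc_raw(:,8)));     % rows with no match in fw_raw
    fprintf('%d of %d rows still have NaN in column 8\n', n_unfilled, size(pc_raw,1));
end
```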