How can I make indexing faster, that is, searching for a groups of numbers within a matrix?
Show older comments
Hello!
I was wondering if there was anything faster than this:
X = ( Y1 < a & Y1 >= b & Y2 >= c & Y2 < d );
I run it in a loop a lot and it is by far taking up the most time. X is a group of numbers that meets 4 conditions. I am searching two large matrices, Y1 and Y2 for numbers that meets that condition. Anyone have an easier way?
I am on Matlab 2015a, 64bit, 120GB Ram, and blazing fast processors.
Thanks!
11 Comments
James Tursa
on 6 Jun 2015
Edited: James Tursa
on 6 Jun 2015
(1) I am confused. Is X supposed to be a logical array result of the conditions as you show above, or are you actually trying to get a subset of an existing X array based on the conditions as Walter shows in his answer?
(2) Do you have a C compiler available to generate mex routines? This could potentially give you a substantial speedup. (I am showing up to 80% timing reduction in tests I have run)
James Tursa
on 8 Jun 2015
To use the C mex routine I wrote would require a C compiler. Once compiled, it can be called just like any other regular MATLAB function (i.e., yes it runs inside of MATLAB). With a 41 hour run time, I would seriously look into getting a C/C++ compiler for this task. The C code itself is fairly simple and I could post it for you. It is easy to get the X and/or the Numbers(X) result passed back from the mex routine.
Joe
on 8 Jun 2015
Walter Roberson
on 8 Jun 2015
Note that in my code, sparse() is applied to one condition first, and that results in a sparse array that is then processed by & with the second condition, producing a sparse result that is then processed by &, and so on. With the way you have used sparse, you do not get the advantage of the sparse calculation in calculating the Y2 truth values: you calculate them in full and then do a sparse() & against the first result.
James Tursa
on 8 Jun 2015
Edited: James Tursa
on 9 Jun 2015
Do you want X returned from the mex routine, or are you just interested in the Numbers(X) result? I.e., the mex routine can be set up to either return X, or just use X internally but only return the Numbers(X) result. Currently what I have is this, which only returns the Numbers(X) result, but it could easily be modified to return X as well if you needed it for some reason. To create the mex routine, copy this code into a file called subset4a.c (or any_other_name.c) and then issue the following command:
mex subset4a.c
// Returns subset of an array based on relational results
// Syntax:
// Z = subset4a(Numbers,Y1,Y2,a,b,c,d);
//
// Returns equivalent of:
// Z = Numbers( Y1 < a & Y1 >= b & Y2 >= c & Y2 < d )
//
// Programmer: James Tursa
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double a, b, c, d;
double *X, *Z, *Y1, *Y2;
size_t *x, *z;
size_t i, n, k, j;
if( nrhs != 7 ) {
mexErrMsgTxt("Need exactly 7 full real double inputs");
}
if( nlhs > 1 ) {
mexErrMsgTxt("Too many outputs");
}
for( i=0; i<7; i++ ) {
if( !mxIsDouble(prhs[i]) || mxIsComplex(prhs[i]) || mxIsSparse(prhs[i]) ) {
mexErrMsgTxt("Need exactly 7 full real double inputs");
}
}
n = mxGetNumberOfElements(prhs[0]);
if( mxGetNumberOfElements(prhs[1]) != n || mxGetNumberOfElements(prhs[2]) != n ) {
mexErrMsgTxt("1st three arguments must have same number of elements");
}
for( i=3; i<7; i++ ) {
if( mxGetNumberOfElements(prhs[i]) != 1 ) {
mexErrMsgTxt("Last four arguments must be scalars");
}
}
X = mxGetPr(prhs[0]);
Y1 = mxGetPr(prhs[1]);
Y2 = mxGetPr(prhs[2]);
a = mxGetScalar(prhs[3]);
b = mxGetScalar(prhs[4]);
c = mxGetScalar(prhs[5]);
d = mxGetScalar(prhs[6]);
z = x = mxMalloc(n*sizeof(*x)); // Work array to save indexes
for( i=0; i<n; i++ ) {
if( *Y1 < a && *Y1 >= b && *Y2 >= c && *Y2 < d ) {
*z++ = i; // Remember the index of the element to save
}
Y1++;
Y2++;
}
k = z - x; // Number of indexes we saved
plhs[0] = mxCreateDoubleMatrix( k, 1, mxREAL ); // EDIT FIX
if( k ) {
Z = mxGetPr(plhs[0]);
z = x;
for( i=0; i<k; i++ ) {
*Z++ = X[*z++]; // Copy the elements to the result
}
}
mxFree(x); // Free the work array
}
James Tursa
on 8 Jun 2015
P.S. I have a slightly faster version of this that wastes some memory if you are interested. Basically, instead of saving the indexes, it allocates a guessed size at the start and copies the elements as you go. When done, it simply sets the size of the result (the memory at the end of the array ends up being unused). If you can guess a good number up front for the max size of the result it might be a good way to go.
Joe
on 8 Jun 2015
Joe
on 8 Jun 2015
James Tursa
on 9 Jun 2015
Use the edited version.
Joe
on 15 Jun 2015
Accepted Answer
More Answers (0)
Categories
Find more on Matrix Indexing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!