MATLAB Answers

MEX api and interleaved complex arrays

16 views (last 30 days)
Jan
Jan on 24 Oct 2020
Edited: Bruno Luong on 25 Oct 2020
Does MATLAB store complex data in interleaved format since R2018a?
If you compile a MEX function with the -R2018a flag, you can process complex arrays in interleaved format. But what happens, if a MEX function is compiled with the -R2017b flag and called from a modern Matlab version? Does accessing mxGetPr / mxGetPi cause a deep data copy to create the two arrays in separate memory blocks, and another copy to re-create an interleaved array for the output?
This would mean, that MEX fumnctions processing complex data are very inefficient, when they use the -R2017b flag. Is this correct?

  0 Comments

Sign in to comment.

Accepted Answer

Bruno Luong
Bruno Luong on 24 Oct 2020
Edited: Bruno Luong on 24 Oct 2020
Where have you been Jan? In a cave? ;-)
Sorry it couldn't resist.
As I understand, when you call mxGetPr and mxGetPi, the data is copied to ensure the compatibility. Yes it will not optimal, but it's not very inefficient either.
Yes, to avoid the data copy you have to compile MCC using -R2018a flag then use function such as mxGetDoubles, etc rather than mxGetPr
You can always use the predefined symbol MX_HAS_INTERLEAVED_COMPLEX to know which option you compile
#if MX_HAS_INTERLEAVED_COMPLEX
a = mxGetDoubles(A); % interleaved complex
/* ... */
#else
ar = mxGetPr(A); % real part
ai = mxGetPi(A); % iaginary part
/* ... */
#endif

  3 Comments

Jan
Jan on 24 Oct 2020
Thanks, Bruno. I spent my time to meet a woman.
When does the data copy happen? Is it performed implicitely for all inputs, even for deeply nested structs, or dynamically on demand when an array is accessed by mxGetPi/Pr? What about mxGetData()?
What happens, if my -R2017b compiled MEX file creates a variable with complex data and calls another function through mexCallMATLAB() or mexRunMexFile()? Does the mxArray header contain a flag, which declares the state of the current state of the interleaved format?
Do you know, which library was the killing app to decide for the interleaved format?
James Tursa
James Tursa on 25 Oct 2020
Caveat: I am unable to confirm this until I get on a PC with MATLAB in a couple of days.
I believe that in R2018a and later there are no R2017b mxArray structures supported at all. Everything in the background is strictly interleaved complex. The mxGetPr and friends work with deep copies of the data blocks off to the side, NOT with R2017b style mxArray structures off to the side. Every mxArray that gets created and passed around is strictly interleaved complex. WHEN the data copies happen I will have to investigate later when I have MATLAB access.
Bruno Luong
Bruno Luong on 25 Oct 2020
"When does the data copy happen? Is it performed implicitely for all inputs, even for deeply nested structs, or dynamically on demand when an array is accessed by mxGetPi/Pr? What about mxGetData()?"
It copies the first time user invokes mxGetPr (first is for the same mxArray). On my laptop it takes 20-30 ms for 1e7-element array.
I guess same thing happens when mxGetPi, mxGetData is called first.
The test script and c-code (windows specific) is bellow.
I guess there must be another copy to group data in interleaved storage when mex returns. It seems it triggered also when the first mxSetxxx is invoked.
The conclusion is better to rework your old mex files with the new interleaved storage to get out the last ounce of performance.
File testInterleavedGetPr.c
/*
* testInterleavedGetPr.c
*
* benchmark mxSetPr
* Compile:
* mex testInterleavedGetPr.c
* mex -R2018a testInterleavedGetPr.c
*/
#include <Windows.h>
#include <stdio.h>
#include "mex.h"
typedef unsigned int UInt32;
double PCFreq;
LONGLONG StartClock;
// Clock related functions----------------------------------------------------------------------------
void InitClock(void)
{
LARGE_INTEGER li;
if (!QueryPerformanceFrequency(&li))
mexErrMsgTxt("QueryPerformanceFrequency failed!\n");
/* Cycles by ms */
PCFreq = (double)(li.QuadPart) / 1000.0;
}
void tic(void)
{
LARGE_INTEGER li;
QueryPerformanceCounter(&li);
StartClock = li.QuadPart;
}
/* Returns elapsetime in ms since last call of tic () */
UInt32 toc(void) /* UInt32 is sufficient for 50 days */
{
LARGE_INTEGER li;
double dres;
QueryPerformanceCounter(&li);
dres = (double)(li.QuadPart-StartClock) / PCFreq;
return (UInt32)dres;
}
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
UInt32 t, i;
double *p;
InitClock();
for (i=0; i<3; i++)
{
#if MX_HAS_INTERLEAVED_COMPLEX
tic();
p = mxGetDoubles(prhs[0]);
t = toc();
mexPrintf("\tmxGetDoubles time = %i\n", t);
#else
tic();
p = mxGetPr(prhs[0]);
t = toc();
mexPrintf("\tmxGetPr time = %i\n", t);
tic();
p = mxGetPi(prhs[0]);
t = toc();
mexPrintf("\tmxGetPi time = %i\n", t);
tic();
p = mxGetData(prhs[0]);
t = toc();
mexPrintf("\tmxGetData time = %i\n", t);
#endif
}
}
File testInterleavedSetPr.c
/*
* testInterleavedSetPr.c
*
* benchmark mxSetPr
* Compile:
* mex testInterleavedSetPr.c
* mex -R2018a testInterleavedSetPr.c
*/
#include <Windows.h>
#include <stdio.h>
#include "mex.h"
typedef unsigned int UInt32;
double PCFreq;
LONGLONG StartClock;
// Clock related functions----------------------------------------------------------------------------
void InitClock(void)
{
LARGE_INTEGER li;
if (!QueryPerformanceFrequency(&li))
mexErrMsgTxt("QueryPerformanceFrequency failed!\n");
/* Cycles by ms */
PCFreq = (double)(li.QuadPart) / 1000.0;
}
void tic(void)
{
LARGE_INTEGER li;
QueryPerformanceCounter(&li);
StartClock = li.QuadPart;
}
/* Returns elapsetime in ms since last call of tic () */
UInt32 toc(void) /* UInt32 is sufficient for 50 days */
{
LARGE_INTEGER li;
double dres;
QueryPerformanceCounter(&li);
dres = (double)(li.QuadPart-StartClock) / PCFreq;
return (UInt32)dres;
}
#define N 10000000
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
UInt32 t, i;
double *p, *pr, *pi;
InitClock();
p = mxMalloc(sizeof(double)*N);
plhs[0] = mxCreateDoubleMatrix(1, N, mxCOMPLEX);
for (i=0; i<3; i++)
{
#if MX_HAS_INTERLEAVED_COMPLEX
p = mxMalloc(2*sizeof(double)*N);
tic();
mxSetDoubles(plhs[0], p);
t = toc();
mexPrintf("\tmxSetDoubles time = %i\n", t);
#else
pr = mxMalloc(sizeof(double)*N);
pi = mxMalloc(sizeof(double)*N);
tic();
mxSetPr(plhs[0], pr);
t = toc();
mexPrintf("\tmxSetPr time = %i\n", t);
tic();
mxSetPi(plhs[0], pi);
t = toc();
mexPrintf("\tmxSetPi time = %i\n", t);
tic();
#endif
}
}
MATLAB scripts
% script testInterleavedCopy.m
n=1e7;
z = rand(1,n)+1i*rand(1,n);
x = rand(1,n);
mex testInterleavedGetPr.c
fprintf('Get for complex array (<R2017b)\n');
tic
testInterleavedGetPr(z);
toc
fprintf('Get for non-complex array (<R2017b)\n');
tic
testInterleavedGetPr(x);
toc
mex -R2018a testInterleavedGetPr.c
fprintf('Get complex array (>=R2018a)\n');
tic
testInterleavedGetPr(z);
toc
fprintf('Get for non-complex array (>=R2018a)\n');
tic
testInterleavedGetPr(x);
toc
mex testInterleavedSetPr.c
fprintf('Set for complex array (<R2017b)\n');
tic
z = testInterleavedSetPr();
toc
mex -R2018a testInterleavedSetPr.c
fprintf('Set for complex array (>=R2018a)\n');
tic
z = testInterleavedSetPr();
toc
Result
>> testInterleavedCopy
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Get for complex array (<R2017b)
mxGetPr time = 43
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
Elapsed time is 0.045455 seconds.
Get for non-complex array (<R2017b)
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
mxGetPr time = 0
mxGetPi time = 0
mxGetData time = 0
Elapsed time is 0.000221 seconds.
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Get complex array (>=R2018a)
mxGetDoubles time = 0
mxGetDoubles time = 0
mxGetDoubles time = 0
Elapsed time is 0.001904 seconds.
Get for non-complex array (>=R2018a)
mxGetDoubles time = 0
mxGetDoubles time = 0
mxGetDoubles time = 0
Elapsed time is 0.000151 seconds.
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Set for complex array (<R2017b)
mxSetPr time = 56
mxSetPi time = 0
mxSetPr time = 0
mxSetPi time = 0
mxSetPr time = 0
mxSetPi time = 0
Elapsed time is 0.110597 seconds.
Building with 'Microsoft Visual C++ 2017 (C)'.
MEX completed successfully.
Set for complex array (>=R2018a)
mxSetDoubles time = 0
mxSetDoubles time = 0
mxSetDoubles time = 0
Elapsed time is 0.022854 seconds.
>>

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!