Why is m-function overhead 100 times more than built-in overhead?
function Ignore(a)
end
Then run the following:
A = rand(100000, 1);
tic;
for i = 1 : 200000
Ignore(A);
end
toc;
Elapsed time is 0.945148 seconds. I've found on my system Matlab can do about 200,000 empty function calls per second. So function call overhead is about 5 microseconds.
A = rand(100000, 1);
tic;
for i = 1 : 200000
b = numel(A);
end
toc;
Elapsed time is 0.011971 seconds. So here the number of calls per second is around 20,000,000. This is almost 100 times faster even with a numel() lookup.
I'm curious as to why the m-function overhead is so high. By the way I've also tried putting this code in an m-file and running it. The result is the same.
I've found that with these m-function timings, it is possible for code in a loop to be swamped by overhead time. For example, suppose a function is called 10,000,000 times, and it calls 10 other functions, which in turn call an average of 5 functions each. This gives 500,000,000 function calls, which at 5 microseconds each would take about 2500 seconds, or roughly 40 minutes. If on average each function does only 1 microsecond of work (this is not implausible at all), the actual computation time is only 500 seconds.
Thus: 8 minutes of computation + 40 minutes of call overhead. Just to heighten the sense of the ridiculous, with a really massive computation, this could translate to an 8-hour computation taking two days.
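The estimate above can be reproduced as a short back-of-envelope script. This is only a sketch of the arithmetic using the figures quoted in the post; the variable names are illustrative and not part of any measured code.

```matlab
% Back-of-envelope estimate using the figures quoted in the post.
outerCalls      = 1e7;    % top-level function calls
calleesPerCall  = 10;     % functions called by each top-level call
subCallsEach    = 5;      % average calls made by each of those functions
overheadPerCall = 5e-6;   % seconds, the measured ~5 microseconds per call
workPerCall     = 1e-6;   % seconds, the assumed 1 microsecond of real work

totalCalls      = outerCalls * calleesPerCall * subCallsEach;  % 5e8 calls
overheadSeconds = totalCalls * overheadPerCall;                % about 2500 s
computeSeconds  = totalCalls * workPerCall;                    % about 500 s

fprintf('calls: %g, overhead: %.1f min, compute: %.1f min\n', ...
    totalCalls, overheadSeconds/60, computeSeconds/60);
```

Changing overheadPerCall to a figure measured on another machine shows how sensitive the total is to per-call cost.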
Could anyone comment on the reason for the high overhead, what to do to mitigate this problem, or any other useful information related to this?
An unpleasant surprise in my code (the second in a few days) prompted me to write this. This was the first unpleasant surprise.
The second is the nnz() function. For so long I had always implicitly assumed that it had constant time complexity. I just realized I was wrong and that it has constant time only for sparse matrices.
If nnz() were constant time (Matlab arrays already store other useful information), it would also mean that all(A) and any(A) would be constant time and considering how often these are used, it would be a significant improvement.
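One way to check the nnz() observation on a given system is to time it on a full vector and on a sparse copy of the same data. The following is a minimal sketch, assuming a MATLAB release that provides timeit; the vector size and fill pattern are arbitrary choices for illustration, and the timings will vary by machine.

```matlab
% Compare nnz() on a full vector and on a sparse copy of the same data.
n = 1e6;
fullVec = zeros(n, 1);
fullVec(1:1000:end) = 1;          % 1000 nonzero entries
sparseVec = sparse(fullVec);

% timeit runs each call several times and returns a typical time in seconds.
tFull   = timeit(@() nnz(fullVec));
tSparse = timeit(@() nnz(sparseVec));
fprintf('nnz full: %.3g s, nnz sparse: %.3g s\n', tFull, tSparse);

% Both representations report the same count.
assert(nnz(fullVec) == nnz(sparseVec));
```

Repeating the run with a larger n gives a rough indication of how each timing scales with array length.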
Thank you.
[Second loop was edited for typos.]
10 Comments
Elapsed time is 0.011971 seconds. So here the number of calls per second is around 20,000,00. This is almost 100 times faster even with a numel() lookup.
Maybe I'm missing something, but I don't see how you get that number. Your loop iterates 1000 times, for a total of 0.011971 seconds. Calls per second should therefore be 1000/0.011971 = 83535. Also, why the call to rand() inside the tic...toc?
SK
on 19 Sep 2014
I can really only speculate about the difference. Presumably the JIT can do optimization on built-in functions in for-loops that it cannot do with user-supplied Mcode. But I wouldn't know the underlying reason for that.
In any case, calling functions repeatedly in really long loops, whether mfunctions or builtin functions, has always been contrary to MATLAB programming philosophy. That's why one needs to vectorize.
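As a rough illustration of the vectorization point, the sketch below applies the same operation once per element in a loop and once on the whole array. The function scaleOne is hypothetical and is written as an anonymous function only to keep the example self-contained; its call overhead is not necessarily the same as that of a function stored in an m-file.

```matlab
% Hypothetical per-element function, used only for illustration.
scaleOne = @(x) 2*x + 1;

x = rand(200000, 1);

% Loop version: one function call per element.
yLoop = zeros(size(x));
tic;
for k = 1:numel(x)
    yLoop(k) = scaleOne(x(k));
end
tLoop = toc;

% Vectorized version: a single call on the whole array.
tic;
yVec = scaleOne(x);
tVec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
```

Both versions compute the same values, so any difference in elapsed time comes from how the work is dispatched, not from the arithmetic itself.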
If nnz() were constant time (Matlab arrays already store other useful information), it would also mean that all(A) and any(A) would be constant time and considering how often these are used, it would be a significant improvement.
That would add significant overhead to linear algebra and other operations, since an implicit nnz() would have to be called every time a new matrix was generated. In the element-wise multiplication example below, this increases execution time by about 50%:
A=rand(5000);
tic; B=A.*A;toc;
Elapsed time is 0.058789 seconds.
tic; nnz(B); toc
Elapsed time is 0.029959 seconds.
There is lots of code that simply cannot be vectorized - perhaps even the majority.
There is some code that cannot be vectorized, but if that were the majority, I tend to think TMW would have gone out of business ages ago. If you have a routine which is not obviously vectorizable, maybe you should post it and draw upon more minds to see how it could be attacked.
Well, okay, but then it's still unclear why you call it a "surprise" that mfunctions are slower than builtins. If they weren't, would you ever need mex files? Or if you still did need mex files, what problem is solved once mfunctions catch up to builtins in speed?
SK
on 22 Sep 2014
Answers (0)