Three ways to do the same integration leading to three different speeds: why?

Question

0 votes

Here is a code that does the trapezoid rule applied to a function on [0, 100] and where I use a random vector with 1000 entries. The integration is repeated 1 million times. The output gives my computer 3 seconds for the first method, 9 seconds for the second, and 11.5 seconds for the third.

But the only difference between the third and the first is that the third method puts the calculation into a function. The difference between the second and the first is the vector notation. I don't understand why these are leading to such different efficiencies. Moreover, I am worried about the fact that putting an identical script into a function multiples the time by a factor of three!

Here is the setup:

x = linspace(0, 100, 1000);
dx = x(2) - x(1);
y = rand(size(x));

First method:

G = 0;
tic
for j = 1:1000000
    g = 2*y;
    g(1) = g(1)/2;
    g(end) = g(end)/2;
    G = G + dx/2*sum(g);
end
toc
disp(['RUN 1: G = ', num2str(G)]);

Second method

G = 0;
tic
for j = 1:1000000
    G = G + dx/2*sum([y(1) 2*y(2:end-1) y(end)]);
end
toc
disp(['RUN 2: G = ', num2str(G)]);

Third method:

G = 0;
tic
for j = 1:1000000
    G = G + mytrapz_equal(dx, y);
end
toc
disp(['RUN 3: G = ', num2str(G)]);

The same function as Run 1:

      function F = mytrapz_equal(dx, y)
          g = 2*y;
          g(1) = g(1)/2;
          g(end) = g(end)/2;
          F = dx/2*sum(g);    
      end
  end

Can anybody shed some light?

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Mohammad Abouali on 8 Nov 2014

0 votes

Calling a function has some overhead. So you are adding that overhead 1,000,000 times and that's why it takes longer.

The same goes for second method, You are calling sum function so relative to first implementation you are adding the function call overhead 1,000,000 times to the total time.

7 Comments
Show 5 older comments Hide 5 older comments

Mohammad Abouali on 8 Nov 2014

Edited: Mohammad Abouali on 8 Nov 2014

When you call a function a lot of internal procedure needs to happen and the flow control of the program needs to be modified and to load proper instruction to CPU and so on.

Remember that MATLAB also uses pass by value for variables, so the least is that a new copy of your variable needs to be created; once the function is out of scope matlab clears the internal variables in the memory that have gone out of scope while that won't happen when you are not calling a funciotn; all these takes some time and they add up once you do that 1,000,000 times.

Having said this, this doesn't mean that you should avoid function calls.

You are right about the sum function. It exists in both cases. I think it is due to the way that you call it. In first case you build variable g and pass that sum function. g does not go out of scope and it still exists in memory for the next loop (probably it even exists in cache for the next loop).

Howeverm in second method you pass [y(1) 2*y(2:end-1) y(end)] as an argument to sum function. MATLAB needs to temporarily create that variable, pass it to sum and once sum is done it destroys that variable. So the next loop iterations it has to recreate it and destroy it again. So, it would add some more time. I think this is what happening between case 1 and 2.

Theo on 8 Nov 2014

Edited: Theo on 8 Nov 2014

Open in MATLAB Online

Hi,

Thanks for your reply. I think I understand your explanation of the required variables that are created within new function definitions.

However, I'm not sure of your explanation for Run 2. If I change it to:

    g = [y(1) 2*y(2:end-1) y(end)];
    G = G + dx/2*sum(g);

then in terms of scope, I don't see a difference between Runs 1 and 2. However, Run 2 still takes 10 seconds to run. I think the decrease in speed is to do with the concatenation operation of the vector.

One question is: suppose I want to keep the speed of Run 1, but I want to use a function to just clean up the code. How do I do that? Would I need to declare variables as global?

One last question: suppose I alter the Run 3 function to:

    function F = mytrapz_equal(dx, y)
        F = dx/2*(sum(2*y) - y(1) - y(end));    
    end

I would have thought that there are no new copies of variables that need to be created, since the only inputs and outputs will be exactly those variables that are later used.

Mohammad Abouali on 8 Nov 2014

Edited: Mohammad Abouali on 8 Nov 2014

Open in MATLAB Online

Good point. you are correct. It is due to concatenation here is a test:

tic
for i=1:1000000
  [y(1) 2*y(2:end-1) y(end)];
end
toc
tic
for i=1:1000000
    g = 2*y;
    g(1) = g(1)/2;
    g(end) = g(end)/2;
end
toc
Using concatenation: Elapsed time is 4.137758 seconds.
Not using concatenation: Elapsed time is 0.522484 seconds.

By the way, here is some timing on my system:

method 1: Elapsed time is 2.480551 seconds.
method 2: Elapsed time is 5.866363 seconds.
method 3: Elapsed time is 4.111007 seconds.

Method 3 consistently gives lower time than method 2 on my system, but more than method 1.

What MATLAB version do you use?

Regarding using a function in your case and still keeping it the same time I was unsuccessful. I tried using inline function using @ operator and still the timing jumped higher. The thing is that this function does so little that I don't think it is even worth putting it in a separate function. Perhaps a longer and more complicated function.

It was a conference presentation given by a German Professor (forgot his name) on supercomputing and actually he was talking about this very same issue. He was mentioning sometimes to achieve better performance, one needs to deviate from good programing style (he was talking about functional and object oriented programing style). Although he was not encouraging to deviate from this style of coding, his point was that we need new style of coding and standard for high-performance environments.

One main reason in Java that the basic data types are not an object is exactly due to reduction in performance. So, there is always a compromise between readability of the code or sticking to good-programming style and its performance.

Mohammad Abouali on 8 Nov 2014

Yes, I am using 2014b.

This is not confirmed, but I have noticed that some of my codes are running faster on R2014b relative to R2104a. There is a specific code that I recall it was taking about 43 seconds on average and now it is taking about 30 seconds. Although as I said, this is not confirmed. This could be due to many other things. I never had time to really check this.

Theo on 10 Nov 2014

Thank you for the interesting discussion. I will try and grab a newer version of Matlab from my department.

Sign in to comment.

Answer 2

Roger Stafford on 8 Nov 2014

0 votes

As to the difference between method 1 and 2, for each of the million trips through the loop method 2 has to construct and then abandon a new 1000-element array, namely [y(1) 2*y(2:end-1) y(end)], which it then hands to 'sum' and that takes allocation time that doesn't occur in method 1. That's just a guess on my part.

You should realize that accounting for timing with differing though equivalent code is fraught with uncertainty. First the Matlab language must be translated into C (I presume) and then on your computer the C is translated via a compiler into machine language appropriate for your computer, and strange and illogical things can happen to the timing in the process. A lot depends on the particular decisions the compiler writers made in coding this translation.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 3

Theo on 16 Jun 2015

Edited: Theo on 16 Jun 2015

0 votes

This is a later 'answer' added to complement the above answers about the overhead in concatenating arrays and modularizing via functions. There is a very nice and succinct article here about these very issues:

Three ways to do the same integration leading to three different speeds: why?

0 Comments
Show -2 older comments Hide -2 older comments

Accepted Answer

7 Comments
Show 5 older comments Hide 5 older comments

More Answers (2)

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Tags

Community Treasure Hunt

Three ways to do the same integration leading to three different speeds: why?

0 Comments Show -2 older comments Hide -2 older comments

Accepted Answer

7 Comments Show 5 older comments Hide 5 older comments

More Answers (2)

0 Comments Show -2 older comments Hide -2 older comments

0 Comments Show -2 older comments Hide -2 older comments

Categories

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

7 Comments
Show 5 older comments Hide 5 older comments

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments