Faster way to add many large matrices in MATLAB

Say I have many (around 1000) large matrices (about 1000 by 1000) and I want to add them together element-wise. The naive way is to accumulate them into a temporary variable in a loop. For example,
summ = 0;
for ii = 1:20
    for jj = 1:20
        summ = summ + rand(400);
    end
end
After searching the Internet for a while, I found a suggestion that it is better to store the matrices as pages of a 3D array and use sum(). For example,
sump = zeros(400,400,400);
count = 0;
for ii = 1:20
    for jj = 1:20
        count = count + 1;
        sump(:,:,count) = rand(400);
    end
end
summ = sum(sump,3);
However, when I timed the two approaches (loop first, sum() second), the results were
Elapsed time is 0.780819 seconds.
Elapsed time is 1.085279 seconds.
which means the second method is even worse.
So I am wondering whether there is a more efficient way to do the addition. Assume that I am working on a computer with very large memory and a GTX 1080 (CUDA might be helpful, but I don't know whether it is worth it, since transferring the data to the GPU also takes time).
Thanks for your time! Any reply will be highly appreciated!
  9 Comments
Haining Pan on 8 May 2018
@per isakson, I got even worse performance. The result now becomes
Elapsed time is 0.737624 seconds.
Elapsed time is 1.620767 seconds.
per isakson on 10 May 2018
I assumed the question was about the speed of sum( .... );


Accepted Answer

Jan on 8 May 2018
Edited: Jan on 4 Nov 2019
Most of the time in your example is spent in rand(). Using ones() instead, the runtime goes from 0.71 sec to 0.25 sec on my machine.
Instead of creating the matrices explicitly, you could think about solving the problem mathematically, if the matrices really are exp(i*x+j*y). So please post the real code, not just some dummy code whose most expensive function is not part of the real problem at all.
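One way to check where the time goes is to time the generation and the accumulation separately. The following is only a minimal sketch: the sizes mirror the question, and the variable names (n, m, tGen, tAdd) are made up for illustration. The timings depend on the machine.

```matlab
% Sketch: time matrix generation and accumulation separately.
n = 400;          % matrix size, as in the question
m = 400;          % number of matrices (20*20 in the question)

% Cost of generating the matrices only
tic;
for k = 1:m
    A = rand(n);
end
tGen = toc;

% Cost of the accumulation only, reusing one pre-generated matrix
A = rand(n);
s = zeros(n);
tic;
for k = 1:m
    s = s + A;
end
tAdd = toc;

fprintf('rand: %.3f s, addition: %.3f s\n', tGen, tAdd);
```

If tGen dominates, speeding up the addition itself will hardly change the total runtime.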
  3 Comments
Jan on 8 May 2018
A cleaned version of the code would be:
x = linspace(nn(1),pp(1),NN);
y = linspace(nn(2),pp(2),NN);
b11 = b1(1);
b12 = b1(2);
b21 = b2(1);
b22 = b2(2);
s = 0;
for j = -Nmax:Nmax
    for k = -Nmax:Nmax
        s = s + aa(counter) * ...
            exp(1i*((j*b11 + k*b21) * x + ...
                    (j*b12 + k*b22) * y));
    end
end
The most expensive part here is the exp function. As far as I can see, the argument of exp() is a [1 x NN] vector and not a [NN x NN] matrix, so the code should fail with an error message. Please post running code. Replace functions like your energy by rand, if that is sufficient for the computations.
It is hard to suggest methods to accelerate code that does not run at all. But the general idea is to reduce the number of exp evaluations. Use exp(a+b) = exp(a)*exp(b): instead of building the matrix argument from two vectors and then calling exp, calculate exp of the vectors first and create the matrix afterwards. In addition, you might be able to exploit the fact that x and y are created by linspace:
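The exp(a+b) = exp(a)*exp(b) idea can be sketched as follows. This assumes the intended argument is an NN-by-NN grid in which x varies along the columns and y along the rows; the coefficients a and b and the ranges of x and y are hypothetical placeholders, not values from the original code.

```matlab
% Sketch: replace NN^2 exp() calls by 2*NN calls plus an outer product.
NN = 500;
x = linspace(0, 1, NN);
y = linspace(0, 2, NN);
a = 3;  b = -2;                   % hypothetical coefficients

% Direct version: exp() is evaluated on the full NN x NN grid
[X, Y] = meshgrid(x, y);
E1 = exp(1i * (a*X + b*Y));

% Factored version: exp() is evaluated on two vectors only
ex = exp(1i * a * x);             % 1 x NN
ey = exp(1i * b * y);             % 1 x NN
E2 = ey.' * ex;                   % NN x NN; .' transposes without conjugating

maxDiff = max(abs(E1(:) - E2(:)));
fprintf('max difference: %g\n', maxDiff);
```

Note the use of .' instead of ': for complex data, ' would conjugate ey and give the wrong sign in the exponent.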
x = linspace(1, 10, 2000);
e1 = exp(x); % 2000 expensive exp() calls
c = x(2) - x(1);
% 2 expensive exp() calls and a cheaper cumulative product:
e2 = cumprod([exp(x(1)), repmat(exp(c), 1, length(x)-1)]);
The cumulative product is much cheaper, but it suffers from accumulated rounding error.
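To judge whether the rounding error matters for a given problem, the two results can be compared directly. A small sketch, reusing the same construction (relErr is a name introduced here):

```matlab
% Sketch: measure how far the cumprod result drifts from the direct exp().
x  = linspace(1, 10, 2000);
e1 = exp(x);                                   % reference values
c  = x(2) - x(1);
e2 = cumprod([exp(x(1)), repmat(exp(c), 1, length(x)-1)]);

relErr = max(abs(e1 - e2) ./ e1);              % worst-case relative error
fprintf('max relative error: %g\n', relErr);
```

The error grows with the length of the vector, so it is worth repeating the check with the real problem size.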
You can use an equivalent method in the loop: take the value of the exp function at k=n and multiply it by a constant factor to get the value for k=n+1.
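A minimal sketch of this recurrence for a single loop over k is shown here. The frequency w and the sizes are hypothetical, and the coefficients aa from the original code are left out to keep the example short.

```matlab
% Sketch: get exp(1i*w*(k+1)*x) from exp(1i*w*k*x) by one multiplication.
NN = 200;  Nmax = 5;
x = linspace(0, 1, NN);
w = 0.7;                              % hypothetical frequency

step = exp(1i * w * x);               % factor that advances k by one
term = exp(1i * w * (-Nmax) * x);     % value at the first index k = -Nmax
s = zeros(1, NN);
for k = -Nmax:Nmax
    s = s + term;
    term = term .* step;              % value for k+1 without calling exp()
end

% Reference: call exp() in every iteration
sRef = zeros(1, NN);
for k = -Nmax:Nmax
    sRef = sRef + exp(1i * w * k * x);
end
maxDiff = max(abs(s - sRef));
fprintf('max difference: %g\n', maxDiff);
```

This needs two exp() calls on the vector instead of 2*Nmax+1, at the price of the same kind of accumulated rounding error as the cumulative product.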
I recommend writing down the formula and simplifying the equation with pencil and paper first. Solving the sum by brute computing power is less efficient.
Haining Pan on 10 May 2018
After several days of trying, I found a very straightforward method using a 3D array. For example, a=rand(400,400,400) creates all the pages of matrices at once, and sum(a,3) gives the sum. For this exact problem, I used x+y to create a 2D matrix and multiplied it (.*) by a 1-by-1-by-(2*Nmax+1)^2 array of j and k to get exp(j*x+k*y), which is a 3D array. Then I simply take the sum with sum(...,3).
This is about 3 times faster, and even 10 times faster if I use CUDA.
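The 3D approach could look roughly like the sketch below. It is not the exact code used above: it assumes the simplified argument j*x + k*y, uses random placeholder coefficients aa, and relies on implicit expansion (R2016b or later). All names are chosen for illustration.

```matlab
% Sketch: build all terms as pages of a 3D array and sum along dimension 3.
NN = 100;  Nmax = 3;
x = linspace(0, 1, NN);               % 1 x NN   (varies along columns)
y = linspace(0, 1, NN).';             % NN x 1   (varies along rows)

[J, K] = ndgrid(-Nmax:Nmax);          % all (j,k) index pairs
P  = numel(J);                        % (2*Nmax+1)^2 pages
j3 = reshape(J, 1, 1, P);             % 1 x 1 x P
k3 = reshape(K, 1, 1, P);
aa = reshape(rand(1, P), 1, 1, P);    % hypothetical coefficients

% Implicit expansion creates an NN x NN x P array in one step
s3 = sum(aa .* exp(1i * (j3 .* x + k3 .* y)), 3);

% Loop version for comparison
sLoop = zeros(NN);
for p = 1:P
    sLoop = sLoop + aa(p) * exp(1i * (J(p)*x + K(p)*y));
end
maxDiff = max(abs(s3(:) - sLoop(:)));
fprintf('max difference: %g\n', maxDiff);
```

The price is memory: the full 3D array must fit at once, and 1000 complex double matrices of size 1000 by 1000 need about 16 GB. With the Parallel Computing Toolbox, creating x and y as gpuArray objects is one way to try the same expression on the GPU.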
