Symmetric matrix-vector multiplication with only the lower triangular part stored

I have a very big (n > 1000000) sparse symmetric positive definite matrix A and I want to multiply it efficiently by a vector x. Only the lower triangular part of A is stored, created with the function "sparse". Is there any function, built-in or external, that takes only the lower triangular part as input and performs the complete operation A*x, without having to recover the whole A (i.e., adding the strict upper triangular part back)?
Thank you very much,
Jan

 Accepted Answer

Please check if this is efficient in your case:
A = rand(5, 5);
A = A + A'; % Symmetric example matrix
B = tril(A); % Lower triangular part
x = rand(5, 1);
y1 = A * x;
y2 = B * x + B.' * x - diag(B) .* x;
y1 - y2
ans = 5×1
   1.0e-15 *
         0
         0
   -0.4441
         0
         0
I assume that MATLAB's JIT can handle B.' * x without computing B.' explicitly. Alternative:
y2 = B * x + B.' * x - diag(diag(B)) * x;

6 Comments

Hi Jan,
Thank you for your answer.
I thought that the transpose should be avoided. "I assume that Matlab's JIT can handle B.' * x without computing B.' explicitly." In that case, your answer is probably the fastest.
I'll wait a few more minutes before accepting your answer, to see if anyone has something else to say.
Have a nice day :)
Hi Jan,
the transpose is computed explicitly every time the multiplication is done. Recovering A completely beforehand is actually faster (I am using this to let the pcg method compute the multiplications faster at every iteration). It would be great to find a function that doesn't need the transpose.
Thank you for the suggestion,
Jan
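Since pcg accepts a function handle in place of a matrix, one way to avoid rebuilding the full A inside the solver is to wrap the lower-triangular multiply. This is only a sketch: the handle name afun, the tolerance, and the iteration count are illustrative, and b stands for your right-hand side.

```matlab
% B is the stored lower triangle of A (sparse), b the right-hand side.
d = full(diag(B));                  % extract the diagonal once, outside the handle
afun = @(x) B*x + B.'*x - d.*x;     % implicit A*x using only the lower triangle
[x, flag] = pcg(afun, b, 1e-8, 200);
```

This avoids storing the full A, although each call still evaluates B.' * x, so the per-iteration cost matches the timings below.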
Here some timings:
rep = 1e3;
n = 1000;
A = rand(n) .* (rand(n) < 0.1);
A = (A + A');
fprintf('Sparsity: %g\n', nnz(A)/numel(A));
B = tril(A);
x = rand(n, 1);
disp('Full:')
tic;
for k = 1:rep
y1 = A * x;
end
toc
tic
for k = 1:rep
y2 = B * x + B.' * x - diag(B) .* x;
end
toc
tic % Wrong result, just to test if B.' is evaluated:
for k = 1:rep
yWrong = B * x + B * x - diag(B) .* x;
end
toc
disp('Sparse:')
A = sparse(A);
B = sparse(B);
tic
for k = 1:rep
y1 = A*x;
end
toc
tic;
for k = 1:rep
y2 = B * x + B.' * x - diag(B) .* x;
end
toc
tic; % Wrong result, just to test if B.' is evaluated:
for k = 1:rep
yWrong = B * x + B * x - diag(B) .* x;
end
toc
tic; % Clayton Gotberg's solution without B.':
for k = 1:rep
y2 = B * x + (x.' * B).' - diag(B) .* x;
end
toc
tic
for k = 1:rep
y2 = B * x + B.' * x - diag(diag(B)) * x;
end
toc
% Full:
Elapsed time is 0.511407 seconds.  % A*x
Elapsed time is 0.979235 seconds.  % B*x + B.'*x - diag(B).*x
Elapsed time is 1.035910 seconds.  % wrong y, almost the same time
% Sparse:
Elapsed time is 0.187078 seconds.  % A*x
Elapsed time is 0.253077 seconds.  % B*x + B.'*x - diag(B).*x
Elapsed time is 0.252763 seconds.  % wrong y, transpose not expensive
Elapsed time is 0.262800 seconds.  % Why is Clayton's version not faster?!
Elapsed time is 0.258216 seconds.  % diag(diag(B))
Thank you gentlemen for experimenting. In fact, as A grows larger, the gap between A*x and B*x + B.'*x - diag(B).*x grows as well. Unfortunately, taking the transpose doesn't exploit the symmetry of A.
I believe there is an algorithm that does this without explicitly using the transpose; I have to keep looking for it. If there is nothing implemented in MATLAB I'll just keep using the full matrix.
Thank you @Jan and @Clayton Gotberg for the contribution,
Jan
You may try taking advantage of the reshape command as @Bruno Luong suggested below, to see if it's any faster than the transpose for row and column vectors.
When you say that the difference between A*x and B*x + B.'*x - diag(B).*x gets larger, do you mean that the time cost is larger or that the numbers actually change? Mathematically, that shouldn't happen, but computers + numbers == funny things.
If the time cost just becomes too high, I can take another pass at it with an eye to exploiting the symmetry of the problem. My gut reaction was this handy identity, but there have to be a few dozen more ways to skin this cat.
I'm sorry for the late reply. I mean that the time cost gets larger.
There are obvious ways to exploit the symmetry of the problem, such as doing the multiplication and sum as usual but swapping the i and j indices on reaching the diagonal, so that only the lower triangular elements are used. But programming this by hand won't be nearly as efficient as MATLAB's built-in multiplications, even if it exploits the symmetry.
Thank you for the help and the experimenting,
Jan
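The index-swapping loop described above could be sketched as follows, assuming B = tril(A) is stored sparse. This is only an illustration of the idea; as noted, an interpreted loop like this will be far slower than the built-in sparse multiply.

```matlab
% Each stored entry B(i,j) with i >= j contributes to y(i), and its
% mirrored entry A(j,i) contributes to y(j) when off the diagonal.
[i, j, v] = find(B);                 % rows, columns, values of stored entries
y = zeros(size(x));
for k = 1:numel(v)
    y(i(k)) = y(i(k)) + v(k) * x(j(k));      % lower-triangular part
    if i(k) ~= j(k)
        y(j(k)) = y(j(k)) + v(k) * x(i(k));  % implicit upper part
    end
end
```

This touches each stored element once and never forms the transpose, which is essentially what a compiled symmetric sparse kernel does internally.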


More Answers (1)

If A is a symmetric matrix, A_LT is the lower triangular part of the matrix (including the diagonal) and A_UT is the upper triangular part (including the diagonal):
A*x = A_LT*x + A_UT*x - diag(A).*x
where the diag function only extracts the diagonal elements of A. This is because of a few relations:
A = A_LT + A_UT - diag(diag(A))
To save time and space in MATLAB (because storing the upper triangular matrix as well would take up much more space), take advantage of the relations:
A_UT = A_LT.' and diag(A) = diag(A_LT)
To get:
A*x = A_LT*x + A_LT.'*x - diag(A_LT).*x
Now, the MATLAB calculation is
A_times_x = A_LT*x + (x.'*A_LT).' - diag(A_LT).*x;
This should only perform transposes on the smaller resultant vectors.
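The identity can be checked numerically on a small example. This is just a sketch; the names and the sprandsym parameters are illustrative.

```matlab
% Quick check of A*x == A_LT*x + (x.'*A_LT).' - diag(A_LT).*x
n = 6;
A = sprandsym(n, 0.5);    % random sparse symmetric test matrix
A_LT = tril(A);           % stored lower triangle, including the diagonal
x = rand(n, 1);
err = norm(A*x - (A_LT*x + (x.'*A_LT).' - diag(A_LT).*x))
% err should be on the order of machine precision
```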

7 Comments

This is a smart approach. To my surprise it is not faster than (A_LT.' * x) - at least in R2018b.
If I'd spent a little less time on the LaTeX I might have gotten there first as well haha. I'm not sure what the etiquette is for this forum; since your answer was nearly the same as mine except earlier (and is now far more detailed) should I delete mine or leave it here?
I'm also surprised that (A_LT.' * x) and (x.' * A_LT).' evaluate in nearly the same time. I'd guess it might have something to do with the fact that the second operation has more layers, making it more difficult for MATLAB to interpret it correctly to apply shortcuts, but I have no real knowledge of MATLAB's backend. I'll have to trust MathWorks more the next time I'm doing operations over big data.
"This should only perform transposes on the smaller resultant matrices.é
I believe MATLAB parse/egine is intelligent enought to call the corresponding BLAS that does not do any explicit transposition on both operants regardless if there is transposition operator or not.
I ran my own small test in MATLAB R2020b, directly comparing (A_LT.' * x) and (x.' * A_LT).'. The attached .mat file has the x and A_sparse_tri I used.
function [T1,T2] = test_func(rep,x,A_sparse_tri) % keeping the same x and A
for k = 1:rep
tic
A_sparse_tri.'*x;
T1(k) = toc;
end
for k = 1:rep
tic
(x.'*A_sparse_tri);
T2(k) = toc;
end
end
[T1, T2] = test_func(10000, x, A_sparse_tri);
T1_benchmark = [min(T1) mean(T1) max(T1)];
T2_benchmark = [min(T2) mean(T2) max(T2)];
performance = T2_benchmark - T1_benchmark
% Result for 10000 reps:
1000*[T1_benchmark; T2_benchmark] =
    0.0320    0.0395    2.7669
    0.0983    0.1005    0.3540
performance =
    0.0000663    0.0000610   -0.0024129
So Jan's expression is usually about three times faster and holds the record for the fastest single run, but mine has the lowest maximum evaluation time. In the end I guess the runtime depends on something deeply personal to your computer, but I would recommend Jan's version unless you need to keep the evaluation time consistently low.
@Clayton Gotberg: "I'm not sure what the etiquette is for this forum; since your answer was nearly the same as mine except earlier (and is now far more detailed) should I delete mine or leave it here?"
Please leave your answer here. Although the difference between the answers is small, it is valuable e.g. to see that (A.'*x) and (x.'*A).' are processed with the same speed.
The JIT acceleration can be tricky and can cause misleading timings. In your test code, the iterative growing of the T1 and T2 vectors can influence the timings. In some cases the JIT seems to perform dead-code elimination, such that the expression (x.'*A_sparse_tri); is removed completely because its output is not stored. Sometimes the JIT recognizes repeated code, such that the actual computation is performed only once. The function timeit is more accurate, but I'm still in doubt whether providing anonymous functions introduces a bias as well.
Maybe you can experiment with this as well:
reshape((x.'*A),[],1)
I begin to understand why engineers are specifically trained and hired to test software. It's deeply interesting to get this peek into what MATLAB does to ensure I can keep writing ill-considered code.



Asked: 24 Apr 2021
Commented: 27 Apr 2021
