In my code, parfor is slower than a for loop when using backslash with matrices.

I have the following code. I am using backslash on matrices, and the parfor loop is slower than the for loop. What is the problem with this code? Thank you so much.
A = randn(3000);
[n,~] = size(A);
I = eye(n);
c = [0,-1/5, 1/5,-1/10, 1/10];
b = [128/3, 20/9, 85/3, -15, -(515/9)];
E = zeros(n);
tic
for k = 1:5
E = E + b(k)*((I-c(k)*A)\(c(k)*A));
end
toc
tic
parfor k = 1:5
E = E + b(k)*((I-c(k)*A)\(c(k)*A));
end
toc
E=E+I;
My results with 8 workers are as follows:
Elapsed time is 2.656596 seconds
Elapsed time is 3.542917 seconds

 Accepted Answer

There is no problem.
There may be no benefit to running parfor on a function that is already multithreaded and exploits most of your CPU's resources.
Putting parfor on top of it just does more harm than good.

7 Comments

To expand on Bruno's point: when replacing implicit multi-threading with explicit parallelism (i.e., parfor), it's better to compare the results against a single thread. For example, rerun your for-loop as follows:
% True comparison of speedup with only a single core (no implicit
% multi-threading)
maxNumCompThreads(1);
tic
for k = 1:5
E = E + b(k)*((I-c(k)*A)\(c(k)*A));
end
toc
% What you're currently running (implicit multi-threading)
maxNumCompThreads("automatic");
tic
for k = 1:5
E = E + b(k)*((I-c(k)*A)\(c(k)*A));
end
toc
That's a good idea, Raymond, to show the single-thread performance.
On my PC the times are:
  • single-thread for-loop: Elapsed time is 4.999295 seconds.
  • multi-thread for-loop: Elapsed time is 1.611808 seconds.
  • parfor: Elapsed time is 2.516956 seconds.
Thank you all for your consideration. Could accumulating matrices, i.e. E = E + ..., be a problem?
parfor i=1:N
x=x+i;
end
is allowed for scalars. Is it also allowed for matrices? Is there another way to accumulate matrices?
No, nothing specifies that the reduction variable E must be a scalar: https://fr.mathworks.com/help/parallel-computing/reduction-variable.html
You can test a modification of the original code without the addition; the overall timings are almost the same as in the accumulation case.
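One way to try the suggestion above is sketched below. It runs the parfor loop twice: once with E as a matrix-valued reduction variable, and once with no accumulation inside the loop, storing each term in a sliced cell array and summing afterwards. The names T, E2, tReduction and tSliced are made up for illustration, and n is reduced from 3000 so the sketch runs quickly; treat the timings as indicative only.

```matlab
% Sketch: matrix-valued reduction vs. no accumulation inside parfor.
% Assumptions: n is reduced from the original 3000; T, E2, tReduction and
% tSliced are illustrative names that do not appear in the original post.
n = 500;
A = randn(n);
I = eye(n);
c = [0, -1/5, 1/5, -1/10, 1/10];
b = [128/3, 20/9, 85/3, -15, -(515/9)];

% Variant 1: E is a reduction variable holding a full n-by-n matrix.
E = zeros(n);
tic
parfor k = 1:5
    E = E + b(k)*((I - c(k)*A)\(c(k)*A));
end
tReduction = toc;

% Variant 2: no accumulation in the loop. Each term goes into a sliced
% cell array and the sum is formed afterwards on the client.
T = cell(1, 5);
tic
parfor k = 1:5
    T{k} = b(k)*((I - c(k)*A)\(c(k)*A));
end
tSliced = toc;

E2 = zeros(n);
for k = 1:5
    E2 = E2 + T{k};
end

fprintf('reduction: %.3f s, sliced: %.3f s\n', tReduction, tSliced);
fprintf('relative difference: %.2e\n', norm(E - E2, 'fro')/norm(E2, 'fro'));
```

If no parallel pool is open yet, the first parfor also pays the pool start-up cost, so it is worth starting the pool before timing or running the script twice.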
Thank you so much, Raymond Norris. I tried the comparison with maxNumCompThreads(1) for the for loop and maxNumCompThreads("automatic") for the parfor loop. Now the parfor loop is 2 times faster than the for loop.
Just to be clear: the parfor loop is 2x faster than running the for-loop in single-threaded mode (not compared to multi-threaded mode), yes?
I did that:
maxNumCompThreads(1)
for .... end
maxNumCompThreads("automatic")
parfor .... end
It should be like this, no? Thank you.
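To make the comparison in this exchange unambiguous, a minimal sketch that times all three cases in one script might look like the following. It resets the accumulator before each run so the loops do not build on each other's result. The names E1, E2, E3, tSingle, tMulti and tParfor are illustrative, and n is reduced from 3000 to keep the run short. My understanding is that maxNumCompThreads set on the client applies to the client session, while pool workers have their own thread setting, so it mainly changes the for-loop timings here; please check this against your own cluster profile.

```matlab
% Sketch of a three-way timing comparison: single-thread for,
% multi-thread for, and parfor.
% Assumptions: n is reduced from the original 3000; E1, E2, E3, tSingle,
% tMulti and tParfor are illustrative names.
n = 500;
A = randn(n);
I = eye(n);
c = [0, -1/5, 1/5, -1/10, 1/10];
b = [128/3, 20/9, 85/3, -15, -(515/9)];

% 1) for-loop restricted to a single computational thread
maxNumCompThreads(1);
E1 = zeros(n);
tic
for k = 1:5
    E1 = E1 + b(k)*((I - c(k)*A)\(c(k)*A));
end
tSingle = toc;

% 2) for-loop with implicit multithreading restored
maxNumCompThreads('automatic');
E2 = zeros(n);
tic
for k = 1:5
    E2 = E2 + b(k)*((I - c(k)*A)\(c(k)*A));
end
tMulti = toc;

% 3) parfor (includes pool start-up time if no pool is open yet)
E3 = zeros(n);
tic
parfor k = 1:5
    E3 = E3 + b(k)*((I - c(k)*A)\(c(k)*A));
end
tParfor = toc;

fprintf('single-thread for: %.3f s\n', tSingle);
fprintf('multi-thread for:  %.3f s\n', tMulti);
fprintf('parfor:            %.3f s\n', tParfor);
```

A parfor time that beats tSingle but not tMulti matches the timings reported earlier in this thread.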


Release

R2022a
