Vectorise or Parallel Computing
Can this for loop be vectorized, or rewritten to use parfor? If so, how should I do it?
for edgeID = 1:size(IE,1)
    self = selfs(edgeID);
    sdl(self) = sdl(self) + sdl_edge(edgeID);           % add frac to self
    res(:,self) = res(:,self) + flux_edge(:,edgeID);    % add flux to self residual
end % internal edge iteration ends
"selfs" is an index array in some non-sequential order. I want to loop over this array and accumulate values at the positions it specifies, rather than at positions 1, 2, 3, 4, 5 in sequence.
I have tried several ways but failed...
Accepted Answer
Jan
on 21 Nov 2019
This loop cannot be parallelized. If flux_edge is a vector and not a matrix, accumarray would solve the problem efficiently. Try this:
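% UNTESTED and most likely BUGGY! Assumes sdl is a row vector.
[G, v] = findgroups(selfs(:));    % G: group number per edge, v: distinct values of selfs
sdl(v) = sdl(v) + accumarray(G, sdl_edge(:)).';
resCell = splitapply(@(c) {sum(c, 1).'}, flux_edge.', G);    % splitapply groups by rows
res(:, v) = res(:, v) + cat(2, resCell{:});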
The values of selfs are missing. Therefore I cannot test the code and I assume, it contains serious bugs. I assume you can find the remaining problems and modify the code until it solves your needs.
If the problem is time-critical (the bottleneck of the total program), I'd write a C-mex function. Accumulating the results in cells and joining them afterwards is not memory-efficient.
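One possible way to avoid the intermediate cell array is to call accumarray with two-column (row, column) subscripts, so that the matrix case is handled in a single call. The following is only a sketch on a small made-up data set (4 nodes, 6 edges; the names nNode, rowIdx and colIdx are hypothetical), assuming that selfs contains positive integer indices and that flux_edge has one column per edge:

```matlab
% Hypothetical small example: 4 nodes, 6 edges, 5 flux components
selfs     = [2; 4; 2; 1; 4; 2];      % target node of each edge (repeats allowed)
sdl_edge  = (1:6).';                 % one value per edge
flux_edge = reshape(1:30, 5, 6);     % one column per edge
nNode     = 4;
sdl = zeros(1, nNode);
res = zeros(5, nNode);

% Scalar quantity: sum sdl_edge over all edges that share a target node
sdl = sdl + accumarray(selfs, sdl_edge, [nNode, 1]).';

% Matrix quantity: build a (row, column) subscript for every entry of flux_edge
[rowIdx, colIdx] = ndgrid(1:size(flux_edge, 1), selfs);
res = res + accumarray([rowIdx(:), colIdx(:)], flux_edge(:), size(res));

disp(sdl)    % 4 10 0 7
```

Because the sums are added to the existing sdl and res, this keeps the accumulate-in-place behaviour of the original loop. Whether it is actually faster than the loop depends on the problem size and should be measured.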
The size of selfs matters. It might be more efficient to first collect the distinct values with unique and then run the loop over that list:
% UNTESTED
v = unique(selfs);                  % distinct target indices
sdl = zeros(1, 7219);
res = zeros(5, 7219);
for iv = 1:numel(v)
    av = v(iv);
    mask = (selfs == av);           % edges that contribute to index av
    sdl(av) = sum(sdl_edge(mask));
    res(:, av) = sum(flux_edge(:, mask), 2);
end
If this runs reasonably fast, you can parallelize it with parfor:
% UNTESTED
v = unique(selfs);
nv = numel(v);
A = zeros(1, nv);
B = zeros(5, nv);
parfor iv = 1:nv
    av = v(iv);
    mask = (selfs == av);
    A(iv) = sum(sdl_edge(mask));
    B(:, iv) = sum(flux_edge(:, mask), 2);
end
sdl = zeros(1, 7219);
sdl(v) = A;
res = zeros(5, 7219);
res(:, v) = B;
Jan
on 23 Nov 2019
Sorry, the original code needs 0.002 seconds on my R2018b system. I do not see a way to accelerate this substantially, because it is already very fast. My suggested solutions are much slower than the original approach.
Do you work with much larger problems than the posted data?
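To see how the timings change with problem size, a rough comparison along the following lines could be run. This is only a sketch on synthetic random data: the node count 7219 and the 5 rows follow the sizes used in the answer above, while nEdge is an assumed value that should be replaced by the real number of edges. The accumarray variant shown is one possible vectorized form, not a method confirmed in this thread.

```matlab
% Synthetic data; nEdge is an assumption, adjust it to the real problem
nNode = 7219;  nRow = 5;  nEdge = 20000;
selfs     = randi(nNode, nEdge, 1);
sdl_edge  = rand(nEdge, 1);
flux_edge = rand(nRow, nEdge);

% Original loop
tic;
sdl1 = zeros(1, nNode);
res1 = zeros(nRow, nNode);
for edgeID = 1:nEdge
    self = selfs(edgeID);
    sdl1(self) = sdl1(self) + sdl_edge(edgeID);
    res1(:, self) = res1(:, self) + flux_edge(:, edgeID);
end
tLoop = toc;

% accumarray variant with (row, column) subscripts
tic;
sdl2 = accumarray(selfs, sdl_edge, [nNode, 1]).';
[r, c] = ndgrid(1:nRow, selfs);
res2 = accumarray([r(:), c(:)], flux_edge(:), [nRow, nNode]);
tAcc = toc;

fprintf('loop: %.4f s, accumarray: %.4f s\n', tLoop, tAcc);

% Both variants should agree up to floating-point rounding
maxDiffSdl = max(abs(sdl1 - sdl2));
maxDiffRes = max(abs(res1(:) - res2(:)));
```

A single tic/toc measurement is noisy, so it is worth repeating the run a few times before drawing conclusions.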