Is it possible to build a parfor loop where workers share and edit (only) one variable array?

Good day!
I am building a ray tracer to solve for thermal radiation within a solid.
Ray tracers are usually considered an exemplary algorithm for parallel computing: you can "launch" rays on parallel workers, each worker's computation is entirely independent, and at the end you sum up the workers' results.
Nevertheless, I need to modify this slightly so that each worker's computation depends on a variable that must be accessible and editable by all the other workers. In other words, most of the calculation is still independent, but at some intermediate steps a worker needs to check a "bookkeeping" or "counter" array (defined over the solid) and, depending on the value stored there, decide what to do. Each worker would also need to update this counter after checking its value.
Is there a way to do this?
If not, I can see a way of nesting a parfor inside a while loop, where only the independent calculation would run on the parallel workers. However, given the number of iterations, I suspect this method would most likely be slower than a plain for loop.
Looking for your comments, and thank you very much!
Sebastian.

 Accepted Answer

If I've understood correctly, you could use spmd to do something like this. spmd allows communication between the workers, so you can exchange values among them. It's a bit like your idea of nesting parfor inside while, but everything stays on the workers. Here's a very rough sketch of how things might proceed:
spmd
    done = false;
    while ~done
        % Each worker has an independent copy of 'myValue', but
        % we'll combine these later. For the sake of this simple
        % example, I'm going to assume we're getting a numeric
        % increment from each ray-trace, and we want to add them
        % all up.
        myValue = 0;
        % for-drange causes each worker to execute a different portion of
        % the loop. Worker 1 gets 1:(numRays/numlabs), and so on.
        for rIdx = drange(1:numRays)
            myValue = myValue + traceRay(rIdx);
        end
        % Use GPLUS to find the total by communicating across
        % all workers.
        totalValue = gplus(myValue);
        % Termination condition.
        done = totalValue > thresh;
    end
end
I hope this makes some sort of sense. Instead of gplus, you might need the more general gop, or perhaps you might need to roll your own communication using labSend and labReceive.
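For reference, gplus is just a special case of gop, which applies any associative reduction function across all workers. A minimal sketch (the per-worker value here is a placeholder, not from the thread above):

spmd
    localCount = labindex;            % placeholder per-worker value
    total  = gop(@plus, localCount);  % same result as gplus(localCount)
    maxVal = gop(@max,  localCount);  % any associative reduction works
end

Like gplus, every call to gop is collective, so all workers must make it the same number of times.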

6 Comments

Hi Edric,
First of all, thank you very much for your time and help.
Your proposed syntax does look a lot like what I need. I've been thinking about the implementation, and I would need some modifications. If I may, let me add the changes and ask some follow-up questions within the code:
spmd
    % done = false;
    % while ~done % I moved the while-loop inside the for-drange
    myValue = 0; % Shared 3D-array across workers
    % for-drange causes each worker to execute a different portion of
    % the loop. Worker 1 gets 1:(numRays/numlabs), and so on.
    for rIdx = drange(1:numRays)
        % So inside the for-drange is where we trace each ray.
        % The dependency on the shared array is inside this loop. It would
        % be something like the following:
        [flag, position] = traceRay(rIdx, direction); % flag = "hit" or "miss" @position
        while (flag == "hit") % I would nest the while here
            myValue(position) = myValue(position) + 1;
            myValue = gplus(myValue); % is it possible to update myValue here?
                                      % the next line needs the updated value
            if myValue(position) < capping_number
                [flag, position] = traceRay(position, new_direction); % we obtain a new hit/miss at a new position
            else
                flag = "miss";
            end
        end
    end
    % end % we nested the while loop inside the for-drange
end
Finally, I am not familiar with the operations/functions you are showing me, so I hope it's OK if I take my time to implement this before I close the question.
Again, thank you!
Sebastian.
The main thing you need to bear in mind is that all communication within spmd is either "two-sided" or "collective".
The pair labSend and labReceive are "two-sided" in the sense that your code needs to guarantee that matching calls are made so that a message can be delivered. You can't simply call labSend and hope that someone will make the labReceive call later - your algorithm must exactly match things up.
Things like gplus, gop, and labBroadcast are "collective" - this means that all workers must guarantee to make matching calls at the same time. (Well, the timing isn't required to be precise - but no worker can complete a collective call until every other worker has at least started the call).
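To illustrate the "two-sided" case, here's a minimal matched exchange between workers 1 and 2 (the value 42 is just a placeholder):

spmd
    if labindex == 1
        labSend(42, 2);     % must be matched by a labReceive on worker 2
    elseif labindex == 2
        x = labReceive(1);  % blocks until worker 1's labSend arrives
    end
end

If either call were missing, the other worker would block forever - this is exactly the kind of mismatch that causes deadlock.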
The reason I bring this up is that your inner while loop has a flaw that will cause it to deadlock - the termination condition for the loop is not guaranteed to be the same on each worker - as written, one worker could complete the loop before the others, and then it wouldn't enter the collective gplus method.
Actually, thinking about it, there's another flaw - calling gplus inside a for-drange loop is itself a problem. If the range of the loop isn't evenly divisible by the number of workers, then different workers will get different numbers of iterations.
I'm not sure enough of the overall shape of your existing algorithm to know quite what to suggest. But basically the overall pattern is that you need to make sure each worker in the spmd block does a bunch of "independent" work, and then all workers collectively update the shared value, remembering that the collective updates must happen the same number of times on each worker.
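One possible restructuring along those lines - purely a sketch, where sz, numRounds, and traceSomeRays are hypothetical placeholders for your array size, an iteration count agreed by all workers in advance, and your tracing step - is to move the collective call out of the drange loop so every worker makes it exactly once per round:

spmd
    localCounts = zeros(sz);        % per-worker increments (sz is an assumption)
    for round = 1:numRounds         % same fixed round count on every worker
        for rIdx = drange(1:numRays)
            % Independent work: accumulate hits locally only
            localCounts = localCounts + traceSomeRays(rIdx);
        end
        % Collective update: all workers reach this point once per round,
        % regardless of how the drange iterations were divided up
        sharedCounts = gplus(localCounts);
        localCounts = zeros(sz);    % avoid double-counting next round
    end
end

This keeps the collective call count identical on every worker, at the cost of the shared counter only being refreshed between rounds rather than after every bounce.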
Mmm yes. You read the script perfectly. Different loops will have different "running times". This is inherent to my tracer algorithm; one ray might be launched, traced, and lost without reflection/reemission (not even completing a single while loop), but another ray may be bouncing around for very long.
My question about gplus would be: can the workers wait for one or more lagging workers before doing the variable update? myValue(position) doesn't need to end up exactly equal to capping_number, so if the update yields a number higher than the cap, that's not a problem, as long as the termination condition is met in the following iterations. (There's also an issue with how not to compound the counters: they should not be reset to 0 in every iteration but should keep accumulating new "hit" flags. I think that's just a matter of writing it differently to avoid the problem.)
Would the following idea work around the problems you mention?
Use labSend and labReceive with one worker running a separate script. That worker would receive the data from all the other workers at every iteration, add it up, and share it back to the rest of the workers at the start of the next iteration - something like one worker doing the counting and the rest doing the actual ray tracing.
Best,
Sebastian
Something like you propose might work. Here's a sketch
spmd
    % This is the state we wish to share
    overallState = 0;
    if labindex == 1
        numComplete = 0; % number of workers that are finished
        while numComplete < (numlabs-1)
            % There are workers still working
            [data, srcIdx] = labReceive(); % receive from any worker
            if isempty(data)
                % Worker is finished, doesn't expect a reply
                numComplete = 1 + numComplete;
            else
                overallState = overallState + data; % or whatever
                labSend(overallState, srcIdx); % reply with updated state
            end
        end
    else
        for idx = 1:n
            % Do some independent work
            data = doStuff(overallState);
            % Need updated state - send 'data' to lab 1
            labSend(data, 1)
            % Reply is the updated state
            overallState = labReceive(1);
        end
        % This worker is now complete, but we must tell worker 1
        labSend([], 1);
    end
end
This may or may not be efficient depending on how many workers you have vs. how long the "independent work" takes. One of the challenges here might be "load-balancing" - i.e. ensuring each worker has enough "independent work" to do. If you wanted, you could get worker 1 to send the work as well as the overallState to the workers.
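A possible shape for that last idea - worker 1 handing out ray indices along with the state, so faster workers automatically pick up more work - might look like the following. This is an untested sketch; traceRay's signature and numRays are placeholders from the thread above:

spmd
    overallState = 0;
    if labindex == 1
        nextRay = 1;
        numComplete = 0;
        while numComplete < (numlabs - 1)
            [data, srcIdx] = labReceive();   % result, or [] on first contact
            if ~isempty(data)
                overallState = overallState + data;
            end
            if nextRay <= numRays
                % Reply with updated state plus the next work item
                labSend({overallState, nextRay}, srcIdx);
                nextRay = nextRay + 1;
            else
                labSend({}, srcIdx);         % empty cell: no more work
                numComplete = numComplete + 1;
            end
        end
    else
        labSend([], 1);                      % announce readiness, no result yet
        msg = labReceive(1);
        while ~isempty(msg)
            [overallState, rayIdx] = msg{:};
            data = traceRay(rayIdx, overallState);  % placeholder call
            labSend(data, 1);
            msg = labReceive(1);
        end
    end
end

Every labSend to worker 1 is matched by one labReceive in its loop, and each tracing worker receives the empty-cell termination message exactly once, so the message counts balance and the block should not deadlock.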
I will try implementing this and see how it goes.
The load will be roughly balanced, as the number of rays is large (>1e5), and the probabilities of each ray bouncing or being terminated are similar irrespective of its position.
I think this answers the original question, and now what's left for me to do is the actual implementation of the methodology.
Thank you very much for your help!!!!!
Sebastian.


Release

R2020b
