Debugging parfor

Joan Puig on 15 Jun 2011
Answered: Dimitrij Chudinzow on 29 Jun 2017
Hi,
We have been working on parallelizing our code, and we have found that when an error occurs inside a parfor loop it is hard to debug. What methods do you use to figure out what is going on (especially when the serial version works perfectly)?
More specifically, when I run this code, I would expect to get a My:Error with the stack trace pointing to that particular line of code. Instead, we get an error pointing to an internal function, and the stack trace shows the "parfor" line as the source of the problem:
clear();
clc();
r = [];
try
    parfor i = 1:10
        r(i) = rand(1,1);
        if r(i) < 0.9
            error('My:Error','Try again');
        end
    end
catch le
    le
    for j = 1:numel(le.stack)
        le.stack(j)
    end
    rethrow(le);
end
Output:
le =
MException
Properties:
identifier: 'My:Error'
message: 'Try again'
cause: {0x1 cell}
stack: [2x1 struct]
Methods
ans =
file: 'C:\Program Files\MATLAB\R2011a\toolbox\matlab\lang\parallel_function.m'
name: 'parallel_function'
line: 475
ans =
file: 'D:\SynapticPoint\SourceTrunk\Matlab\ScratchPad\scr_error_in_parfor.m'
name: 'scr_error_in_parfor'
line: 7
??? Error using ==> parallel_function at 475
Try again
Error in ==> scr_error_in_parfor at 7
parfor i = 1:10
>>

Answers (3)

Dimitrij Chudinzow on 29 Jun 2017
My approach is to replace "parfor" with "for". This way you will find the line that causes trouble, but unfortunately it will take more time, since parallel computing is disabled for that particular loop.
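A variation on this idea that avoids editing the loop keyword is the optional second argument of parfor, which caps the number of workers; passing 0 runs the loop body in the client MATLAB session. The sketch below is only an illustration: `run_loop`, `debugMode`, and the worker cap of 4 are assumed names and values, and the loop body is a placeholder.

```matlab
function r = run_loop(debugMode)
% run_loop and debugMode are hypothetical names used for illustration.
if debugMode
    maxWorkers = 0;   % 0 workers: the loop body runs in the client session
else
    maxWorkers = 4;   % assumed cap; fewer are used if fewer are available
end
r = zeros(1, 10);
parfor (i = 1:10, maxWorkers)
    r(i) = i^2;       % placeholder for the real loop body
end
end
```

Calling `run_loop(true)` while chasing a bug and `run_loop(false)` otherwise keeps a single copy of the loop.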

Edric Ellis on 16 Jun 2011
Firstly, there should be few differences between running your code containing PARFOR with MATLABPOOL closed and with MATLABPOOL open, except that with the pool closed you can set breakpoints inside functions called from the loop body. For example, if you have code like:
parfor ii = 1:10
    x(ii) = myFcn(ii);
end
You can set breakpoints inside myFcn().
Secondly, if you put your code inside a function rather than a script, you should get better diagnostics. I simplified your code a little:
function pfeg
try
    parfor i = 1:10
        if rand < 0.9
            error('My:Error','Try again');
        end
    end
catch le
    getReport( le )
end
and this now gets the output:
Error using ==> parallel_function at 598
Error in ==> pfeg>(parfor body) at 5
Try again
Error in ==> pfeg at 3
parfor i = 1:10
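If it also helps to know which iteration failed, one option is to catch the error inside the loop body and re-throw it with the loop index attached. This is a sketch under assumptions, not part of the answer above: `pfeg_tagged`, `riskyStep`, and the `My:IterationFailed` identifier are made-up names, and `riskyStep` fails deliberately for one index.

```matlab
function r = pfeg_tagged
% pfeg_tagged and riskyStep are hypothetical names used for illustration.
r = zeros(1, 10);
parfor i = 1:10
    try
        r(i) = riskyStep(i);
    catch err
        % Wrap the original error so the failing iteration is recorded.
        wrapped = MException('My:IterationFailed', ...
            'Iteration %d failed: %s', i, err.message);
        wrapped = addCause(wrapped, err);
        throw(wrapped);
    end
end
end

function y = riskyStep(i)
% Stand-in for real work; fails for one iteration on purpose.
if i == 7
    error('My:Error', 'Try again');
end
y = i;
end
```

The caller can then wrap `pfeg_tagged` in try/catch and call getReport on the caught exception, as in the function above.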

Joan Puig on 16 Jun 2011
It's true that the computational parts of the code generate the same errors with or without the matlabpool open, which is a good thing.
On the other hand, we have some "configuration" problems where, for example:
- The Java class path is not set correctly on the workers.
- The state of the data cache might be different on the different workers.
- Database connections on the workers might be in a different state.
- Datafeed connections on the workers might be in a different state.
All these situations are very hard to debug if we can't even find out what line of code is causing the problem.
  1 Comment
Edric Ellis on 17 Jun 2011
Hi Joan, do you *not* get the line of code in the error stack when the parfor loop is inside a function body?
And yes, we only ensure that the MATLAB path is synchronised between client and workers; you must deal with any other setup that's required yourself.
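For the Java class path case, one possible way to see what each worker has is to ask the workers directly with spmd. This is only a sketch: it assumes a pool is already open, `check_worker_classpath` is a made-up name, and the jar path passed in is whatever your own setup uses.

```matlab
function ok = check_worker_classpath(jarPath)
% check_worker_classpath is a hypothetical helper used for illustration.
% Returns one logical per worker: true if jarPath is on that worker's
% dynamic Java class path.
spmd
    hasJar = any(strcmp(javaclasspath('-dynamic'), jarPath));
end
% hasJar is a Composite; copy each worker's value back to the client.
ok = false(1, numel(hasJar));
for w = 1:numel(hasJar)
    ok(w) = hasJar{w};
end
end
```

A worker that reports false could then be fixed with javaaddpath inside a similar spmd block before the parfor loop runs.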
