MATLAB Answers

Parfor loop runs 'save' correctly on only around half of iterations

aboharbf on 15 Dec 2019
Commented: aboharbf on 18 Dec 2019
I'm using parfor (Parallel Computing Toolbox) to run a batch analysis of data. The analysis script contains around 50 calls of the form save(X, ..., '-append'), each of which adds variables to an existing file. Over 200 runs, half of the runs seem either not to run this line or not to actually save its variables. Two saves located shortly before it in the script are saved reliably, but something goes wrong after that point.
The data is being generated and used to make figures, so the computation itself runs. Nothing raises an error and everything proceeds as normal until I try to retrieve the variables from the file, at which point a script I run afterwards finds that they are missing.
I've seen other discussions about saving inside parfor, but in my case there is no error, and sometimes it works fully. The runs that don't store these variables occur in blocks throughout the 200 (not all at the beginning or end), which seems to suggest that a specific worker may be having the issue (I run 4). Any insights into this would be appreciated.


Answers (1)

Edric Ellis on 16 Dec 2019
In general, it is not safe to have multiple workers attempting to save to a single file. The results are likely to be unpredictable, as you have observed: if two workers happen to attempt to save at precisely the same moment, it's possible that only one of the saves will actually occur.

I would recommend saving to unique files. The simplest way is to derive the file name from the parfor loop index, and then post-process on the client to amalgamate the data if required. A slightly more sophisticated approach uses a single file per worker by basing the file name on the task ID (see getCurrentTask), as in the following example:
% Clean up prior runs
delete data_*.mat

% Main loop - save to one file per worker
parfor i = 1:200
    varName = sprintf('X_%d', i);
    varValue = magic(i);
    fname = getFname();
    doSave(fname, varName, varValue);
end

% Client loop - amalgamate into a single file
fnames = dir('data_*.mat');
for i = 1:numel(fnames)
    s = load(fnames(i).name);
    if i == 1
        args = {};
    else
        args = {'-append'};
    end
    save('result.mat', '-struct', 's', args{:});
end

% Simple wrapper around SAVE to allow it to be used in PARFOR
function doSave(fname, varName, varValue)
    s.(varName) = varValue;
    if exist(fname, 'file')
        args = {'-append'};
    else
        args = {};
    end
    save(fname, '-struct', 's', args{:});
end

% Choose a unique file name per worker
function fname = getFname()
    t = getCurrentTask();
    if ~isempty(t)
        fname = sprintf('./data_%d.mat', t.ID);
    else
        % Get here on the client
        fname = './data_0.mat';
    end
end
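
The simpler option mentioned above, deriving the file name from the parfor loop index, could look something like the sketch below. The folder outDir, the iter_ file prefix, and the helper saveIter are names made up for illustration, and the sketch assumes the workers can see the same folder as the client (which holds for a local pool). Because every iteration writes its own file, no '-append' is needed and no two workers ever write to the same file.

```matlab
% Sketch: one file per loop index, merged afterwards on the client.
% outDir, the iter_ prefix, and saveIter are hypothetical names.
outDir = tempname;      % scratch folder, assumed visible to all workers
mkdir(outDir);
N = 8;                  % small iteration count, for illustration only

parfor i = 1:N
    fname = fullfile(outDir, sprintf('iter_%04d.mat', i));
    saveIter(fname, sprintf('X_%d', i), magic(i));
end

% Client side: every file holds a distinct variable name, so collecting
% them into one struct does not overwrite anything.
merged = struct();
files = dir(fullfile(outDir, 'iter_*.mat'));
for k = 1:numel(files)
    s = load(fullfile(outDir, files(k).name));
    names = fieldnames(s);
    for n = 1:numel(names)
        merged.(names{n}) = s.(names{n});
    end
end
save(fullfile(outDir, 'result.mat'), '-struct', 'merged');

% SAVE cannot be called directly in the body of a PARFOR loop,
% so it is wrapped in a function.
function saveIter(fname, varName, varValue)
    s.(varName) = varValue;
    save(fname, '-struct', 's');
end
```

The trade-off compared with one file per worker is a larger number of small files, in exchange for not needing getCurrentTask or any append logic.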

  1 Comment

aboharbf on 18 Dec 2019
So actually these are all saving to their own files. That is part of the oddity.
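
Since each run writes its own file, one way to narrow this down might be to list which expected variables each output file actually lacks, and see whether the incomplete files line up with particular iterations or workers. The sketch below is only illustrative: the run_ file pattern, the demoDir folder, and the variable names a and b are placeholders, not the names used in the real analysis.

```matlab
% Sketch: report which expected variables are absent from each output file.
% demoDir, the run_ prefix, and the variables a and b are hypothetical.
demoDir = tempname;
mkdir(demoDir);
a = 1;
b = 2;
save(fullfile(demoDir, 'run_001.mat'), 'a', 'b');   % complete file
save(fullfile(demoDir, 'run_002.mat'), 'a');        % file lacking b

expected = {'a', 'b'};
files = dir(fullfile(demoDir, 'run_*.mat'));
incomplete = {};
for k = 1:numel(files)
    fpath = fullfile(demoDir, files(k).name);
    present = who('-file', fpath);           % names stored in the MAT-file
    missing = setdiff(expected, present);
    if ~isempty(missing)
        incomplete{end+1} = files(k).name;   %#ok<AGROW>
        fprintf('%s is missing: %s\n', files(k).name, ...
            strjoin(missing(:)', ', '));
    end
end
```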
