Workers during parfor-loop shutting down/going to sleep
Dear Matlab-community,
I have an issue with parfor-loops which slowly but steadily drives me crazy. I am currently using the Parallel Computing Toolbox for some downscaling experiments with geospatial data. Within the loop, I load time series from two 3D NetCDF files (each time series represents a single pixel), do some calculations with the data, and write the output to another NetCDF file. The 3D arrays have the dimensions 697x381x1462 (longitude x latitude x time). In total, I want to process around 150 000 pixels.
When I start the code, the parfor-loop runs perfectly. Even the first 50 000 pixels are done within several hours. After that, however, the workers go into some kind of sleep mode, one after another, until the whole parfor-loop no longer iterates at all. When I restart (i.e. I keep the already processed pixels and continue with the first "unprocessed" pixel), the code again runs through the first pixels (around 1000), after which the workers again shut down or go to sleep.
The duration of each iteration does not change (please look at the attached figure). This also holds true for the I/O operations (i.e. the reading and writing of the data). Please note that I do not accumulate any data in memory (i.e. there is no variable which gets bigger and bigger!). All processes within the loop are completely independent of other iterations (i.e. I simply pass a pixel ID, which is then processed).
Even after countless tries, I cannot manage to run through the whole dataset, which is absolutely annoying...
The code within the parfor-loop is added below. I would appreciate any help and I am really looking forward to any comments.
Best regards and many thanks in advance, Christof

The image shows 30 of the workers (distinguished by color and symbol), the duration of each iteration (y-axis), and the actual time (x-axis). Workers 1 to 11 appear to stop iterating after less than 8 minutes.
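The post does not show how the timing data in the figure were collected. A minimal sketch of one way to record such data is given here, assuming the Parallel Computing Toolbox is available; the names n_iter, worker_id, t_begin and duration are hypothetical, and pause(0.01) merely stands in for the real per-pixel work.

```matlab
% Sketch: record which worker ran each iteration, when, and for how long.
% Requires Parallel Computing Toolbox (getCurrentTask).
n_iter    = 20;                 % assumed number of iterations
worker_id = zeros(n_iter, 1);   % 0 means "not run on a pool worker"
t_begin   = zeros(n_iter, 1);   % wall-clock start, seconds since 1970
duration  = zeros(n_iter, 1);   % seconds per iteration

parfor i = 1:n_iter
    t_begin(i) = posixtime(datetime('now'));
    timer_i = tic;

    pause(0.01);                % placeholder for the real work

    duration(i) = toc(timer_i);

    task = getCurrentTask();    % empty when not running on a pool worker
    if isempty(task)
        worker_id(i) = 0;
    else
        worker_id(i) = task.ID;
    end
end

% One point per iteration: start time against duration, colored by worker
scatter(t_begin - min(t_begin), duration, 20, worker_id, 'filled');
xlabel('Time since start (s)');
ylabel('Iteration duration (s)');
```

A worker whose points stop appearing along the x-axis while others continue is one that has stalled rather than slowed down, which is the pattern described for workers 1 to 11.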
parfor i = 1:length(ids)
    % Get the ID of the current worker
    t = getCurrentTask();
    % Transform the ID to row- and column-indices
    [rw, clm] = ind2sub([nlat nlon], ids(i));
    % Synchronize all workers
    labBarrier
    % Add a small delay to each worker --> avoid read-clashes
    pause(t.ID/2);
    % Load data for cal/val
    x_cal = squeeze(ncread(fnme_x_cal, varnme, [clm, rw, 1], ...
        [1 1 Inf]));
    y_cal = squeeze(ncread(fnme_y_cal, varnme, [clm, rw, 1], ...
        [1 1 Inf]));
    x_val = squeeze(ncread(fnme_x_val, varnme, [clm, rw, 1], ...
        [1 1 Inf]));
    % Check if the loaded vector contains real data; if this is not the
    % case, the function replaces the current pixels with missing values
    if ~all(isnan(x_val))
        % Try to execute copula merge; if copula_merge can not be executed,
        % replace the current pixels with missing values
        try
            [x_val_out, x_val_std, theta, cpla] = ...
                copula_merge(x_cal, y_cal, x_val, merge_settings);
        catch
            x_val_out = 1e+20*ones(length(tme_out), 1);
            x_val_std = 1e+20*ones(length(tme_out), 1);
            val_ids = 1e+20*ones(length(tme_out), 1);
            theta = 1e+20;
            cpla = 0;
        end
    else
        x_val_out = 1e+20*ones(length(tme_out), 1);
        x_val_std = 1e+20*ones(length(tme_out), 1);
        val_ids = 1e+20*ones(length(tme_out), 1);
        theta = 1e+20;
        cpla = 1e+20;
    end
    % Write the results to the output files
    ncwrite(fnme_out, varnme, x_val_out, [1, rw, clm]);
    % The standard deviation goes to its own file
    ncwrite(fnme_std, [varnme, '_std'], x_val_std, [1, rw, clm]);
    ncwrite(fnme_cpla, 'Copula', cpla, [rw, clm]);
    ncwrite(fnme_theta, 'Theta', theta, [rw, clm]);
end
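In the loop above, every worker calls ncwrite on the same output files. One possible restructuring, sketched here under assumptions and not confirmed as the cause of the stalls described in this post, is to let the workers only compute and to have the client write the results chunk by chunk. Everything in the sketch is hypothetical: process_pixel stands in for the load and copula_merge steps, chunk_size is an arbitrary value, and the small temporary file with variable 'var' replaces the real output files.

```matlab
% Sketch: compute in parallel, write serially from the client, chunk by chunk.
nlat = 4; nlon = 5; nt = 3;           % tiny hypothetical grid
ids = 1:(nlat*nlon);                  % pixel IDs to process
chunk_size = 8;                       % assumed value; tune to available memory

% Temporary output file with dimensions time x lat x lon
fnme_out = [tempname, '.nc'];
nccreate(fnme_out, 'var', ...
    'Dimensions', {'time', nt, 'lat', nlat, 'lon', nlon});

% Placeholder computation: a time series filled with the pixel ID
process_pixel = @(id) id * ones(nt, 1);

for first = 1:chunk_size:numel(ids)
    last = min(first + chunk_size - 1, numel(ids));
    chunk_ids = ids(first:last);
    results = zeros(nt, numel(chunk_ids));

    % Workers only compute; each iteration fills its own column
    parfor k = 1:numel(chunk_ids)
        results(:, k) = feval(process_pixel, chunk_ids(k));
    end

    % Only the client touches the output file
    for k = 1:numel(chunk_ids)
        [rw, clm] = ind2sub([nlat nlon], chunk_ids(k));
        ncwrite(fnme_out, 'var', results(:, k), [1, rw, clm]);
    end
end
```

Because each chunk is written before the next one starts, a restart only has to resume at the first unwritten chunk, and memory use is bounded by chunk_size rather than by the total number of pixels.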