Each PARFOR Worker Writes to the Same File
Show older comments
I understand that having each worker write to a single file is a no-no. Perhaps as expected, when I run this code, it shows some corrupted values in the final output file; about 10% fails.
I have no interest in the output being in a deterministic order. The workers are spread across multiple Linux machines. The amount of time to complete a run is long compared to the time to write a single line of output.
Can someone recommend an alternative?
% Run a parametric study
var1 = (-60:0.5:60)';
var2 = (-110:0.5:110)';
var3 = (3.5:0.5:18.5)';
% Remove zero entries since their usage prohibited
var1(var1 == 0) = [];
var2(var2 == 0) = [];
var3(var3 == 0) = [];
NS = length(var1)*length(var2)*length(var3); % Number of runs
% Set up the design matrix, desMat
desMat = {var1,var2,var3};
[desMat{:}]=ndgrid(desMat{:});
n=length(desMat);
desMat = reshape(cat(n+1,desMat{:}),[],n);
if exist('./Results.csv', 'file')==2
delete('./Results.csv');
end
parfor kk = 1:NS
var1a = desMat(kk,1); var2a = desMat(kk,2); var3a = desMat(kk,3);
[out1 out2 out3] = Function_Pd(var1a,var2a,var3a);
vec = [var1a var2a var3a out1 out2 out3];
fileID = fopen('Results.csv','a');
fprintf(fileID,'%f %f %f %f %f %f\n',vec);
fclose(fileID);
end
2 Comments
i don't have a solution. YOu can probably just change
fileID = fopen('Results.csv','a');
to
fileID = fopen( sprintf('Results_%02d.csv',kk),'a');
to save things in different files and then assemble them afterward.
Also, you might check out the exchange if you're just using parallelization for speed without requiring any interim communincation: https://www.mathworks.com/matlabcentral/fileexchange/13775-multicore-parallel-processing-on-multiple-cores
Paul Safier
on 3 Aug 2022
Accepted Answer
More Answers (3)
Jeff Miller
on 3 Aug 2022
0 votes
Maybe have each worker write to its own output file and then assemble those after? This answer shows how to get the id for each worker.
3 Comments
Bruno Luong
on 4 Aug 2022
The same suggestion has been discussed in the comment below the question
Jeff Miller
on 4 Aug 2022
Oh, sorry, I thought that suggestion was to write one file for each iteration of the parfor loop rather than for each separate worker.
Paul Safier
on 4 Aug 2022
Raymond Norris
on 12 Aug 2022
@Paul Safier since the order of the file doesn't have to be deterministic, use a data queue to write back to the client and have the client write the csv file.
% Run a parametric study
var1 = (-60:0.5:60)';
var2 = (-110:0.5:110)';
var3 = (3.5:0.5:18.5)';
% Remove zero entries since their usage prohibited
var1(var1 == 0) = [];
var2(var2 == 0) = [];
var3(var3 == 0) = [];
NS = length(var1)*length(var2)*length(var3); % Number of runs
% Set up the design matrix, desMat
desMat = {var1,var2,var3};
[desMat{:}]=ndgrid(desMat{:});
n=length(desMat);
desMat = reshape(cat(n+1,desMat{:}),[],n);
if exist('./Results.csv', 'file')==2
delete('./Results.csv');
end
fileID = fopen('Results.csv','a');
D = parallel.pool.DataQueue;
afterEach(D,@(V)logger(fileID,V))
c = onCleanup(@()fclose(fileID));
parfor kk = 1:NS
var1a = desMat(kk,1); var2a = desMat(kk,2); var3a = desMat(kk,3);
[out1 out2 out3] = Function_Pd(var1a,var2a,var3a);
vec = [var1a var2a var3a out1 out2 out3];
send(D,vec)
end
function logger(fileID,vec)
fprintf(fileID,'%f %f %f %f %f %f\n',vec);
end
4 Comments
Paul Safier
on 12 Aug 2022
Paul Safier
on 19 Aug 2022
Raymond Norris
on 19 Aug 2022
@Paul Safier somewhere/how, you've already closed your file. I can reproduce your warning here
fileID = fopen('Results.csv','a');
c = onCleanup(@()fclose(fileID));
fprintf(fileID,"%f\n",rand);
fclose(fileID);
>> safier
>> clear
Warning: The following error was caught while executing 'onCleanup' class destructor:
Error using fclose
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in safier>@()fclose(fileID) (line 2)
c = onCleanup(@()fclose(fileID));
Error in onCleanup/delete (line 23)
obj.task();
I wouldn't have gotten the warning if I hadn't called
fclose(fileID);
Paul Safier
on 23 Aug 2022
Alexander Denman
on 29 Jan 2023
0 votes
One option you might consider is using a database, for example PostgreSQL. Ensuring that concurrent writes don't interfere with each other is one of the core functions of a relational database management system.
To do this, you would install PostgreSQL on a machine that is reachable from all of your worker nodes, then create a table to hold the results. You can store essentially any matlab variable in a postgres "bytea" colum by using typecast(getByteStreamFromArray(someVariable),'int8') to convert the variable to one long stream of 8 bit integers.
Then, when each worker is ready to save its results, it opens a database connection using the 'database' function, uploads the results using 'sqlwrite' or 'datainsert', and then closes the connection.
When you retrieve the results from the database, you r-convert the data to its original form using getArrayFromByteStream(typecast(binaryDataFromDatabase),'uint8')).
1 Comment
Paul Safier
on 30 Jan 2023
Categories
Find more on Parallel Computing Fundamentals in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!