Handling memory when working with very huge data (.mat) files.
Show older comments
I am working with two 5D arrays (A5D and B5D) saved in a big_mat_file.mat file. The size of these arrays is specified in the code below. The total size of big_mat_file.mat file is around 20GB. I want to perform three simple operations on these matrices, as shown in the code. I have access to my university's computing cluster. When I run the following code with 120 workers and 400GB of memory, I receive the following error
In distcomp/remoteparfor/handleIntervalErrorResult (line 245) In distcomp/remoteparfor/getCompleteIntervals (line 395) In parallel_function>distributed_execution (line 746) In parallel_function (line 578)
Can someone please help me understanding what is causing this error. Is it because of low memory? It there anyother way to do the following operattions?
clear; clc;
load("big_mat_file.mat");
% it has two very huge 5D arrays "A5D" and "B5D", and two small arrays "as" and "bs"
% size of both A5D and B5D is [41 16 8 80 82]
% size of "as" is [1 80] and size of "bs" is [1 82]
xs = -12:0.1:12;
NX = length(xs);
ys = 0:0.4:12;
NY = length(ys);
total_iterations = NX * NY;
results = zeros(total_iterations , 41 , 16, 8);
XXs = zeros(total_iterations, 1);
YYs = zeros(total_iterations, 1);
parfor idx = 1:total_iterations
[ix, iy] = ind2sub([NX, NY], idx);
x = xs(ix);
y = ys(iy);
term1 = 1./(exp(1/y*(A5D-x)) + 10); %operation 1
to_integrate = B5D.*term1; %operation 2
XXs(idx) = x;
YYs(idx) = y;
results(idx, :, :, :) = trapz(as,trapz(bs,to_integrate,5),4); %operation 3
end
XXs = reshape(XXs, [NX, NY]);
YYs = reshape(YYs, [NX, NY]);
results = reshape(results, [NX, NY, 41, 16, 8]);
clear A5D B5D
save('saved_data.mat','-v7.3');
Accepted Answer
More Answers (1)
Sam Marshalik
on 30 Aug 2024
0 votes
You are likely running out of memory on the workers. You are not using sliced input variables (Sliced Variables - MATLAB & Simulink (mathworks.com) to access the 5D matrices and are sending the entire copy to each worker. They are likely big enough that you are running out of memory on those machines. I would suggest to run less workers (to give them access to more memory per worker), try using sliced input variables and pass only part of the matrix to the workers, or run on machines with more memory.
To test this theory, you can run your work and monitor memory usage on those machines - if this is the issue, you should see it max out.
1 Comment
Luqman Saleem
on 31 Aug 2024
Categories
Find more on Parallel for-Loops (parfor) in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!