Efficient use of matfile within parfor

I have a very simple code that works pretty well when the size of the involved objects is manageable by the memory.
%%% Variables
% Z -> matrix of dimensions (r,T)
% V -> matrix of dimensions (r,r,T)
% C -> matrix of dimensions (n,r)
% nanY -> matrix of dimensions (n,T)
% y -> matrix of dimensions (n,T)
R_new_T = zeros(n,1);
parfor t=1:T
yC = diag(~nanY(:,t))*C;
R_new_T = R_new_T + (y(:,t)-yC*Z(:,t+1)).^2 ...
+ diag(yC*V(:,:,t+1)*yC') ...
+ Rvec.*nanY(:,t);
end
In practical application, that bunch of code has to run with the parameter n taking values around 80k leading quickly to an unfeasible memory load. The solution I figured out relies on matfiles, at the expenses of paralellizionation and general performance. This is the version with which I am able to handle it for large n.
yC = zeros(n,r);
save var_cyc.mat nanY y yC -v7.3;
v = matfile('var_cyc.mat', 'Writable', true);
R_new_T = zeros(n,1);
for t=1:T
v.yC = diag(~v.nanY(:,t))*C;
R_new_T = R_new_T + (v.y(:,t) - v.yC*Z(:,t+1)).^2 ...
+ diag(v.yC*V(:,:,t+1)*v.yC') ...
+ Rvec.*(v.nanY(:,t)) ;
end
Unfortunately this script cannot be paralellizable for classification problems related to the variable v. I'm wondering if there is a way to recover the initial parallelization structure using matlab file to handle huge file without incurring in 'out of memory errors'.
Thanks.

Answers (1)

The problem is all to do with your assignment into v.yC. The parfor analysis cannot tell that this is safe and not order-dependent - it doesn't consider the fact that you're completely overwriting one field of the object. So, I think the simplest workaround is not to place yC in the matfile, and instead keep this as a loop temporary variable. In other words:
parfor t=1:T
yC = diag(~v.nanY(:,t))*C;
R_new_T = R_new_T + (v.y(:,t) - yC*Z(:,t+1)).^2 ...
+ diag(yC*V(:,:,t+1)*yC') ...
+ Rvec.*(v.nanY(:,t)) ;
end
This shouldn't cause any additional memory usage since each worker already needs a full copy of yC in memory in order to perform the other operations.

8 Comments

I tried your solution
parfor t=1:T
yC = diag(~v.nanY(:,t))*C;
R_new_T = R_new_T + (v.y(:,t) - yC*Z(:,t+1)).^2 ...
+ diag(yC*V(:,:,t+1)*yC') ...
+ Rvec.*(v.nanY(:,t)) ;
end
I get the following error.
"Conversion to double from logical is not possible."
As if yC is imported in double format to save space (NOTE: nanY(:,t) is a vector of 0s and 1s) and it is impossible to convert it to double and get the multiplication done.
Moreover, what if R_new_T becomes too large leading to out of memory error? What are possible strategies to write on a mat-file as an output of a parfor.
For example, the following code leads to errore due to parallelization:
yC = zeros(n,r);
R_new_T = zeros(n,1);
save var_cyc.mat nanY y yC R_new_T -v7.3;
v = matfile('var_cyc.mat', 'Writable', true);
parfor t=1:T
v.yC = diag(~v.nanY(:,t))*C;
v.R_new_T = v.R_new_T + (v.y(:,t) - v.yC*Z(:,t+1)).^2 ...
+ diag(v.yC*V(:,:,t+1)*v.yC') ...
+ Rvec.*(v.nanY(:,t)) ;
end
I can't work out where the conversion error is coming from - it's not clear why parfor should be different from for in that regard.
R_new_T shouldn't be growing in size as you proceed - you're preallocating it to be a vector, and the repeated additions aren't changing the size of it are they?
No, repeated additions aren't changing the size. But I need to generalize the loop where the results have to be stored in a (n,n) matrix.
You didn't answer yet
Please could you post a simple (but representative) standalone example that demonstrates the problem.
To have a standalone code, you just need to simulate the initial objects.
Here it is:
% Initialize data at random
n = 60000;
r = 3;
T = 100;
Z = randn(r,T);
V = randn(r,r,T);
C = rand(n,r);
y = randn(n,T);
y(y==0) = NaN;
% Run the code
yC = zeros(n,r);
R_new_T = zeros(n,1);
save var_cyc.mat nanY y yC R_new_T -v7.3;
v = matfile('var_cyc.mat', 'Writable', true);
parfor t=1:T
v.yC = diag(~v.nanY(:,t))*C;
v.R_new_T = v.R_new_T + (v.y(:,t) - v.yC*Z(:,t+1)).^2 ...
+ diag(v.yC*V(:,:,t+1)*v.yC') ...
+ Rvec.*(v.nanY(:,t)) ;
end
This code is missing nanY and Rvec, so I added them, like this:
% Initialize data at random
n = 60000;
r = 3;
T = 100;
Z = randn(r,T);
V = randn(r,r,T);
C = rand(n,r);
y = randn(n,T);
y(y==0) = NaN;
% adding nanY and Rvec
nanY = isnan(y); % ??
Rvec = zeros(n,1); % ??
% Run the code
R_new_T = zeros(n,1);
save var_cyc.mat nanY y -v7.3;
clear y nanY
v = matfile('var_cyc.mat');
parfor t=1:(T-1)
nanYCol = v.nanY(:,t);
yC = nanYCol.*C;
R_new_T = R_new_T + (v.y(:,t) - yC*Z(:,t+1)).^2 ...
+ diag(yC*V(:,:,t+1)*yC') ...
+ Rvec.*nanYCol ;
end
This works, but it uses a lot of memory. The problem is essentially that each worker needs to make an n-by-n temporary. I was able to remove the first one of those by using yC = nanYCol .* C rather than forming the diagonal matrix. However, the second line of the computation of R_new_T also forms an n-by-n temporary only to extract the diagonal. It might be worth trying to refactor that so that it doesn't need the whole matrix.
Thanks. It should work. Just for consistency yC definition need a correction
yC = (~nanYCol).*C;
The second line can refactored using only matrices of dimension (n,r)
sum((yC*V(:,:,t+1)).*yC,2)

Sign in to comment.

Categories

Find more on Loops and Conditional Statements in Help Center and File Exchange

Asked:

on 27 Apr 2020

Commented:

on 5 May 2020

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!