Is there a way to speed up a for loop when it is used with a GPU?

Hello,
At the moment, a for loop is the bottleneck in my code. I know that the GPU does not work well with indexing, and because of the for loop the calculations keep switching between GPU and CPU memory. Does anyone have a suggestion for speeding up this part, or is this a dead end that cannot be optimized further because of the memory switching? In my case, lab is 200000000x130 and dydis is 100000.
function [a,b]=skaicia (lab,dydis,z)
    comi=gpuArray(0.05);
    t=gpuArray(0.6);
    d=gpuArray(50001);
    langas=gpuArray(50000);
    atidaryta=gpuArray(50000);
    x1 = zeros(dydis,65,1);
    for i=1:z
        x1(:,:,i)=lab(i*dydis+1-dydis:i*dydis,:);
        x=gpuArray(x1(:,:,i));
        x23=x(1:end-d,:);
        [n1,n2]=size(x);
        n1=gpuArray(n1);
        n2=gpuArray(n2);
        xt=permute(x,[2 1 3]);
        dx1=(d-langas-1:d-2);
        dx=permute(dx1,[2 1])+ (1:n1-d);
        [sujn1(:,:,i),sujn2(:,:,i)]=mazinta(xt,dx,n2,n1,d,x23,t,langas,atidaryta,comi);
    end
    a=sujn1;
    b=sujn2;
end
  2 Comments
Walter Roberson on 31 Mar 2019
You are growing x1 dynamically along the third dimension -- you allocate it as dydis by 65 by 1, but you assign into x1(:,:,i) so it keeps getting larger.
You pull the part of x1 that you just assigned into a gpuArray that becomes x. You never use x1 again in your code other than continuing to grow it and copying the latest slice to gpu. You do not return x1.
Therefore it would be more efficient to directly do
x = gpuArray( lab(i*dydis+1-dydis:i*dydis,:) );
Mantas Vaitonis on 31 Mar 2019
Thank you Walter, your suggestion was very helpful and did improve the speed of the calculations.
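For readers who want to see the suggestion from the comments in context, here is a minimal sketch of the loop with the intermediate x1 removed and each block copied straight to the GPU. It is not the original poster's final code. The sizes are small stand-ins, the mean/max lines are a hypothetical placeholder for the call to mazinta (whose body is not shown in the thread), and the preallocated output shape is an assumption about what mazinta returns. Preallocating the outputs is an extra step beyond what Walter mentions, included because sujn1 and sujn2 also grow inside the loop. Running it requires Parallel Computing Toolbox and a supported GPU.

```matlab
% Sketch only: small stand-in sizes; requires Parallel Computing Toolbox and a supported GPU.
dydis = 4;                       % rows per block (100000 in the question)
z     = 3;                       % number of blocks
lab   = rand(dydis*z, 5);        % stand-in for the real lab matrix

% Preallocate the outputs once instead of growing them inside the loop.
% The 1-by-size(lab,2) slice shape is an assumption about what mazinta returns.
a = zeros(1, size(lab,2), z, 'gpuArray');
b = zeros(1, size(lab,2), z, 'gpuArray');

for i = 1:z
    rows = (i-1)*dydis + 1 : i*dydis;   % same rows as i*dydis+1-dydis:i*dydis
    x = gpuArray(lab(rows, :));         % one host-to-GPU copy per block, no x1
    % Hypothetical placeholder for mazinta(...): column-wise mean and max.
    a(:,:,i) = mean(x, 1);
    b(:,:,i) = max(x, [], 1);
end

a = gather(a);                          % copy the results back to host memory once
b = gather(b);
```

With this layout the only host-to-GPU transfer per iteration is the block itself, and the results come back in a single gather after the loop. Whether that is enough depends on what mazinta does internally, which the thread does not show.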

Answers (0)