How to speed up my code containing several for loops

I have written a code which aims to compute and plot a function called Qd(f), with f is the frequency vector. This function is given by the following expression:
Eq_Qd_corrected.png
Where PN is a Pseudo random vector of 1’s and -1’s of length L. This vector is simply generated with an independent function. Here, I will give directly the vector PN.
For practical reasons k abd k' are integers in the interval [-5*L, 5*L].
f is a frequency vector in the form of with n integer. Depending on problem constraints, PN can takes several lengths.
I have written the code for Qd(f) for L=31 using several for loops. The code works correctly but it takes several hours to display the result. I want to run my code for different length of PN, L=31, 63, 127, 255, 511, 1024, 2047 and 4095. The problem arises when L takes much important values, for example L=511 and higher. The code takes much more time (more than one day without result !).
My goal is to vectorize the inner loops in order to reduce the time of execution of the code. Below is the PN vector of length 31:
PN=[ 1 1 1 1 1 -1 -1 1 1 -1 1 -1 -1 1 -1 -1 -1 -1 1 -1 1 -1 1 1 1 -1 1 1 -1 -1 -1];
I would really appreciate any kind of help :)
Thank you in advance.

4 Comments

Where is the code? DId you try to use meshgrid?
Thank you Darova for your response. Sorry for the delay. In fact, I have written several codes and done many modifications to compute the expression above, so it took me some time to fix my programs and to post the more convenient ones to my question. Here, I will post three codes. The first one uses a function called impdirac to represent the Dirac delta function (i.e., impdirac(x)=1 if x=0, and 0 otherwise).
% fonction impulsion dirac
function y = impdirac( t )
% ===========================================================
if (t==0)
y=1;
else
y=0;
end
This program is not optimized and is only given for explanatory purposes.
clear all;
% PN = pngenfcn( 5, 2, 5 ); % This function generates a Pseudo-noise Sequence of length (2^5)-1=31 chips
PN=[ 1 1 1 1 1 -1 -1 1 1 -1 1 -1 -1 1 -1 -1 -1 -1 1 -1 1 -1 1 1 1 -1 1 1 -1 -1 -1];
L=length(PN);
V0=2; % Amplitude of the sequence
Tc=1e-8; % Chip period
fc1=1/Tc; % Chip sequence
fc2=0.99*fc1;
tic
k=-5*L:5*L;
f=k*(fc1-fc2)/L; % frequency vector
M=length(f);
Qd=zeros(1,M);
for m=1:M
X1=0;
for k1=-5*L:5*L
for k2=-5*L:5*L
if k1~=-k2
X2=0;
for l=1:L
for p=1:L
X2=X2+PN(l)*PN(p)*exp(-j*2*pi*((k1*l)+(k2*p))/L);
end
end
X1=X1+impdirac(f(m)-(((fc1*k1)+(fc2*k2))/L)).*(sinc(k1/L))*(sinc(k2/L))*exp(-j*pi*(k1+k2)/L).*X2;
end
end
end
Qd(m)=((V0/L)^2)*X1;
end
stem(f/(fc1-fc2),abs(Qd))
toc
The second code is more faster and it doesn’t run over all values of (k,k’). It searches, for a given value of f, only the set couples (k,k’) that satisfies :
Eq_delta_condition.png
Then it evaluates the double summation only for these set of couples which reduce the computational effort:
clear all;
% PN = pngenfcn( 5, 2, 5 ); % This function generates a Pseudo-noise Sequence of length (2^5)-1=31 chips
PN=[ 1 1 1 1 1 -1 -1 1 1 -1 1 -1 -1 1 -1 -1 -1 -1 1 -1 1 -1 1 1 1 -1 1 1 -1 -1 -1];
L=length(PN);
V0=2; % Amplitude of the sequence
Tc=1e-8; % Chip period
fc1=1/Tc; % Chip sequence
fc2=0.99*fc1;
tic
k=-5*L:5*L;
f=k*(fc1-fc2)/L; % frequency vector
M=length(f);
% Here we compute fk=((fc1*k1)+(fc2*k2))/L for all k1 and k2 such as:
% (fk/(fc1-fc2)) >= -5) && ((fk/(fc1-fc2)) <= 5
f1=[];
K1=[];
K2=[];
for n1=-5*L:5*L
for n2=-5*L:5*L
if n1~=-n2
fk=((fc1*n1)+(fc2*n2))/L;
if (((fk/(fc1-fc2)) >= -5) && ((fk/(fc1-fc2)) <= 5))
f1=[f1 fk];
K1=[K1 n1];
K2=[K2 n2];
end
end
end
end
% here we search, for a given value of f, only the set couples
% (k,k’) that satisfies fk=f, then we compute the sum only for these set of couples
Qd1=zeros(1,length(f));
for m=1:length(f)
indices=find(f1==f(m));
Xd=0;
for n=1:length(indices)
X1=0;
for l=1:L
for p=1:L
X1=X1+PN(l)*PN(p)*exp(-j*2*pi*((K1(indices(n))*l)+(K2(indices(n))*p))/L);
end
end
Xd=Xd+X1*((sinc(K1(indices(n))/L))*(sinc(K2(indices(n))/L))*exp(-j*pi*(K1(indices(n))+K2(indices(n)))/L));
end
Qd1(m)=((V0/L)^2)*(Xd);
end
stem(f/(fc1-fc2),abs(Qd1))
toc
Also I tried to vectorize the inner for loops using the function meshgrid, the code executes correctly for L=31, 63, 127, but if I works with longer sequences Matlab returns the error “Help memory”. This code is given below.
clear all;
% PN = pngenfcn( 5, 2, 5 ); % This function generates a Pseudo-noise Sequence of length (2^5)-1=31 chips
PN=[ 1 1 1 1 1 -1 -1 1 1 -1 1 -1 -1 1 -1 -1 -1 -1 1 -1 1 -1 1 1 1 -1 1 1 -1 -1 -1];
L=length(PN);
V0=2; % Amplitude of the sequence
Tc=1e-8; % Chip period
fc1=1/Tc; % Chip sequence
fc2=0.99*fc1;
tic
k=-5*L:5*L;
f=k*(fc1-fc2)/L;
M=length(f);
[k1,k2]=meshgrid(k);
fk=((fc1*k1)+(fc2*k2))/L;
fk((fk/(fc1-fc2)) < -5)=Inf;
fk((fk/(fc1-fc2)) > 5)=Inf;
D=Inf*ones(1,length(k));
DD=diag(D);
DD=rot90(DD);
fk=DD+fk; % Les valeurs correspondantes à k1=-k2 sont remplcées par Inf
AL=1:L;
[l,p]=meshgrid(AL);
[PNl,PNp]=meshgrid(PN);
Qd=zeros(1,M);
for m=1:M
indices=find(fk==f(m));
Xd=0;
for n=1:length(indices)
A1=exp(-1i*2*pi*((k1(indices(n))*l)+(k2(indices(n))*p))/L);
X1=sum(sum(PNl.*PNp.*A1));
Xd=Xd+X1*((sinc(k1(indices(n))/L))*(sinc(k2(indices(n))/L))*exp(-1i*pi*(k1(indices(n))+k2(indices(n)))/L));
end
Qd(m)=((V0/L)^2)*(Xd);
end
stem(f/(fc1-fc2),abs(Qd))
toc
I wonder if there is another way to conceive the code, for example using the function bsxfunc to compute the double summation.
Many thanks.
I would really appreciate any kind of help or comments on theses codes :)
You have to handle 25*L^4 number of operations. I'm even afraid to count what number it is
For L=1000 number is 25e12 (number of operations). So i'm not suprised that it takes a lot of time
  • I works with longer sequences Matlab returns the error “Help memory”
I don't know if it can be vectorised as MATLAB tries to preallocate all those numbers in memory
Thank you Darova for your response.
I am trying to find another solution to my problem. As you said the number of operations and the size of matrices are very important!

Sign in to comment.

 Accepted Answer

You can easily vectorise part of the inner sum computation:
%precompute before any loop:
[i, iprime] = ndgrid(1:L);
PNprod = PN(i) .* PN(iprime);
%... inside the k and k' loop:
innersum = sum(PNprod .* exp(-j*2*pi/L * (k*i + kprime*iprime)), 'all'); %'all' option requires 2018b or later
Whether or not that will help, I'm not sure. The outer k and k' loops are probably the ones that slow down everything due to the shear number of iterations.

1 Comment

Thank you Guillaume for your answer.
Unfortunately I actually work with MATLAB R2016b, but I will upgrade to the latest version as soon as possible.
Thanks for your help.

Sign in to comment.

More Answers (0)

Products

Asked:

on 30 Oct 2019

Commented:

on 6 Nov 2019

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!