How can I use C++ to speed up my matlab code ?

Question

1 vote

Hi every one, I am working on image processing and I generated a matlab code with many for loops, it takes too long to display results, can any one please tell how can I speed it up using C++ ?

thank you !

5 Comments
Show 3 older comments Hide 3 older comments

Stephen23 on 7 Apr 2017

@mayssa: your is slow because you wrote it to expand the array of every iteration. Much easier than switching to C++ would be to simply write efficient MATLAB code:

https://www.mathworks.com/help/matlab/matlab_prog/preallocating-arrays.html

https://www.mathworks.com/help/matlab/matlab_prog/techniques-for-improving-performance.html

Mayssa on 7 Apr 2017

that's great! thank you !

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Jan on 7 Apr 2017

Edited: Jan on 7 Apr 2017

Open in MATLAB Online

1 vote

Sorry, I cannot resist :-)

MyCorrCoeff2 can still be improved by considering the columnwise storing of elements in arrays. The normalization happens repeatedly inside the inner loops, so it is cheaper to compute it once outside the loops:

function H = MyCorrCoeff3(Mat)
% Author: Jan Simon, License: CC BY-SA 3.0
c  = size(Mat,1);
nc = (0.5 * c * (c + 1));
H  = ones(nc, nc);  % Pre-allocate!!!
MP = permute(Mat, [3, 2, 1]);
MP = MP - sum(MP, 1) / size(MP, 1);
% MP = bsxfun(@minus, MP, sum(MP, 1) / size(MP, 1));
CovN = squeeze(sqrt(sum(MP .* MP, 1)));
jH = 1;
for i = 1:c
   for j = i:c
      cov1  = MP(:, j, i);
      cov1n = CovN(j, i);
      iH = 1;
      for d = 1:c
         for l = d:c
            if iH < jH    % Use symmetry of H
               % Emulate:  r = cov([Mat(i, j, :), Mat(d.l,:)]);
               r21  = (MP(:, l, d).' * cov1) / cov1n / CovN(l, d);
               H(iH, jH) = r21;
               H(jH, iH) = r21;
            end
            iH = iH + 1;  % Update indices related to H:
         end
      end
      jH = jH + 1;
   end
end
H(isnan(H)) = 0;

0.085 sec, 38 times faster than the original version. :-)

[07-Apr-2017 16:34 UTC] Enough for today. Now the code is in a form which could profit from C-coding. The bottleneck is still with about 85% of the total time:

r21 = (MP(:, d, l).' * cov1) / cov1n / CovN(d, l);

The vector multiplication is calculated in an optimized BLAS-library already. Therefore I think you can reduce the time for the overhead of calling the library and some percent of the loops only, perhaps 10%. It will be more useful to try a parallelization with PARFOR.

4 Comments
Show 2 older comments Hide 2 older comments

Stephen23 on 8 Apr 2017

+1 very nice work, Jan Simon.

Mayssa on 8 Apr 2017

@Jan:Indeed, I must admit that this was an opportunity for me to learn a lot; still too much to learn though. For the time problem, it is solved, it takes about five minutes now and to answer your question, yes this is wanted, Mat is already symmetric.

Sign in to comment.

Answer 2

John D'Errico on 6 Apr 2017

0 votes

Almost always using compiled code will not give you great speedups, UNLESS your code was inefficiently written in MATLAB in the first place. Of course then almost anything you do will be an improvement.

But if your code is already well written in MATLAB, using a good choice of algorithms, then trying to go to C or C++ will probably not be worth the time and effort it will take to code. This is certainly true unless you have quite good programming skills in C/C++, AND you understand the essential algorithms well.

The point is almost always the most impact you can get for a given investment of time will come from a better understanding of MATLAB, learning how to improve your code. Often great gains can come from a completely different algorithmic approach, rethinking how you will solve the problem.

3 Comments
Show 1 older comment Hide 1 older comment

John D'Errico on 7 Apr 2017

Edited: John D'Errico on 7 Apr 2017

Open in MATLAB Online

The slowness of your code is PURELY due to the fact that you never preallocated v !!!!!!!!! Writing c++ code here is a waste of time. Just write EFFICIENT MATLAB code. MATLAB is NOT "really slow". It is your code as written that is really slow!

When you append one element at a time to v, that forces MATLAB to reallocate the vector v in memory at EVERY iteration of the loop! It then needs to copy the entire array over. Small at first, but as v grows in size, this take serious time to do. This dynamic array growth is a bad idea. It is fine for tiny arrays, but as the arrays grow in size, not true. Instead, use the function zeros to fill v with elements of the final size of v.

You know (or can compute) what the final size of v will be, based on the value of c. Based on your loops, it will be:

c^2*(c+1)^2/4

So instead of setting v as an empty array to start with, use

v = zeros(1,c^2*(c+1)^2/4);

Inside your loop, maintain a counter that tells you where to put the next element of the vector. So your loops will look like this now:

c=size(Mat,1);
v = zeros(1,c^2*(c+1)^2/4);
ind = 0;
for i = 1:c
  for j =i:c
    for d=1:c
      for l =d:c
        ind = ind + 1;
        a = corrcoef(Mat(i,j,:),Mat(d,l,:));
        v(ind) = a(2,1);
      end
    end
  end
end

Could you have written those loops themselves more efficiently, with fewer loops? This is probably also true, but not the immediate problem.

In fact, the editor will actually have warned you of this need to preallocate v. On the right hand side, you will see red and orange lines in the editor, indicating potential problems in your code. Read what they say, and what they suggest.

Note that the very same comments apply to the array H. You never bothered to preallocate H to the proper size, but then you stuff one element at a time into H. Even worse, all you are really doing is a simple RESHAPE!!!!!! Thus, you are creating a new array that has the same elements, in the same order, but the array is a different shape. So you can replace the entire set of loops that create H with one line of efficient code.

H = reshape(v,c*(c+1)/2,c*(c+1)/2);

As I said in my answer, look to your own code for the problem. Writing c++ code to fix these problems would be silly, and a complete waste of time.

Mayssa on 7 Apr 2017

for the loops, I tried several times and couldn't see a better way to do it to obtain what I need. It turned out that most of the time was spent when calling corrcoef, I replaced it by another built function, it reduced time to more than half, so yes as you said :) you convinced me and thank you for the reshape (didn't know about it),

Sign in to comment.

Answer 3

Jan on 7 Apr 2017

Edited: Jan on 7 Apr 2017

Open in MATLAB Online

0 votes

After your comment:

Letting an array grow iteratively is a massive waste of resources. You should see a corresponding MLint warning in the editor.

c  = size(Mat,1);
nv = (0.5 * c * (c + 1)) ^ 2;  % sum(1:c)^2
v  = zeros(1, nv);
iv = 0;
for i = 1:c
   for j = i:c
       Matij = Mat(i, j, :);
       for d = 1:c
           for l = d:c
               a = corrcoef(Matij, Mat(d,l,:));
               iv    = iv + 1;
               v(iv) = a(2,1);
           end
       end
   end
end

For Mat = [20 x 20 x 10] this reduces the runtim from 5.3 to 4.0 sec already, but the benefit will be higher for larger inputs.

This can be accelerated, if you do not compute all elements by corrcoef, but only the needed one:

function r21 = mycorrel(A, B)
r   = cov([A(:), B(:)]);
r21 = r(2,1) / sqrt(r(1) * r(4));
% Exactly the same output as CORCOEFF without rounding differences:
% Slightly slower, but considers over/underflow:
% r21 = r(2, 1) / sqrt(r(1)) / sqrt(r(4));

and

... innermost loop:
           for l = d:c
               iv    = iv + 1;
               v(iv) =  mycorrel(Matij, Mat(d,l,:));
           end

2.5 seconds run time. :-)

The 2nd part: nchoosek is slow, so do not call it repeatedly:

k = 1;
nc = (0.5 * c * (c + 1));  % nchoosek(c,2)+c;
H  = zeros(nc, nc);  % Pre-allocate!!!
for i = 1:nc
   for j = 1:nc
       H(i,j) = v(k);
       k = k+1;
   end
end
H(isnan(H)) = 0;

What do you observe finally?

H(1:5, 1:5)

H is symmetric. This means that half of the computations are redundant. Then you can reduce the run time by 50% if you consider this in the first part already. Now the main diagonal can be ignored also. I leave this up to you, it is not hard to create the matrix H directly. Calculate the elements below the diagonal only and copy it above the diagonal. Something like:

for d = i:c  % Instead of 1:c
  ...
     H(x, y) = mycorrel...
     H(y, x) = H(x, y);

Conclusion: With some reordering and avoiding of unnecessary computations you can accelerate the code by a factor of 4 without the need to create a C-Mex function. Because the main work is spent in COV, a straight translation to C would not have been faster.

5 Comments
Show 3 older comments Hide 3 older comments

Mayssa on 7 Apr 2017

Edited: Mayssa on 7 Apr 2017

@Jan: I think reshape does the job better for H as mentioned above, don't you think ? Indeed it is faster but my problem is still with the main which takes near half an hour (well better than 1 hour yesterday), that's why I thought of creating separate C functions in the first place; I have seen this in a clustering program, although a very much larger data than mine is used, the code doesn't take more than 1 min.

Mayssa on 7 Apr 2017

@Stephen: thank you for the link, greatly appreciated

Sign in to comment.

Answer 4

Jan on 7 Apr 2017

Edited: Jan on 7 Apr 2017

Open in MATLAB Online

0 votes

function H = MyCorrCoeff1(Mat)
c  = size(Mat,1);
nc = (0.5 * c * (c + 1));
H  = zeros(nc, nc);  % Pre-allocate!!!
iH = 1;
jH = 1;
for i = 1:c
   for j = i:c
       Matij = Mat(i, j, :);  % Tiny speedup to copy this once only
       for d = 1:c
           for l = d:c
              if iH == jH     % Diagonal element is 1.0
                 H(iH, jH) = 1;
              elseif iH > jH  % Use symmetry of H
                 % Equivalent:
                 % r = corrcoef(Mat(i,j,:),Mat(d,l,:)); r21 = r(2, 1);
                 Matdl = Mat(d,l,:);
                 r     = cov([Matij(:), Matdl(:)]);
                 r21   = r(2,1) / sqrt(r(1)) / sqrt(r(4));
                 H(iH, jH) = r21;
                 H(jH, iH) = r21;
              end
              % Update indices related to H:
              iH = iH + 1;
              if iH > nc
                 iH = 1;
                 jH = jH + 1;
              end
           end
       end
   end
end
H(isnan(H)) = 0;

This reduces the runtime from 3.26 to 0.63 sec (R2016b/64, Win7).

Now COV is still the bottleneck, but is called less often. Let's see if inlining the code of COV helps (in addition H is initialized with 1 and the diagonal is not set inside the loops):

function H = MyCorrCoeff2(Mat)
c  = size(Mat,1);
nc = (0.5 * c * (c + 1));
H  = ones(nc, nc);  % Pre-allocate!!!
covM = size(Mat, 3);
iH = 1;
jH = 1;
for i = 1:c
   for j = i:c
       Matij = Mat(i, j, :);
       cov1  = Matij(:) - sum(Matij(:)) / covM;  % Remove mean
       cov1n = sqrt(cov1.' * cov1);
       for d = 1:c
           for l = d:c
              if iH > jH  % Use symmetry of H
                 Matdl = Mat(d,l,:);
                 % Emulate:  r = cov([Matij(:), Matdl(:)]);
                 cov2 = Matdl(:) - sum(Matdl(:)) / covM;
                 r21 = (cov2.' * cov1) / cov1n / sqrt(cov2.' * cov2);
                 H(iH, jH) = r21;
                 H(jH, iH) = r21;
              end
              iH = iH + 1;    % Update indices related to H:
              if iH > nc
                 iH = 1;
                 jH = jH + 1;
              end
           end
       end
   end
end
H(isnan(H)) = 0;

Yepp, it is faster: 0.24 sec. 13.6 times faster, and still no need to struggle with tricky C code (and as you can see from my FileExchange submissions, I do love C!).

But look at the code: It is ugly! The original code was much easier to understand. Here a clean version:

c  = size(Mat,1);
nc = (0.5 * c * (c + 1));  % sum(1:c)
H  = zeros(nc, nc)
iH = 0;
for i = 1:c
  for j = i:c
    for d = 1:c
      for l = d:c
        a     = corrcoef(Mat(i, j, :), Mat(d,l,:));
        iH    = iH + 1;
        H(iH) = a(2,1);
      end
    end
  end
end

The results are equal except for rounding differences. This nicer version should be included as comment.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

How can I use C++ to speed up my matlab code ?

5 Comments
Show 3 older comments Hide 3 older comments

Accepted Answer

4 Comments
Show 2 older comments Hide 2 older comments

More Answers (3)

3 Comments
Show 1 older comment Hide 1 older comment

5 Comments
Show 3 older comments Hide 3 older comments

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Tags

Community Treasure Hunt

How can I use C++ to speed up my matlab code ?

5 Comments Show 3 older comments Hide 3 older comments

Accepted Answer

4 Comments Show 2 older comments Hide 2 older comments

More Answers (3)

3 Comments Show 1 older comment Hide 1 older comment

5 Comments Show 3 older comments Hide 3 older comments

0 Comments Show -2 older comments Hide -2 older comments

Categories

Tags

See Also

Community Treasure Hunt

5 Comments
Show 3 older comments Hide 3 older comments

4 Comments
Show 2 older comments Hide 2 older comments

3 Comments
Show 1 older comment Hide 1 older comment

5 Comments
Show 3 older comments Hide 3 older comments

0 Comments
Show -2 older comments Hide -2 older comments