Code is running very slow, how to make faster?

Fego Etese on 24 Jul 2020
Edited: per isakson on 10 Aug 2020
Hello,
I am having trouble getting this code to run faster. I am using it for my final-year dissertation, and I would really appreciate help optimizing it.
The code computes Zero Normalized Cross Correlation (ZNCC), which I want to use for template matching. I have attached all the files, including a screenshot of my profiling of the code, which is very slow. I have also attached the workspace needed. Everything is in the link above.
Other things you need to know for running the code are in the "Recommended Instruction for Executing Code.txt" file.
I will really appreciate it if you can help me.
Thanks a lot
  4 Comments
Mario Malic on 25 Jul 2020
Edited: Mario Malic on 29 Jul 2020
Is double necessary?
% convert to single
target = single(matchImg);
template = single(enrollTemplateImage);
Changing them both to single gives some improvement.
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc % with double, .png image
Elapsed time is 1.675404 seconds.
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc % with single, .png image
Elapsed time is 1.418316 seconds.
Also, you run your code in a OneDrive folder. Maybe the .mlx file generates some files while running, and the OneDrive sync causes the problems.
Also, consider that the output of matchImg differs when you supply it with different images. Finger1A will not generate the same output as Finger1E even though they are in the same format.
Fego Etese on 25 Jul 2020
Thank you so much Mario Malic for your help.
I'll test out single and see if it gives any improvement. Sorry for my late reply, I've been away from my computer for a while.
I don't normally run the code from my OneDrive, I just uploaded it there so I can share it here. I will also take up your suggestion and convert the images to png. Thanks so much


Answers (1)

per isakson on 25 Jul 2020
Edited: per isakson on 7 Aug 2020
Caveat: I've never seriously used the Live Editor.
I took the following steps:
  1. uploaded your files to a new folder, which I made the current folder
  2. read "Recommended Instruction for Executing Code.txt" file.
  3. loaded workspace.mat
  4. converted znccPrf.mlx to znccPrf.m (an old time m-file)
  5. changed imread('finger1E.tif'); to imread('finger1E.png'); since there was no tif-file in the upload.
  6. profiled posZNCC = znccPrf(enrollTemplateImage);. The statement meanRef=mean(mean(ref)); dominated the run time, together with "self time".
  7. replaced mean(mean(ref)); with mean(ref,'all');, which helped a bit. sum(reshape(ref,1,[]))/numel(ref); is a bit faster still.
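For anyone who wants to repeat the comparison in step 7, here is a minimal sketch. The matrix ref is random stand-in data and its 71-by-71 size is an assumption, not something taken from the uploaded workspace. timeit calls each handle several times and returns the median, so it is less noisy than a single tic/toc.

```matlab
% Compare three ways of averaging all elements of a matrix.
% ref is random stand-in data; the 71-by-71 size is an assumption.
rng(0);                                  % reproducible stand-in data
ref = rand(71, 71);

m1 = mean(mean(ref));                    % original formulation
m2 = mean(ref, 'all');                   % requires R2018b or later
m3 = sum(reshape(ref, 1, [])) / numel(ref);

% The three values should agree up to floating-point rounding
maxDiff = max(abs([m1 - m2, m1 - m3]));
fprintf('largest difference: %g\n', maxDiff);

% Time each variant; the numbers depend on machine and release
t1 = timeit(@() mean(mean(ref)));
t2 = timeit(@() mean(ref, 'all'));
t3 = timeit(@() sum(reshape(ref, 1, [])) / numel(ref));
fprintf('%.3g  %.3g  %.3g seconds\n', t1, t2, t3);
```

The per-call differences are tiny. They only matter here because the statement runs once per window position.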
Finally I ran
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc
Elapsed time is 2.003947 seconds.
>> tic, posZNCC = znccPrf(enrollTemplateImage); toc
Elapsed time is 2.024163 seconds.
The code I ran differs from the code in your profiling screenshots, and the profiling results differ dramatically.
I use Matlab R2018b, Win10 and a fairly new desktop PC.
In response to comments
I've made a few more changes to your code and achieved close to a doubling of the speed compared to your function.
I use the uploaded png-file, finger1E.png, in both cases. (Is the code intended to process png or tif files?)
Furthermore, I use the lines
for y = 1:rTem % <<<<<<<<<<<<
for x = 1:cTem % <<<<<<<<<<<<
in both functions, since I believe that's the relevant case. Why do you use " = 1:2 " in some cases?
The script
%%
tic, posZNCC = znccPrf( enrollTemplateImage ); toc
tic, posZNCC_poi = znccPrf_poi( enrollTemplateImage, 'png' ); toc
posZNCC, posZNCC_poi
%%
t = bench()
outputs
>> fego
Elapsed time is 2.611501 seconds.
Elapsed time is 1.430623 seconds.
posZNCC =
209 103 0.76105
posZNCC_poi =
1×3 single row vector
209 103 0.76105
t =
0.081743 0.078711 0.01291 0.083046 1.2827 2.0448
The two return the same value of posZNCC, at least within the precision displayed by format short. The last line describes the performance of my Matlab+PC. The first four numbers are good; the last two are poor.
Measures to improve the speed
Use single instead of double. It introduces rounding errors, which I believe are acceptable.
% convert to single
target = single(matchImg); % <<<<<<<<<<<<
template = single(enrollTemplateImage); % <<<<<<<<<<<<
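To get a feeling for the size of the rounding error, a rough check like the one below can help. It is only a sketch: patch is a random 8-bit stand-in, not the real fingerprint image, and the sum of squared deviations is just one of the terms a ZNCC calculation needs.

```matlab
% Stand-in data: a random 8-bit "image" patch (assumption, not the real image)
rng(0);
patch = randi([0 255], 71, 71, 'uint8');

pd = double(patch);
ps = single(patch);

% Sum of squared deviations from the mean, the kind of term ZNCC uses
ssd = sum((pd(:) - mean(pd(:))).^2);     % double precision
sss = sum((ps(:) - mean(ps(:))).^2);     % single precision

relErr = abs(ssd - double(sss)) / ssd;
fprintf('relative error from single precision: %.2e\n', relErr);
```

Whether an error of that size is acceptable depends on how close the best and second-best correlation peaks are in the real data.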
Split the calculation of the temporary variable, ref, into two steps. This should decrease the need for shuffling data.
for jj = 1 : (rTar - rTem + 1)
refjj = target( jj:(jj+rTem-1), : ); % <<<<<<<<<<<<
for ii = 1 : (cTar - cTem + 1)
ref = refjj( :, ii:(ii+cTem-1) ); % <<<<<<<<<<<<
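A small sanity check that the two-step indexing picks out exactly the same window as indexing target directly. The sizes and data are made up for illustration.

```matlab
rng(0);
target = rand(20, 30);                   % stand-in image
rTem = 5;  cTem = 7;                     % hypothetical template size
[rTar, cTar] = size(target);

same = true;
for jj = 1 : (rTar - rTem + 1)
    refjj = target(jj:(jj+rTem-1), :);   % row band, extracted once per jj
    for ii = 1 : (cTar - cTem + 1)
        ref    = refjj(:, ii:(ii+cTem-1));                 % two-step window
        direct = target(jj:(jj+rTem-1), ii:(ii+cTem-1));   % one-step window
        same = same && isequal(ref, direct);
    end
end
disp(same)
```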
Choose a more efficient way to calculate the mean of a matrix. In a reply to Walter's question I showed a comparison between six different ways to calculate the mean.
meanRef = sum(ref(:))/numelTem; % <<<<<<<<<<<<
Vectorize the two inner loops
tmT = template - meanTem; % <<<<<<<<<<<<
rmR = ref - meanRef; % <<<<<<<<<<<<
sum1 = sum( reshape( tmT.*rmR, [],1 ) ); % <<<<<<<<<<<<
sum2 = sum( reshape( tmT.*tmT, [],1 ) ); % <<<<<<<<<<<<
sum3 = sum( reshape( rmR.*rmR, [],1 ) ); % <<<<<<<<<<<<
ZNCC = sum1 / (sqrt(sum2) * sqrt(sum3)); % <<<<<<<<<<<<
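The sketch below checks the vectorized expression against an element-by-element loop. The original inner loops are not reproduced in this thread, so the loop version here is an assumed reconstruction of the usual ZNCC sums, and the data are random stand-ins.

```matlab
rng(0);
template = rand(6, 8);                   % stand-in data
ref      = rand(6, 8);
meanTem  = sum(template(:)) / numel(template);
meanRef  = sum(ref(:)) / numel(ref);

% Loop version (assumed form): accumulate the three sums element by element
s1 = 0;  s2 = 0;  s3 = 0;
for y = 1:size(template, 1)
    for x = 1:size(template, 2)
        dT = template(y, x) - meanTem;
        dR = ref(y, x) - meanRef;
        s1 = s1 + dT * dR;
        s2 = s2 + dT * dT;
        s3 = s3 + dR * dR;
    end
end
znccLoop = s1 / (sqrt(s2) * sqrt(s3));

% Vectorized version, same sums without the loops
tmT = template - meanTem;
rmR = ref - meanRef;
znccVec = sum(tmT(:) .* rmR(:)) / ...
          (sqrt(sum(tmT(:).^2)) * sqrt(sum(rmR(:).^2)));

fprintf('loop: %.6f   vectorized: %.6f\n', znccLoop, znccVec);
```

Note that sum2 depends only on the template, so it could in principle be computed once outside both window loops.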
That was a lot of work and it didn't even double the speed. (The two functions are attached.)
One more measure (2020-08-07)
The execution time of the script** increases faster than linearly with the size of the image, i.e. with the size of the variable matchImg in the code. The image, finger1E.png, has fairly large white areas to the left and right. Removing most of that white area decreases the execution time substantially without affecting the result.
I made this little test
>> pic = 'finger1E';
>> crop = false;
>> tic, [ posZNCC, P ] = znccPrf_poi_v2( enrollTemplateImage, pic, crop ); toc
Elapsed time is 1.413413 seconds.
>> crop = true;
>> tic, [ posZNCC, P ] = znccPrf_poi_v2( enrollTemplateImage, pic, crop ); toc
Elapsed time is 0.802015 seconds.
All of the measures described above are implemented in znccPrf_poi_v2. With crop==false the elapsed time, 1.41 sec, is close enough to the 1.43 sec reported above for znccPrf_poi. With crop==true the leftmost 90 and rightmost 58 columns of the 374x388 matchImg are removed by
matchImg = imread('finger1E.png');
if nargin==3 && crop
matchImg = matchImg( :, 91:330 );
end
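The column range above is hard-coded for finger1E.png. As a possible alternative, the white margins could be detected from the image itself. The sketch below is only an illustration on a synthetic image and assumes a uint8 image in which the background is exactly 255. Real scans may need a threshold instead.

```matlab
% Synthetic stand-in: white uint8 image with a darker block in the middle
img = 255 * ones(10, 12, 'uint8');
img(3:8, 4:9) = 100;

% Columns containing at least one non-white pixel
cols = find(any(img < 255, 1));
firstCol = cols(1);
lastCol  = cols(end);

cropped = img(:, firstCol:lastCol);

% A match found at column c in the cropped image corresponds to
% column c + offset in the uncropped image
offset = firstCol - 1;
fprintf('kept columns %d to %d, offset %d\n', firstCol, lastCol, offset);
```

Whether znccPrf_poi_v2 applies such an offset to the reported position is not shown in this thread, so treat the last step as something to verify.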
**) should be function
  35 Comments
per isakson on 10 Aug 2020
You need to approach the problem more systematically. Don't cut corners.
What do you know for sure?
What could be the reasons for the poor performance? There were some good hypotheses in a recent comment (now deleted) by Mario Malic. Make your own list and test one hypothesis at a time. Document the results.
znccPrf_poi_v2 differs from znccPrf_poi only regarding the cropping of white area. Thus, I think it's better that you concentrate on reproducing my result with znccPrf_poi.
per isakson on 10 Aug 2020
Edited: per isakson on 10 Aug 2020
I looked at your screenshots of the for ii = 1 : (cTar - cTem + 1) loop. I noticed that the number of iterations differs between the first and the rest. I find nothing in your July 28 comments that explains the difference.
  • Fego Etese on 28 Jul 2020 shows 96672 iterations
  • Fego Etese on 28 Jul 2020 shows 332576 iterations
  • Fego Etese on 2 Aug 2020 shows 332576 iterations
  • Fego Etese on 7 Aug 2020 shows 332576 iterations
The screenshot of my 7 Aug 2020 comment shows 96672 iterations.
P.S. The dates are the dates displayed here. Local time may add or subtract a day.
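The iteration count of the inner loop follows directly from the loop bounds, so it can be used to check which image and template sizes a profiling run actually used. A sketch follows. The 374x388 image size is the one given in the answer. The 71-by-71 template size is an assumption (the thread does not state it), chosen only because it reproduces the 96672 figure.

```matlab
rTar = 374;  cTar = 388;    % size of matchImg as stated in the answer
rTem = 71;   cTem = 71;     % ASSUMED template size, not stated in the thread

% One iteration of the inner loop per window position
nIter = (rTar - rTem + 1) * (cTar - cTem + 1);
fprintf('%d window positions\n', nIter);

% After removing 90 + 58 columns, as in the cropped test
cTarCrop = cTar - 90 - 58;
nIterCrop = (rTar - rTem + 1) * (cTarCrop - cTem + 1);
fprintf('%d window positions after cropping\n', nIterCrop);
```

A different count, such as 332576, would then point to a different target image size or template size in that run.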

Release

R2018a
