Improve Performance of Edge Detection Algorithm Using Automatic Parallelization
This example performs edge detection on a 2-D image using a custom filter. It highlights the performance improvement that comes from automatic parallelization of for-loops, which is enabled by default in MATLAB® Coder™.
MATLAB Coder uses the OpenMP standard for parallel programming to distribute computations across multiple threads on a single machine. Because the threads share memory, this parallelization can substantially reduce processing time. For more information on the parallel programming capabilities of OpenMP, see The OpenMP API Specification for Parallel Programming.
Prerequisites
To allow parallelization, ensure that your compiler supports OpenMP. If the compiler does not support OpenMP, the code generator generates serial code.
To run the software-in-the-loop (SIL) verification mode, you must have an Embedded Coder license.
View Image and Filter Algorithm
The edge detection algorithm edgeDetectionOn2DImage.m applies a Sobel filter to the input image flower.jpg.
The for-loop computes the gradient magnitude at each pixel by applying the input filter. The resulting image is normalized to a range of 0 to 255 for display.
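As a rough illustration of the per-pixel computation described above, the following sketch applies a pair of standard 3-by-3 Sobel kernels to a single window, then rescales a small matrix of magnitudes to the 0 to 255 range. The kernels, the sample window, and all variable names are assumptions chosen for illustration; the example function itself takes a larger custom filter as an input.

```matlab
% Standard 3-by-3 Sobel kernels (assumed here for illustration only)
Kx = [-1 0 1; -2 0 2; -1 0 1];   % responds to horizontal intensity changes
Ky = Kx';                        % responds to vertical intensity changes

% Hypothetical 3-by-3 window containing a vertical edge
window = [0 0 255; 0 0 255; 0 0 255];

% Weighted sums over the window give the two gradient components
Gx = sum(Kx .* window, "all");
Gy = sum(Ky .* window, "all");

% Gradient magnitude at the center pixel of the window
magnitude = hypot(Gx, Gy);

% Rescale a small matrix of magnitudes to 0..255 and convert to uint8
magnitudes = [0 510; 1020 255];
normalized = uint8(255 * (magnitudes / max(magnitudes, [], "all")));
```

For this window, only the horizontal component is nonzero, so the magnitude equals the absolute value of Gx.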
Display the input image flower.jpg.
imshow("flower.jpg")
Display the MATLAB® code for edgeDetectionOn2DImage.
type edgeDetectionOn2DImage
% This function accepts an image and a filter and returns an
% image with the filter applied on it.
function filteredImage = edgeDetectionOn2DImage(original2DImage, filter) %#codegen
arguments
    original2DImage (:,:) double
    filter (:,:) double
end

%% Algorithm for convolution
% Initialize
filteredImage = zeros(size(original2DImage));
[rows, cols] = size(original2DImage);
filterSize = size(filter);

% Apply filter on input image through windowing technique
for i = 3:rows-2
    for j = 3:cols-2
        % Compute the gradient components
        Gx_component = 0;
        Gy_component = 0;
        for u = 1:filterSize(1)
            for v = 1:filterSize(1)
                pixel_value = original2DImage(i+u-3, j+v-3);
                Gx_component = Gx_component + filter(u, v) * pixel_value;
                Gy_component = Gy_component + filter(u, v+filterSize(1)) * pixel_value;
            end
        end
        % Compute the gradient magnitude
        filteredImage(i, j) = hypot(Gx_component, Gy_component);
    end
end

% Normalize the output image
maxPixel = max(filteredImage,[], 'all');
filteredImage = uint8(255 * (filteredImage / maxPixel));
end
Generate Parallel C Code
Generate parallel C code for the edgeDetectionOn2DImage function.
% Load the image and convert to grayscale for processing
image = double(rgb2gray(imread("flower.jpg")));

% Define the Edge Detection Filter
filter = [-2 -1 0 1 2 -2 -2 -4 -2 -2;
          -2 -1 0 1 2 -1 -1 -2 -1 -1;
          -4 -2 0 2 4  0  0  0  0  0;
          -2 -1 0 1 2  4  1  2  1  1;
          -2 -1 0 1 2  5  2  4  2  2];

cfg = coder.config("lib");
cfg.VerificationMode = "SIL";
codegen edgeDetectionOn2DImage -args {image,filter} -config cfg -report
Code generation successful: View report
You can view the generated code by clicking View report. The OpenMP pragmas in the file edgeDetectionOn2DImage.c indicate parallelization of the for-loop.
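As an alternative to browsing the report, the pragmas can be listed from the command line. The sketch below assumes that code generation was run from the current folder and that the default output location for a library build, codegen/lib/edgeDetectionOn2DImage, was used; adjust the path if your build folder differs.

```matlab
% Assumed default location of the generated source for a "lib" build
srcFile = fullfile("codegen", "lib", "edgeDetectionOn2DImage", ...
    "edgeDetectionOn2DImage.c");

if isfile(srcFile)
    % Read the file line by line and keep only the OpenMP pragma lines
    lines = readlines(srcFile);
    pragmaLines = lines(contains(lines, "#pragma omp"));
    disp(pragmaLines)
else
    disp("Generated source not found. Run codegen first or adjust the path.")
end
```

If no pragma lines are printed for an existing file, the code generator most likely produced serial code, for example because the compiler does not support OpenMP.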
To display the output image with the filter applied, run the generated SIL MEX edgeDetectionOn2DImage_sil.
out = edgeDetectionOn2DImage_sil(image,filter);
### Starting SIL execution for 'edgeDetectionOn2DImage'
    To terminate execution: clear edgeDetectionOn2DImage_sil
View the output image.
imshow(out)
Verify Numerical Correctness
You can verify the numerical correctness of the generated code by comparing its output with the MATLAB code output. Use the isequal function to compare their outputs.
isequal(edgeDetectionOn2DImage(image,filter),edgeDetectionOn2DImage_sil(image,filter))
ans = logical
1
The returned value 1 (true) indicates that the generated code produces the same output as the MATLAB code for this input.
You can also compare the generated code with the MATLAB code for a specific target hardware by setting the configuration option VerificationMode to PIL.
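If isequal ever returns 0 (false), it can help to quantify the mismatch before investigating further. The following minimal sketch uses two small hypothetical uint8 arrays, refOut and silOut, as stand-ins for the MATLAB output and the SIL output; the names and values are assumptions for illustration only.

```matlab
% Hypothetical stand-ins for the MATLAB output and the SIL output
refOut = uint8([0 10 20; 30 40 50]);
silOut = uint8([0 10 21; 30 40 50]);

% Convert to double before subtracting so that uint8 saturation
% does not hide negative differences
diffs = abs(double(refOut) - double(silOut));

numMismatched = nnz(diffs);             % number of pixels that differ
maxAbsDiff = max(diffs, [], "all");     % largest per-pixel difference
fprintf("%d mismatched pixel(s), max difference %g\n", ...
    numMismatched, maxAbsDiff);
```

A small number of off-by-one pixels usually points to rounding differences, whereas large or widespread differences suggest a functional discrepancy.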
Compare Performance
Use coder.perfCompare to compare the performance of the MATLAB code with that of the code generated with automatic parallelization disabled and enabled. The results shown here were obtained by running this example on a 12-core 64-bit Windows® platform.
Run these commands using the image and filter as defined above.
cfgOff = coder.config("lib");
cfgOff.VerificationMode = "SIL";
cfgOff.EnableAutoParallelization = false;

cfgOn = coder.config("lib");
cfgOn.VerificationMode = "SIL";

coder.perfCompare("edgeDetectionOn2DImage",1,{image,filter},{cfgOff,cfgOn}, ...
    ConfigNames={"Automatic Parallelization Disabled","Automatic Parallelization Enabled"},CompareWithMATLAB=true);
==== Running (edgeDetectionOn2DImage, MATLAB) ====
- Running MATLAB script.
TimingResult with 10 Runtime Sample(s)

Statistical Overview:
   mean = 2.13e+00 s    max = 2.23e+00 s     sd = 5.46e-02 s
 median = 2.13e+00 s    min = 2.05e+00 s   90th = 2.20e+00 s

==== Running (edgeDetectionOn2DImage, Automatic Parallelization Disabled) ====
- Generating code and building SIL/PIL MEX.
- Running MEX.
TimingResult with 10 Runtime Sample(s)

Statistical Overview:
   mean = 5.64e-01 s    max = 6.26e-01 s     sd = 2.63e-02 s
 median = 5.61e-01 s    min = 5.30e-01 s   90th = 6.04e-01 s

==== Running (edgeDetectionOn2DImage, Automatic Parallelization Enabled) ====
- Generating code and building SIL/PIL MEX.
- Running MEX.
TimingResult with 10 Runtime Sample(s)

Statistical Overview:
   mean = 1.06e-01 s    max = 1.21e-01 s     sd = 1.30e-02 s
 median = 1.08e-01 s    min = 8.53e-02 s   90th = 1.21e-01 s

                              MATLAB          Automatic Parallelization Disabled      Automatic Parallelization Enabled
                           _____________    ___________________________________    ___________________________________
                           Runtime (sec)    Runtime (sec)    Speedup wrt MATLAB    Runtime (sec)    Speedup wrt MATLAB
                           _____________    _____________    __________________    _____________    __________________
  edgeDetectionOn2DImage      2.1324           0.56065             3.8035             0.10766             19.807
The results from coder.perfCompare show that the code generated with automatic parallelization disabled runs approximately 4 times faster than the MATLAB code. With automatic parallelization enabled, the generated code runs approximately 20 times faster than the MATLAB code.
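The speedup figures can be reproduced from the runtimes in the table, and the same arithmetic gives the gain of the parallel code over the serial generated code. The sketch below copies the three runtimes from the table; the efficiency estimate additionally assumes that all 12 cores took part in the computation, which the results themselves do not state.

```matlab
% Runtimes in seconds, copied from the coder.perfCompare table
tMatlab   = 2.1324;
tSerial   = 0.56065;   % automatic parallelization disabled
tParallel = 0.10766;   % automatic parallelization enabled

% Speedups relative to MATLAB (these match the table, about 3.8 and 19.8)
speedupSerialVsMatlab   = tMatlab / tSerial;
speedupParallelVsMatlab = tMatlab / tParallel;

% Gain attributable to parallelization alone (about 5.2)
speedupParallelVsSerial = tSerial / tParallel;

% Rough parallel efficiency, assuming all 12 cores were used (about 0.43)
numCores = 12;
efficiency = speedupParallelVsSerial / numCores;
```

An efficiency well below 1 is expected for a loop like this one, because thread startup, memory bandwidth, and the serial normalization step all limit scaling.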