Generate and Deploy Optimized Code for Interpolated FIR Filter on Raspberry Pi using ARM Cortex-A CMSIS CRL
This example shows how to generate and deploy optimized code for an interpolated finite impulse response (IFIR) filter on a Raspberry Pi® target using the ARM Cortex-A® CMSIS code replacement library (CRL) in Simulink®. The example also uses the SIL/PIL Manager app (software-in-the-loop/processor-in-the-loop) to run simulations that collect execution-time metrics for the generated code and compare them with those of plain C code.
This example uses ARM Cortex-A CMSIS CRL to generate optimized code, but you can also use the Neon V7 instruction set to generate optimized code. For more information on this workflow, see Use Target Hardware Instruction Set Extensions to Generate SIMD Code from Simulink Blocks for ARM Cortex-A Processors (DSP System Toolbox).
Design IFIR Filter for Lowpass Response
The IFIR filter consists of FIR Decimation (DSP System Toolbox), Discrete FIR Filter, and FIR Interpolation (DSP System Toolbox) blocks. The FIR Decimation block converts the input signal to a lower sampling rate. The Discrete FIR Filter block filters the signal, and the FIR Interpolation block restores the filtered output to the original sampling rate of the input signal.
To open the model, execute the following command:
mdl = 'ifir_example';
open_system(mdl);
Set passband ripple to 0.005 dB, stopband attenuation to 80 dB, interpolation factor to 7, passband edge frequency to 0.1 π rad/sample, and stopband edge frequency to 0.101 π rad/sample.
APass = 0.005; % dB
AStop = 80; % dB
FStop = .101;
M = 7;
F = [.1 FStop];
Use the convertmagunits function to convert the passband ripple and stopband attenuation from dB to the linear scale.
A = [convertmagunits(APass,'db','linear','pass') convertmagunits(AStop,'db','linear','stop')];
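As a rough cross-check of the conversion above, the same values can be computed by hand. This is a sketch only. It assumes that the passband value is a peak-to-peak ripple in dB and that the stopband value is an attenuation relative to unity gain. The names devPass, devStop, and A_manual are illustrative and do not appear in the example model.

```matlab
% Sketch only: hand-computed dB-to-linear conversions under the assumed
% definitions (peak-to-peak passband ripple, stopband attenuation relative
% to unity gain). Variable names here are illustrative.
APass = 0.005;   % passband ripple in dB
AStop = 80;      % stopband attenuation in dB

% Passband: ripple in dB -> maximum linear deviation from unity gain
devPass = (10^(APass/20) - 1) / (10^(APass/20) + 1);

% Stopband: attenuation in dB -> maximum linear stopband magnitude
devStop = 10^(-AStop/20);

A_manual = [devPass devStop];
fprintf('devPass = %.6g, devStop = %.6g\n', devPass, devStop);
```

If the assumed definitions hold, A_manual should agree with A to within floating-point precision. An 80 dB stopband attenuation corresponds to a linear magnitude of 1e-4.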
Use the ifir (DSP System Toolbox) function to get the coefficients of the FIR decimation filter g(z), the FIR filter h(z), and the FIR interpolation filter g(z) for the specified lowpass response parameters. The ifir function designs a periodic filter h(z), which provides the coefficients for the Discrete FIR Filter block. It also designs an image-suppressor filter g(z), which provides the coefficients for the FIR Decimation and FIR Interpolation blocks in the model.
[h,g] = ifir(M,'low',F,A);
The code that computes h(z) and g(z) is set in the PreLoadFcn callback of the model because the FIR Decimation, FIR Interpolation, and Discrete FIR Filter blocks use these coefficients as parameters. To open PreLoadFcn, follow these steps:
In the Simulink Toolstrip, on the Modeling tab, in the Design gallery, click Property Inspector.
With no selection at the top level of the model, on the Properties tab, in the Callbacks section, select PreLoadFcn.
To distinguish the performance metrics of the IFIR filter in the execution profile report, create an atomic subsystem that contains the FIR Decimation, FIR Interpolation, and Discrete FIR Filter blocks. To create the subsystem, select the blocks, right-click, and select Create Subsystem from Selection. To make the subsystem atomic, select the subsystem block, go to the Subsystem Block tab, and click Atomic Subsystem.
Simulate the IFIR model by running these commands, with the simulation time set to 100 s. View the noisy input signal and the interpolated FIR filter output in the spectrum analyzer:
set_param(mdl,'SimulationMode', 'normal');
sim(mdl);
Configure IFIR Simulink Model to Generate Optimized Code
You can configure the Simulink model either interactively, using the Model Configuration Parameters UI, or programmatically, through the MATLAB command line interface.
Configure using Model Configuration Parameters UI
In the Apps tab of the Simulink model toolstrip, click the Embedded Coder app. In the C Code tab that opens, click Settings.
In the Hardware Implementation pane, set the Hardware board parameter to Raspberry Pi.
In the Code Generation pane:
Set the System target file to ert.tlc.
Set the Build configuration to Faster Runs to prioritize execution speed.
Under Code Generation, in the Interface pane, set the code replacement libraries to ARM Cortex-A CMSIS.
To see which block triggers code replacement, you can set the following options under Code Generation, in the Report pane:
Enable Create code generation report.
Enable Open report automatically.
Enable Summarize which blocks triggered code replacements.
Configure using MATLAB Command Line Interface
Alternatively, you can set all the configurations using set_param commands.
Set the Hardware board parameter to Raspberry Pi based on your hardware.
set_param('ifir_example','HardwareBoard','Raspberry Pi');
Select ert.tlc as the system target file to optimize the code for embedded real-time systems, and choose Faster Runs for the build configuration to prioritize execution speed.
set_param(mdl,'SystemTargetFile','ert.tlc');
set_param(mdl,'BuildConfiguration','Faster Runs');
Set the code replacement libraries to ARM Cortex-A CMSIS.
set_param(mdl,'CodeReplacementLibrary','ARM Cortex-A CMSIS');
Configure the code generation report to generate and open automatically, and show blocks that triggered code replacement.
set_param(mdl,'GenerateReport','On');
set_param(mdl,'LaunchReport','On');
set_param(mdl,'GenerateCodeReplacementReport','On');
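To confirm that the settings took effect before you build, you can read them back with get_param. This is a sketch only. It assumes that the model ifir_example is loaded and that the parameter names are the ones used in the set_param commands in this section. The params cell array is an illustrative helper.

```matlab
% Sketch only: read back the configuration parameters set in this section.
% Assumes the model 'ifir_example' is already loaded.
mdl = 'ifir_example';
params = {'HardwareBoard','SystemTargetFile','BuildConfiguration', ...
          'CodeReplacementLibrary','GenerateReport','LaunchReport', ...
          'GenerateCodeReplacementReport'};
for k = 1:numel(params)
    % Print each parameter name next to its current value
    fprintf('%-32s %s\n', params{k}, get_param(mdl, params{k}));
end
```

A value that differs from what you set usually means that the command targeted a different model or configuration set.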
Simulate on Target using SIL/PIL Manager
Use the SIL/PIL Manager app to simulate the code on the target and to get the execution time of the generated code.
Follow these steps to perform PIL simulation:
Go to Apps > SIL/PIL Manager
Set Mode to Automated Verification
Set the SIL/PIL Mode to Processor-in-loop (PIL)
Click Run Verification
In the SIL/PIL Manager tab, click Run SIL/PIL. Once the artifacts are built successfully, you can check the replacements in the code generation report. Alternatively, you can execute the following commands to run the PIL simulation.
set_param(mdl,'SimulationMode', 'processor-in-the-loop (pil)');
sim(mdl);
### Starting build procedure for: ifir_example
### Generating code and artifacts to 'Model specific' folder structure
### Generating code into build folder: C:\Example\ifir_example_ert_rtw
### Invoking Target Language Compiler on ifir_example.rtw
### Using System Target File: Z:\35\balavia.Bembed.j2616326.bash\matlab\rtw\c\ert\ert.tlc
### Loading TLC function libraries
........
### Initial pass through model to cache user defined code
.
### Caching model source code
.....................................................
### Writing header file ifir_example_types.h
### Writing header file ifir_example.h
### Writing source file ifir_example.c
.
### Writing header file rtwtypes.h
### Writing header file Interpolated_FIR.h
### Writing source file Interpolated_FIR.c
### Writing header file ifir_example_private.h
### Writing source file ifir_example_data.c
.
### Writing header file rtmodel.h
### Writing source file ert_main.c
### TLC code generation complete (took 6.256s).
### Saving binary information cache.
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\instrumented\ifir_example.mk' ...
### Building 'ifir_example': make -j$(($(nproc)+1)) -Otarget -f ifir_example.mk buildobj
### Creating HTML report file index.html
### Successful completion of build procedure for: ifir_example
### Simulink cache artifacts for 'ifir_example' were created in 'C:\Example\ifir_example.slxc'.

Build Summary

Top model targets:

Model         Build Reason                     Status                        Build Duration
===========================================================================================
ifir_example  Generated code was out of date.  Code generated and compiled.  0h 0m 53.134s

1 of 1 models built (0 models already up to date)
Build duration: 0h 0m 55.559s
### Connectivity configuration for component "ifir_example": Raspberry Pi ###
### Preparing to start PIL simulation ...
Building with 'Microsoft Visual C++ 2022 (C)'.
MEX completed successfully.
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\coderassumptions\lib\ifir_example_ca.mk' ...
### Building 'ifir_example_ca': make -j$(($(nproc)+1)) -Otarget -f ifir_example_ca.mk all
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\pil\ifir_example.mk' ...
### Building 'ifir_example': make -j$(($(nproc)+1)) -Otarget -f ifir_example.mk all
### Updating code generation report with PIL files ...
### Starting application: 'ifir_example_ert_rtw\pil\ifir_example.elf'
### Launching application ifir_example.elf...
### Host application produced the following standard output (stdout) and standard error (stderr) messages:
You can view the code execution metrics by clicking either Code Profile Analyzer or Code execution profiling report. The ifir_example_step section corresponds to the IFIR subsystem. To compare the performance of the generated code, use the average execution time in ns for ifir_example_step.
Generate Code and Compare Performance
Use this interactive section to compare the performance of the generated code with that of plain C code. From the drop-down list, select either the ARM Cortex-A CMSIS code replacement library or the Neon V7 instruction set extension to optimize the generated code.
compare = "CRL";
comparewith = "None";
To get a better average, set the sample time of the Gaussian Noise block to 1 and the stop time to 10000 so that the function is called 10001 times.
set_param(mdl,'StopTime','10000');
set_param([mdl,'/Gaussian Noise'],'SampTime','1');
set_param(mdl,'FixedStep','1');
if compare == "CRL"
    set_param(mdl,'CodeReplacementLibrary','ARM Cortex-A CMSIS');
else
    set_param(mdl,'InstructionSetExtensions',compare);
    set_param(mdl,'OptimizeReductions','on');
    set_param(mdl,'OptimizationLevel','level2');
    set_param(mdl,'OptimizationPriority','Speed');
end
out = sim(mdl);
### Starting build procedure for: ifir_example
### Generating code and artifacts to 'Model specific' folder structure
### Generating code into build folder: C:\Example\ifir_example_ert_rtw
### Invoking Target Language Compiler on ifir_example.rtw
### Using System Target File: Z:\35\balavia.Bembed.j2616326.bash\matlab\rtw\c\ert\ert.tlc
### Loading TLC function libraries
........
### Initial pass through model to cache user defined code
.
### Caching model source code
...........................................................
### Writing header file ifir_example_types.h
.
### Writing header file ifir_example.h
### Writing source file ifir_example.c
### Writing header file rtwtypes.h
### Writing header file Interpolated_FIR.h
### Writing source file Interpolated_FIR.c
.
### Writing header file ifir_example_private.h
### Writing source file ifir_example_data.c
### Writing header file rtmodel.h
### Writing source file ert_main.c
### TLC code generation complete (took 5.273s).
### Saving binary information cache.
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\instrumented\ifir_example.mk' ...
### Building 'ifir_example': make -j$(($(nproc)+1)) -Otarget -f ifir_example.mk buildobj
### Creating HTML report file index.html
### Successful completion of build procedure for: ifir_example
### Simulink cache artifacts for 'ifir_example' were created in 'C:\Example\ifir_example.slxc'.

Build Summary

Top model targets:

Model         Build Reason                     Status                        Build Duration
===========================================================================================
ifir_example  Generated code was out of date.  Code generated and compiled.  0h 0m 44.992s

1 of 1 models built (0 models already up to date)
Build duration: 0h 0m 46.593s
### Connectivity configuration for component "ifir_example": Raspberry Pi ###
### Preparing to start PIL simulation ...
Building with 'Microsoft Visual C++ 2022 (C)'.
MEX completed successfully.
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\coderassumptions\lib\ifir_example_ca.mk' ...
### Building 'ifir_example_ca': make -j$(($(nproc)+1)) -Otarget -f ifir_example_ca.mk all
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\pil\ifir_example.mk' ...
### Building 'ifir_example': make -j$(($(nproc)+1)) -Otarget -f ifir_example.mk all
### Updating code generation report with PIL files ...
### Starting application: 'ifir_example_ert_rtw\pil\ifir_example.elf'
### Launching application ifir_example.elf...
### Host application produced the following standard output (stdout) and standard error (stderr) messages:
Get the total execution time of the generated code for comparison.
profileSectionIndex = 4;
tcompare = out.get('executionProfile').Sections(profileSectionIndex).TotalExecutionTimeInTicks;
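The section index used above is fixed, so it can help to list the profiled sections and confirm that the index points at the code you intend to measure. This is a sketch only. It assumes that out is the output of the PIL simulation above and that each entry of Sections exposes a Name property. The loop variable and print format are illustrative.

```matlab
% Sketch only: list the profiled sections by index and name.
% Assumes 'out' holds the PIL simulation output from the run above and
% that each section object has a Name property.
executionProfile = out.get('executionProfile');
for k = 1:numel(executionProfile.Sections)
    fprintf('%d: %s\n', k, executionProfile.Sections(k).Name);
end
```

If the listing shows the section of interest at a different position, adjust profileSectionIndex before you compare the timings.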
Set the instruction set extensions and code replacement library to None to generate plain C code.
set_param(mdl,'InstructionSetExtensions','None');
set_param(mdl,'CodeReplacementLibrary','None');
out = sim(mdl);
### Starting build procedure for: ifir_example
### Generating code and artifacts to 'Model specific' folder structure
### Generating code into build folder: C:\Example\ifir_example_ert_rtw
### Invoking Target Language Compiler on ifir_example.rtw
### Using System Target File: Z:\35\balavia.Bembed.j2616326.bash\matlab\rtw\c\ert\ert.tlc
### Loading TLC function libraries
........
### Initial pass through model to cache user defined code
.
### Caching model source code
...........................................................
### Writing header file ifir_example_types.h
### Writing header file ifir_example.h
### Writing source file ifir_example.c
.
### Writing header file rtwtypes.h
### Writing header file Interpolated_FIR.h
### Writing source file Interpolated_FIR.c
### Writing header file ifir_example_private.h
### Writing source file ifir_example_data.c
.
### Writing header file rtmodel.h
### Writing source file ert_main.c
### TLC code generation complete (took 4.804s).
### Saving binary information cache.
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\instrumented\ifir_example.mk' ...
### Building 'ifir_example': make -j$(($(nproc)+1)) -Otarget -f ifir_example.mk buildobj
### Creating HTML report file index.html
### Successful completion of build procedure for: ifir_example
### Simulink cache artifacts for 'ifir_example' were created in 'C:\Example\ifir_example.slxc'.

Build Summary

Top model targets:

Model         Build Reason                     Status                        Build Duration
===========================================================================================
ifir_example  Generated code was out of date.  Code generated and compiled.  0h 0m 29.953s

1 of 1 models built (0 models already up to date)
Build duration: 0h 0m 31.012s
### Connectivity configuration for component "ifir_example": Raspberry Pi ###
### Preparing to start PIL simulation ...
Building with 'Microsoft Visual C++ 2022 (C)'.
MEX completed successfully.
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\coderassumptions\lib\ifir_example_ca.mk' ...
### Building 'ifir_example_ca': make -j$(($(nproc)+1)) -Otarget -f ifir_example_ca.mk all
### Using toolchain: GNU GCC Embedded Linux
### Creating 'C:\Example\ifir_example_ert_rtw\pil\ifir_example.mk' ...
### Building 'ifir_example': make -j$(($(nproc)+1)) -Otarget -f ifir_example.mk all
### Updating code generation report with PIL files ...
### Starting application: 'ifir_example_ert_rtw\pil\ifir_example.elf'
### Launching application ifir_example.elf...
### Host application produced the following standard output (stdout) and standard error (stderr) messages:
Get the execution time of the generated plain C code for comparison.
tcomparewith = out.get('executionProfile').Sections(profileSectionIndex).TotalExecutionTimeInTicks;
close_system(mdl,0);
Compare the performance.
performanceGain = single(tcomparewith) ./ single(tcompare)
performanceGain = single
2.1060
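The gain is a ratio of execution times, so it can also be read as a reduction in run time. The sketch below is plain arithmetic on the value reported above. The names relativeTime and reductionPct are illustrative and are not part of the example.

```matlab
% Sketch only: express the reported speedup as a reduction in execution time.
performanceGain = 2.1060;                  % value reported above
relativeTime = 1 / performanceGain;        % optimized time / plain C time
reductionPct = 100 * (1 - relativeTime);   % percentage of execution time saved
fprintf('Optimized code takes %.1f%% of the plain C time (%.1f%% less).\n', ...
    100*relativeTime, reductionPct);
```

For a gain of 2.1060, the optimized code runs in about 47.5% of the plain C execution time.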
The ARM Cortex-A CMSIS CRL achieves a performance gain of about 2.1x compared to plain C code in the run shown above, while the Neon V7 instruction set extension shows a performance gain of about 1.2x.
Note: This example compares performance in a 32-bit Raspberry Pi environment. The performance numbers may vary in your Raspberry Pi environment.