
Synthesis Benchmark of Common Native Floating Point Operators

This example shows how to access and generate synthesis benchmarks for common native floating-point operators with the Xilinx® Vivado® and Intel® Quartus® tools.

Access Generated Synthesis Results from a MAT File

Perform synthesis and timing analysis on common operators with the Xilinx Vivado and Intel Quartus tools. These operators include basic math operators such as addition and subtraction, as well as more complex operators such as log, sin, cos, and atan. In this example, you configure the Simulink® models for synthesis in native floating-point mode with the single-precision floating-point type, and use different latency and mantissa multiplication strategies.

In this example, you use benchmarking data that has already been generated for the hdl_nfp_single_ops_benchmark model. The benchmarking data is in the hdlcoder_synthesis_benchmarkNFP MAT file. The MAT file contains the data for subsystems whose names start with op_. Each subsystem corresponds to a device under test (DUT).

Open the Simulink model.

addpath(fullfile(matlabroot, 'toolbox', 'hdlcoder', 'hdlcoder', 'hdlutils', 'hdlBenchMarking'))
open_system('hdl_nfp_single_ops_benchmark')

To regenerate the synthesis results that the MAT file contains:

1. Make a copy of the Simulink model in a directory for which you have write permission.

2. Run the runBenchmarkNFP function by running these commands.

addpath(fullfile(matlabroot, 'toolbox', 'hdlcoder', 'hdlcoder', 'hdlutils', 'hdlBenchMarking'))
Results = runBenchmarkNFP;

Generating synthesis benchmarks with the hdl_nfp_single_ops_benchmark model takes several days.

To view the synthesis results, load the MAT file that contains them:

load('hdlcoder_synthesis_benchmarkNFP.mat')
NFPSynthesisResults
NFPSynthesisResults = 

  struct with fields:

     Vivado: [1×1 struct]
    Quartus: [1×1 struct]

The MAT file contains a structure named NFPSynthesisResults, which has two fields. The Vivado field contains the synthesis results from Xilinx Vivado. The Quartus field contains the synthesis results from Intel Quartus. View the synthesis results for the Vivado tool.

NFPSynthesisResults.Vivado
ans = 

  struct with fields:

                          Auto: [1×1 struct]
                FullMultiplier: [1×1 struct]
    PartMultiplierPartAddShift: [1×1 struct]
      NoMultiplierFullAddShift: [1×1 struct]

NFPSynthesisResults.Vivado and NFPSynthesisResults.Quartus contain synthesis results for different mantissa multiply strategies:

  • Auto - Structure of synthesis results using Auto mantissa multiply strategy

  • FullMultiplier - Structure of synthesis results using Full Multiplier implementation

  • PartMultiplierPartAddShift - Structure of synthesis results using Part Multiplier Part AddShift implementation

  • NoMultiplierFullAddShift - Structure of synthesis results using No Multiplier Full AddShift implementation

For more information, see Mantissa Multiplier Strategy.

Each of these structures contains synthesis results for different latency strategies.

NFPSynthesisResults.Vivado.Auto
ans = 

  struct with fields:

       MaxLatency: [14×13 table]
       MinLatency: [14×13 table]
      ZeroLatency: [14×13 table]
    CustomLatency: [58×13 table]

  • MaxLatency - Table of synthesis results using the MAX latency strategy

  • MinLatency - Table of synthesis results using the MIN latency strategy

  • ZeroLatency - Table of synthesis results using the ZERO latency strategy

  • CustomLatency - Table of synthesis results using the latency values from zero to the maximum latency
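The nesting described above (tool, then mantissa multiply strategy, then latency strategy) can be walked with dynamic field names. The following is a minimal sketch that uses a small stand-in structure with empty placeholder tables so that it runs without the MAT file. The variable mock and its contents are assumptions made for illustration; with the real data loaded, NFPSynthesisResults would take its place.

```matlab
% Stand-in for NFPSynthesisResults: same field names, empty placeholder tables.
tools    = {'Vivado','Quartus'};
mantissa = {'Auto','FullMultiplier','PartMultiplierPartAddShift','NoMultiplierFullAddShift'};
latency  = {'MaxLatency','MinLatency','ZeroLatency','CustomLatency'};

mock = struct();
for t = 1:numel(tools)
    for m = 1:numel(mantissa)
        for k = 1:numel(latency)
            mock.(tools{t}).(mantissa{m}).(latency{k}) = ...
                table(zeros(0,1), 'VariableNames', {'Fmax'});
        end
    end
end

% Walk every tool / mantissa strategy / latency strategy combination.
count = 0;
toolNames = fieldnames(mock);
for t = 1:numel(toolNames)
    strategies = fieldnames(mock.(toolNames{t}));
    for m = 1:numel(strategies)
        latNames = fieldnames(mock.(toolNames{t}).(strategies{m}));
        for k = 1:numel(latNames)
            T = mock.(toolNames{t}).(strategies{m}).(latNames{k});
            fprintf('%s.%s.%s: %d rows\n', ...
                toolNames{t}, strategies{m}, latNames{k}, height(T));
            count = count + 1;
        end
    end
end
```

Using fieldnames rather than hard-coded lists in the second loop means the same traversal keeps working if a benchmark run covered only some tools or strategies.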

NFPSynthesisResults.Vivado.Auto.CustomLatency and NFPSynthesisResults.Quartus.Auto.CustomLatency each contain the performance and resource utilization of the floating-point operators for latency values from zero to the maximum latency with the Auto mantissa multiply strategy. View the complete list of performance and resource utilization results for Xilinx Vivado:

NFPSynthesisResults.Vivado.Auto.CustomLatency
ans =

  58×13 table

                                                     Fmax     Slices    SliceRegs    LUTs    DSPs    RAMs    URAMs    Latency    DataPathDelay    Slack     LogicLevels    LogicDelay    RouteDelay
                                                    ______    ______    _________    ____    ____    ____    _____    _______    _____________    ______    ___________    __________    __________

    add_sys_vivado_Auto_latency0                    83.928     180          96       593      0       0        0         0          11.887        -9.915        35           3.042         8.845   
    add_sys_vivado_Auto_latency1                    167.64     149         127       498      0       0        0         0           5.952        -3.965        22           2.219         3.733   
    add_sys_vivado_Auto_latency2                    221.09     150         208       444      0       0        0         1           4.393        -2.523        13           1.512         2.881   
    add_sys_vivado_Auto_latency3                    267.52     171         281       508      0       0        0         2           3.741        -1.738         8           0.603         3.138   
    add_sys_vivado_Auto_latency4                    311.82     186         340       524      0       0        0         3           3.179        -1.207         7            0.97         2.209   
    add_sys_vivado_Auto_latency5                    306.75     162         387       444      0       0        0         4            3.23         -1.26        10           0.898         2.332   
    add_sys_vivado_Auto_latency6                    311.14     191         436       536      0       0        0         5           3.218        -1.214         9            0.95         2.267   
    add_sys_vivado_Auto_latency7                    314.07     201         502       542      0       0        0         6            3.18        -1.184         6           0.714         2.466   
    add_sys_vivado_Auto_latency8                    365.23     184         563       483      0       0        0         7           2.709        -0.738         7           0.931         1.778   
    add_sys_vivado_Auto_latency9                    406.67     203         562       541      0       0        0         8            2.43        -0.459         6            0.91          1.52   
    add_sys_vivado_Auto_latency10                   399.84     191         646       506      0       0        0         9           2.473        -0.501         8           1.014         1.459   
    add_sys_vivado_Auto_latency11                   457.25     205         668       521      0       0        0        10           2.191        -0.187         5            0.62         1.571   
    sub_sys_vivado_Auto_latency0                    86.252     166          96       573      0       0        0         0          11.566        -9.594        38            3.36         8.206   
    sub_sys_vivado_Auto_latency1                    164.74     152         127       499      0       0        0         0           6.084         -4.07        23           2.105         3.979   
    sub_sys_vivado_Auto_latency2                    202.63     134         208       444      0       0        0         1           4.569        -2.935        13           1.164         3.405   
    sub_sys_vivado_Auto_latency3                    262.47     171         281       504      0       0        0         2           3.781         -1.81         8           0.603         3.178   
    sub_sys_vivado_Auto_latency4                    320.72     187         341       527      0       0        0         3           3.034        -1.118         9           0.854          2.18   
    sub_sys_vivado_Auto_latency5                    287.52     175         387       445      0       0        0         4           3.449        -1.478         8            0.77         2.679   
    sub_sys_vivado_Auto_latency6                    322.27     201         438       541      0       0        0         5           3.104        -1.103         9           1.272         1.832   
    sub_sys_vivado_Auto_latency7                    304.79     196         510       544      0       0        0         6           3.253        -1.281        13           1.456         1.797   
    sub_sys_vivado_Auto_latency8                     391.7     195         562       483      0       0        0         7           2.524        -0.553         7           0.952         1.572   
    sub_sys_vivado_Auto_latency9                    417.01     208         562       543      0       0        0         8            2.37        -0.398         6            0.89          1.48   
    sub_sys_vivado_Auto_latency10                   351.49     194         647       506      0       0        0         9           2.817        -0.845         9           1.053         1.764   
    sub_sys_vivado_Auto_latency11                   491.88     203         668       522      0       0        0        10           1.935        -0.033         4            0.53         1.405   
    singletonumerictype_sys_vivado_Auto_latency0    218.87     104          64       345      0       0        0         0            4.54        -2.569        16           1.418         3.122   
    singletonumerictype_sys_vivado_Auto_latency1     775.8      62          66       222      0       0        0         0           1.259         0.711         3           0.445         0.814   
    singletonumerictype_sys_vivado_Auto_latency2    465.12     107          99       346      0       0        0         1           2.154         -0.15         4           0.431         1.723   
    singletonumerictype_sys_vivado_Auto_latency3    461.89     110         157       365      0       0        0         2           2.136        -0.165         3           0.352         1.784   
    singletonumerictype_sys_vivado_Auto_latency4    462.32     116         189       366      0       0        0         3           2.134        -0.163         4           0.431         1.703   
    singletonumerictype_sys_vivado_Auto_latency5     468.6     109         222       343      0       0        0         4           2.106        -0.134         5           0.438         1.668   
    singletonumerictype_sys_vivado_Auto_latency6    459.35     113         211       366      0       0        0         5           2.178        -0.177         5           0.438          1.74   
    mul_sys_vivado_Auto_latency0                    103.21     135          96       415      1       0        0         0           9.692        -7.689        23           4.765         4.927   
    mul_sys_vivado_Auto_latency1                    218.77     147         164       377      1       0        0         0           1.984        -2.571         4           0.431         1.553   
    mul_sys_vivado_Auto_latency2                    406.17     144         260       383      1       0        0         1           2.083        -0.462         3           0.414         1.669   
    mul_sys_vivado_Auto_latency3                    262.12     153         314       401      1       0        0         2           3.472        -1.815        17           1.814         1.658   
    mul_sys_vivado_Auto_latency4                    336.81     160         357       409      1       0        0         3           2.602        -0.969        14           1.286         1.316   
    mul_sys_vivado_Auto_latency5                    439.56     172         443       386      1       0        0         4           1.895        -0.275         3           0.388         1.507   
    mul_sys_vivado_Auto_latency6                    261.92     166         448       391      1       0        0         5           1.231        -1.818         1           0.266         0.965   
    mul_sys_vivado_Auto_latency7                    455.37     163         543       391      1       0        0         6           0.566        -0.196         0           0.223         0.343   
    mul_sys_vivado_Auto_latency8                    472.81     178         673       389      1       0        0         7           1.966        -0.115        11           1.095         0.871   
    round_sys_vivado_Auto_latency0                  310.46      67          64       195      0       0        0         0           3.191        -1.221         9           1.031          2.16   
    round_sys_vivado_Auto_latency1                  339.44      74         127       205      0       0        0         0            2.95        -0.946         9           1.056         1.894   
    round_sys_vivado_Auto_latency2                  542.01      72         116       212      0       0        0         1           1.817         0.155         5           0.474         1.343   
    round_sys_vivado_Auto_latency3                  350.02      73         160       180      0       0        0         2           2.828        -0.857        11           1.193         1.635   
    round_sys_vivado_Auto_latency4                  394.63      92         246       219      0       0        0         3           2.505        -0.534        11           1.206         1.299   
    round_sys_vivado_Auto_latency5                  534.19      93         311       214      0       0        0         4           1.876         0.128         4           0.531         1.345   
    ceil_sys_vivado_Auto_latency0                   315.66      51          64       165      0       0        0         0            3.14        -1.168        11           1.514         1.626   
    ceil_sys_vivado_Auto_latency1                   412.71      66         126       188      0       0        0         0           2.394        -0.423         7            0.96         1.434   
    ceil_sys_vivado_Auto_latency2                   544.66      69         115       189      0       0        0         1           1.808         0.164         4           0.395         1.413   
    ceil_sys_vivado_Auto_latency3                   414.42      81         159       162      0       0        0         2           2.384        -0.413        11           1.442         0.942   
    ceil_sys_vivado_Auto_latency4                   424.81      85         244       194      0       0        0         3           2.324        -0.354        11           1.203         1.121   
    ceil_sys_vivado_Auto_latency5                   514.14      84         310       203      0       0        0         4           1.916         0.055         4           0.497         1.419   
    floor_sys_vivado_Auto_latency0                  312.89      53          64       165      0       0        0         0           3.168        -1.196        14           1.641         1.527   
    floor_sys_vivado_Auto_latency1                  411.69      61         125       187      0       0        0         0           2.401        -0.429         7           0.924         1.477   
    floor_sys_vivado_Auto_latency2                  515.46      64         115       189      0       0        0         1           1.912          0.06         4           0.496         1.416   
    floor_sys_vivado_Auto_latency3                  398.57      73         158       162      0       0        0         2            2.48        -0.509        12           1.584         0.896   
    floor_sys_vivado_Auto_latency4                  454.34      91         243       194      0       0        0         3           2.172        -0.201        11            1.22         0.952   
    floor_sys_vivado_Auto_latency5                  542.89      96         308       203      0       0        0         4           1.814         0.158         4           0.395         1.419   
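Because each result set is a MATLAB table with one row per DUT and latency setting, ordinary table indexing is enough to compare configurations. The sketch below rebuilds six rows of the table above by hand (values copied from the add and mul rows) so that it runs without the MAT file; with the real data, T would be NFPSynthesisResults.Vivado.Auto.CustomLatency. The last step is an observation rather than documented behavior: for the rows shown, Fmax appears consistent with 1000/(period - Slack) in MHz, assuming the data was generated with the default 500 MHz target frequency (a 2 ns period).

```matlab
% A few rows copied from the CustomLatency table above (Vivado, Auto strategy).
names = {'add_sys_vivado_Auto_latency0'
         'add_sys_vivado_Auto_latency4'
         'add_sys_vivado_Auto_latency11'
         'mul_sys_vivado_Auto_latency0'
         'mul_sys_vivado_Auto_latency5'
         'mul_sys_vivado_Auto_latency8'};
Fmax  = [83.928; 311.82; 457.25; 103.21; 439.56; 472.81];   % MHz
LUTs  = [593; 524; 521; 415; 386; 389];
DSPs  = [0; 0; 0; 1; 1; 1];
Slack = [-9.915; -1.207; -0.187; -7.689; -0.275; -0.115];   % ns
T = table(Fmax, LUTs, DSPs, Slack, 'RowNames', names);

% Select the rows for one operator by row-name prefix.
isAdd   = startsWith(T.Properties.RowNames, 'add_');
addRows = T(isAdd, :);

% Find the latency setting with the highest Fmax for that operator.
[bestFmax, idx] = max(addRows.Fmax);
bestName = addRows.Properties.RowNames{idx};

% Keep only configurations that reach at least 400 MHz.
fast = T(T.Fmax >= 400, :);

% Assumed relation (not stated in the source): Fmax = 1000/(period - Slack),
% with the period taken from the default 500 MHz target frequency.
periodNs = 1000/500;
estFmax  = 1000 ./ (periodNs - T.Slack);
maxError = max(abs(estFmax - T.Fmax));
```

If the data was generated with a different TargetFrequency, periodNs would need to change accordingly, and the relation should be rechecked against the table before relying on it.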

Generate Synthesis Results

To generate synthesis results for other Simulink models or different settings, you can pass input arguments to the runBenchmarkNFP function. Specify name-value arguments, where name is the argument name and value is the corresponding value.

  • OpList - List of operators, specified as a cell array of character vectors. The default value is {'ADDSUB','CONVERT','COS','MINMAX','MUL','MULTADD','ROUNDING','POW'}.

  • TargetFrequency - Target frequency in MHz. The default value is 500 MHz.

  • HardwareTool - Synthesis tools for which to generate results, specified as a cell array of character vectors. The default value is {'Vivado','Quartus'}.

  • LatencyStrategy - Latency strategy, specified as a cell array of character vectors. The default value is {'Min','Max','Zero','Custom'}.

  • MantissaMultiplyStrategy - Mantissa multiply strategy, specified as a cell array of character vectors. The default value is {'Auto','FullMultiplier','PartMultiplierPartAddShift','NoMultiplierFullAddShift'}.

  • DataType - Data type used for benchmarking, specified as a character vector. The default value is 'single'.

  • VivadoHardwareDetails or QuartusHardwareDetails - Hardware details, specified as a structure containing these five fields:

      • Tool - Synthesis tool name, specified as a character vector.

      • ChipFamily - Chip family, specified as a character vector.

      • DeviceName - Device name, specified as a character vector.

      • PackageName - Package name, specified as a character vector.

      • SpeedValue - Hardware speed, specified as a character vector.

For example, set the hardware details for Xilinx Vivado and Altera Quartus II.

vivado_hardware_details.Tool = 'Xilinx Vivado';
vivado_hardware_details.ChipFamily = 'Virtex7';
vivado_hardware_details.DeviceName = 'xc7v2000t';
vivado_hardware_details.PackageName = 'fhg1761';
vivado_hardware_details.SpeedValue = '-2';
vivado_hardware_details =
     struct with fields:
              Tool: 'Xilinx Vivado'
        ChipFamily: 'Virtex7'
        DeviceName: 'xc7v2000t'
       PackageName: 'fhg1761'
        SpeedValue: '-2'
quartus_hardware_details.Tool = 'Altera QUARTUS II';
quartus_hardware_details.ChipFamily = 'Stratix V';
quartus_hardware_details.DeviceName = '5SGSMD4E1H29C1';
quartus_hardware_details.PackageName = '';
quartus_hardware_details.SpeedValue = '';
quartus_hardware_details =
     struct with fields:
              Tool: 'Altera QUARTUS II'
        ChipFamily: 'Stratix V'
        DeviceName: '5SGSMD4E1H29C1'
       PackageName: ''
        SpeedValue: ''
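Before starting a long benchmark run, it can be worth confirming that a hardware details structure has all five fields listed above and that each one is a character vector. The following is a minimal sketch of such a check; the variable names hw, missing, and notChar are illustrative, and the check itself is not part of runBenchmarkNFP.

```matlab
% Field names taken from the list of hardware detail fields above.
requiredFields = {'Tool','ChipFamily','DeviceName','PackageName','SpeedValue'};

% Example structure using the Vivado values shown above.
hw = struct('Tool',        'Xilinx Vivado', ...
            'ChipFamily',  'Virtex7', ...
            'DeviceName',  'xc7v2000t', ...
            'PackageName', 'fhg1761', ...
            'SpeedValue',  '-2');

% Fields that are required but absent.
missing = setdiff(requiredFields, fieldnames(hw)');

% Fields that are present but not character vectors.
present = intersect(requiredFields, fieldnames(hw)');
notChar = present(~cellfun(@(f) ischar(hw.(f)), present));

if ~isempty(missing)
    fprintf('Missing fields: %s\n', strjoin(missing, ', '));
end
if ~isempty(notChar)
    fprintf('Fields that are not character vectors: %s\n', strjoin(notChar, ', '));
end
```

Empty character vectors such as the PackageName and SpeedValue entries in the Quartus example pass this check, because ischar('') returns true.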

Call the runBenchmarkNFP function for the specified name-value arguments.

runNFPBenchmarkResults = runBenchmarkNFP('OpList', {'ADDSUB','SQRT','UMINUS'},...
                                         'TargetFrequency',500,...
                                         'VivadoHardwareDetails', vivado_hardware_details,...
                                         'QuartusHardwareDetails', quartus_hardware_details,...
                                         'HardwareTool',{'Vivado','Quartus'},...
                                         'LatencyStrategy',{'Min','Max','Zero','Custom'},...
                                         'MantissaMultiplyStrategy',{'Auto','FullMultiplier',...
                                         'PartMultiplierPartAddShift','NoMultiplierFullAddShift'},...
                                         'DataType', 'single');

Customize Synthesis Results

Because generating synthesis benchmarks for the full hdl_nfp_single_ops_benchmark model takes several days, you can benchmark only a subset of the operators by listing them in the OpList argument. Operators such as ADDSUB, CONVERT, ROUNDING, and MINMAX generate separate blocks for different operations. For example, when you specify the MINMAX operator in OpList, the function generates synthesis results for both the Min and Max operations.
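A reduced run along these lines might look like the following sketch. It uses only the name-value arguments documented above, with values chosen for illustration, and assumes that any argument left out falls back to its default. The runNow flag is a hypothetical guard added here so that the snippet can be read and executed without starting a synthesis run.

```matlab
% Restrict the sweep to one operator group, one tool, and two latency strategies.
args = {'OpList',                   {'MINMAX'}, ...
        'HardwareTool',             {'Vivado'}, ...
        'LatencyStrategy',          {'Min','Max'}, ...
        'MantissaMultiplyStrategy', {'Auto'}};

% Set to true only on a machine with HDL Coder and the synthesis tool set up.
runNow = false;
if runNow
    results = runBenchmarkNFP(args{:});
end
```

Collecting the arguments in a cell array first makes it easy to keep several reduced configurations side by side and pass whichever one is needed with args{:}.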

To generate benchmark synthesis results with your own model using the runBenchmarkNFP function, ensure that:

  • Each DUT or subsystem name starts with op_.

  • For custom latency benchmarking, each DUT name contains the characters required to get the list of operators that support custom latency. See the purgeDutList function in runBenchmarkNFP_Vivado and runBenchmarkNFP_Quartus. To view the list of blocks that support custom latency, see Latency Values of Floating-Point Operators.
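The naming requirement can be checked before a run with a simple prefix test. The sketch below uses a hypothetical list of subsystem names so that it runs without Simulink; in a real model, the names could instead be gathered with find_system and get_param, assuming the DUT subsystems sit at the top level of the model.

```matlab
% Hypothetical subsystem names, for illustration only.
% With a model open, one option (assuming top-level DUTs) would be:
%   blocks   = find_system(modelName, 'SearchDepth', 1, 'BlockType', 'SubSystem');
%   dutNames = get_param(blocks, 'Name')';
dutNames = {'op_add', 'op_mul', 'sqrt_dut', 'op_round'};

% Flag every name that does not start with the required op_ prefix.
isValid  = startsWith(dutNames, 'op_');
badNames = dutNames(~isValid);

if ~isempty(badNames)
    fprintf('Rename these subsystems so that they start with op_: %s\n', ...
        strjoin(badNames, ', '));
end
```

This check covers only the op_ prefix. Whether a name also satisfies the custom latency requirement depends on the purgeDutList logic, which is not reproduced here.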
