Benchmarking

(To be removed) Add benchmarking to the generated code

The Benchmarking parameter will be removed in a future release. To measure the performance of generated code, use the gpuPerformanceAnalyzer function instead.
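
As a rough illustration of the recommended replacement, the following sketch profiles a function with gpuPerformanceAnalyzer. The entry-point name myKernelFcn and its input are hypothetical placeholders, and the Config name-value argument is used here on the assumption that it matches your release; check the gpuPerformanceAnalyzer reference page for the exact syntax.

```matlab
% Minimal sketch, not a definitive workflow.
% myKernelFcn is a hypothetical entry-point function on the MATLAB path.
cfg = coder.gpuConfig('mex');      % GPU code configuration object

% Hypothetical example input for the entry-point function.
in = ones(1024,1,'single');

% Generate code, run it, and collect GPU and CPU activity metrics.
% Assumes the Config name-value argument is available in your release.
gpuPerformanceAnalyzer('myKernelFcn',{in},Config=cfg);
```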

Description

App Configuration Pane: GPU Code

Configuration Objects: coder.GpuCodeConfig

The Benchmarking parameter controls the addition of benchmarking code to the generated CUDA® code.

When you run the generated code, the benchmarking code writes a comma-separated values (CSV) file named gpuTimingData to the current working folder. The CSV file contains timing data for kernel, memory, and other events. This table describes the format of each row in the CSV file.

Event Type: CUDA kernels

Format: <name_N>,<block dimension>,<grid dimension>,<execution time in ms>,<name of parent>

N indicates the Nth execution of the kernel. The block dimension value is the total block dimension, that is, the product of the three block dimensions. For example, if the block dimension is dim3(32,32,32), then the block dimension value is 32768 (32 × 32 × 32).

Event Type: CUDA memory copy

Format: <name_N>,<memory copy size>,<execution time in ms>,<IO flag>,<name of parent>

N indicates the Nth execution of the memory copy.

Event Type: Miscellaneous

Format: <name_N>,<execution time in ms>,<name of parent>

N indicates the Nth execution of the operation.

Settings

off (default) | on

Off

Does not generate CUDA code with benchmarking functionality.

On

Generates CUDA code with benchmarking functionality. This option uses CUDA event APIs such as cudaEvent to time kernels, memory copies, and other events.

Programmatic Use

Property: Benchmarking
Values: true | false
Default: false
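
The property can be set on the GPU configuration object before code generation. The following is a minimal sketch that assumes a MEX target. The entry-point name myKernelFcn, its input size, and the generated MEX name myKernelFcn_mex are hypothetical placeholders.

```matlab
% Minimal sketch, assuming a MEX target.
% myKernelFcn is a hypothetical entry-point function on the MATLAB path.
cfg = coder.gpuConfig('mex');        % code configuration object for GPU Coder

% cfg.GpuConfig is a coder.GpuCodeConfig object.
% Enable benchmarking code (this parameter is scheduled for removal).
cfg.GpuConfig.Benchmarking = true;

% Generate CUDA MEX code for a hypothetical 1024-by-1 single input.
codegen -config cfg -args {ones(1024,1,'single')} myKernelFcn

% Run the generated MEX function. The benchmarking code then writes the
% gpuTimingData CSV file to the current working folder.
out = myKernelFcn_mex(ones(1024,1,'single'));
```

Because the parameter will be removed, treat this as a pattern for existing scripts rather than for new work.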

Version History

Introduced in R2018a