Data type (TensorRT)

Inference computation precision

Description

App Configuration Pane: Deep Learning

Configuration Objects: coder.TensorRTConfig

Specify the precision of inference computations in supported layers. To perform inference in 32-bit floating point, use 'fp32'. For half precision, use 'fp16'. For 8-bit integer, use 'int8'. The default value is 'fp32'.
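
For example, a minimal sketch of setting this precision on the TensorRT deep learning configuration object; the 'fp16' choice here is only illustrative:

% Create a TensorRT deep learning configuration object.
dlcfg = coder.DeepLearningConfig('tensorrt');
% Select the inference precision: 'fp32' (default), 'fp16', or 'int8'.
dlcfg.DataType = 'fp16';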

INT8 precision requires a CUDA® GPU with compute capability 6.1, or 7.0 and higher; compute capability 6.2 does not support INT8 precision. FP16 precision requires a CUDA GPU with compute capability 5.3, 6.0, or 6.2 and higher. Use the ComputeCapability property of the GpuConfig code configuration object to set the appropriate compute capability value.
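
A sketch of setting the compute capability alongside INT8 precision; the MEX build target and the value '6.1' are assumptions for illustration, so use the value that matches your hardware:

% GPU code configuration object (build target is an assumption).
cfg = coder.gpuConfig('mex');
% Match the compute capability to your GPU; '6.1' is illustrative.
cfg.GpuConfig.ComputeCapability = '6.1';
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'int8';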

For an example that performs 8-bit integer prediction for a logo classification network by using TensorRT, see the Deep Learning Prediction with NVIDIA TensorRT Library example.

Dependencies

To enable this parameter, you must set Deep learning library to TensorRT.

Settings

fp32

This is the default setting.

Inference computation is performed in 32-bit floats.

fp16

Inference computation is performed in 16-bit floats.

int8

Inference computation is performed in 8-bit integers.

Programmatic Use

Property: DataType
Values: 'fp32' | 'fp16' | 'int8'
Default: 'fp32'
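
A sketch of an end-to-end programmatic workflow; the entry-point function name, input size, and build target are hypothetical placeholders:

% Code configuration object (the 'dll' target is an assumption).
cfg = coder.gpuConfig('dll');
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'fp16';
% 'myNet_predict' and the 224-by-224-by-3 input are placeholders.
codegen -config cfg myNet_predict -args {ones(224,224,3,'single')}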

Version History

Introduced in R2018b