Data type (TensorRT)
Inference computation precision
Description
App Configuration Pane: Deep Learning
Configuration Objects: coder.TensorRTConfig
Specify the precision of the inference computations in supported layers. When performing inference in 32-bit floats, use 'fp32'. For half-precision, use 'fp16'. For 8-bit integer, use 'int8'. The default value is 'fp32'.
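For example, you can select the precision programmatically by setting the DataType property of the TensorRT deep learning configuration object before code generation (a minimal sketch; the choice of MEX output is an assumption):

    cfg = coder.gpuConfig('mex');                                  % GPU code configuration object
    cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt'); % coder.TensorRTConfig object
    cfg.DeepLearningConfig.DataType = 'fp16';                      % half-precision inference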
INT8 precision requires a CUDA® GPU with a compute capability of 6.1, or 7.0 and higher. A compute capability of 6.2 does not support INT8 precision. FP16 precision requires a CUDA GPU with a compute capability of 5.3, 6.0, 6.2, or higher. Use the ComputeCapability property of the GpuConfig object to set the appropriate compute capability value.
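Continuing the sketch above, you can set the compute capability on the code configuration object; the value '6.1' assumes you are targeting INT8 inference:

    cfg.GpuConfig.ComputeCapability = '6.1';  % minimum compute capability for INT8 precision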
See the Deep Learning Prediction with NVIDIA TensorRT Library example, which demonstrates 8-bit integer prediction for a logo classification network by using TensorRT.
Dependencies
To enable this parameter, you must set Deep learning library to TensorRT.
Settings
fp32
This is the default setting.
Inference computation is performed in 32-bit floats.
fp16
Inference computation is performed in 16-bit floats.
int8
Inference computation is performed in 8-bit integers.
Programmatic Use
Property: DataType
Values: 'fp32' | 'fp16' | 'int8'
Default: 'fp32'
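A minimal end-to-end sketch follows. The entry-point function myPredict, the input size, and the calibration settings are hypothetical placeholders; for 'int8', TensorRT also needs calibration data, supplied here through the DataPath and NumCalibrationBatches properties of coder.TensorRTConfig:

    cfg = coder.gpuConfig('mex');
    cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
    cfg.DeepLearningConfig.DataType = 'int8';               % 8-bit integer inference
    cfg.DeepLearningConfig.DataPath = 'calibrationImages';  % hypothetical folder of calibration images
    cfg.DeepLearningConfig.NumCalibrationBatches = 50;      % hypothetical batch count for calibration
    % myPredict is a hypothetical entry-point function that calls predict on a network
    codegen -config cfg myPredict -args {ones(224,224,3,'single')}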
Version History
Introduced in R2018b