
Deep Learning Toolbox Model Quantization Library

Quantize and Compress Deep Learning models


Updated 10 Mar 2021

Deep Learning Toolbox Model Quantization Library enables you to quantize and compress your deep learning models. It provides instrumentation services that collect layer-level data on the weights, activations, and intermediate computations during the calibration step. Using this instrumentation data, the library/add-on quantizes your model and provides metrics to validate the accuracy of the quantized network.

The library/add-on enables an iterative workflow for refining the quantization approach until it meets the required accuracy, and it provides heuristics for choosing the right quantization strategy.
You can validate the quantized network and compare its accuracy against the single-precision baseline.

The library/add-on provides a Quantization app that lets you analyze and visualize the instrumentation data, so you can understand the accuracy tradeoff of quantizing the weights and biases of selected layers.
The library/add-on supports INT8 quantization for FPGAs and NVIDIA GPUs, for supported layers.
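As a rough sketch of this workflow (assuming a pretrained network `net` and calibration/validation datastores `calData` and `valData` that you supply), the calibrate-then-validate loop looks like:

```matlab
% Sketch of the quantization workflow; `net`, `calData`, and `valData`
% are placeholders for your own network and datastores.
quantObj = dlquantizer(net, 'ExecutionEnvironment', 'GPU');

% Calibration step: instrument the network and collect layer-level
% dynamic ranges on the calibration data
calResults = calibrate(quantObj, calData);

% Validation step: compare the quantized network against the
% single-precision baseline
valResults = validate(quantObj, valData);
valResults.MetricResults   % accuracy metrics for the quantized network
```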

Please refer to the documentation here:

This support package is functional for R2020a and later releases. Quantization of a neural network targeting GPUs requires the GPU Coder™ Interface for Deep Learning Libraries support package. R2020b adds support for quantizing a neural network targeting FPGAs, which requires Deep Learning HDL Toolbox™.

Quantization Workflow Prerequisites can be found on this page:

If you have download or installation problems, please contact Technical Support -

Comments and Ratings (20)

Ali Al-Saegh

I am facing the same issue that yang li mentioned on 5 Dec 2020.
After quantizing my network and exporting it as a dlquantizer object, I see in that object the same network as before quantization. How can I get the quantized version of the network?

Burak Can Toker

Hi Yukui Luo -
A prerequisite for supporting FPGAs is MATLAB® Coder™ Interface for Deep Learning Libraries.

Yukui Luo

I get an error while setting the execution environment to 'FPGA':
dlquantObj = dlquantizer(snet,'ExecutionEnvironment','FPGA');
and I got:
Error using dlquantizer
Unable to resolve the name dltargets.mkldnn.SupportedLayerImpl.m_sourceFiles.
Can anybody help?

yang li

When I use the Deep Learning Toolbox Model Quantization Library to quantize a network, how do I get the network weights in the int8 data type?
From my observation, the weights of the exported network are still single precision, the same as the network parameters before quantization.

Even after I use TensorRT to generate CUDA code and read the bin file back, the weights are still single precision.

So, did I miss some detail needed to get the network weights in the int8 data type?
Thank you


Does this library/add-on support INT8 quantization for the Jetson Nano? The Jetson Nano has compute capability 5.3, so is it unsupported? The documentation I refer to says compute capability >= 6.1 is required.
Another question: when will importing and exporting calibrated, quantized ONNX models (opset version 10+), such as a quantized ResNet-50, be supported? Link: Extraction code: afba



To Address Invalid GPU Execution Environment Error:

Quantization of a neural network requires a GPU, the GPU Coder™ Interface for Deep Learning Libraries support package, and the Deep Learning Toolbox™ Model Quantization Library support package.
Using a GPU requires a CUDA®-enabled NVIDIA® GPU with compute capability 6.1 or higher, excluding 6.2.
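A quick way to check that requirement is to query the selected device with `gpuDevice` (a sketch; requires Parallel Computing Toolbox):

```matlab
% Check the compute capability of the currently selected GPU
info = gpuDevice;
cc = str2double(info.ComputeCapability);
if cc < 6.1 || cc == 6.2
    warning('Compute capability %.1f is not supported for INT8 quantization.', cc);
end
```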

If you still run into an "Invalid execution environment error" even with the above dependencies satisfied, it's likely that the requisite environments for cuDNN and TensorRT are not set correctly. To check, execute coder.checkGPUInstall in the command window.
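One way to exercise the deep learning library checks is to pass a configuration object to coder.checkGPUInstall (a sketch; set DeepLibTarget to 'cudnn' or 'tensorrt' depending on which library you target):

```matlab
% Verify the CUDA, cuDNN, and TensorRT setup for GPU code generation
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'tensorrt';   % or 'cudnn'
envCfg.DeepCodegen = 1;              % also test deep learning code generation
coder.checkGPUInstall(envCfg);
```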

Addressing invalid cuDNN and TensorRT environments can be found in the answer by Jaya here:

As always, if there are any additional issues, please contact Technical Support -

Ashwathi Nambiar

Denis Navarro

After I installed the support package, the command calResults = calibrate(quantObj, aug_calData) returns:

Error using dlquantization.instrument
The value of 'executionEnvironment' is invalid. No GPU available. dlquantizer requires a GPU machine to quantize a network object.
Error in dlquantizer/calibrate (line 25)
results = dlquantization.instrument(obj.NetworkObject, obj.DLAccelData,'BatchSize',p.Results.batchSize,'MiniBatchSize',p.Results.miniBatchSize,'ExecutionEnvironment',obj.ExecutionEnvironment);

But I have a GPU.

CUDADevice with properties:

Name: 'GeForce GTX 1080 Ti'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 10.2000
ToolkitVersion: 10.1000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 1.1811e+10
AvailableMemory: 9.9493e+09
MultiprocessorCount: 28
ClockRateKHz: 1683000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1


Hi Yang,
Thanks for your feedback! Based on the error message you provided, this is a generic error reported from an incorrect function call, and it is hard to locate the issue without reproduction steps.
We recently updated the documentation. Please see whether it helps:
Can the same network run successfully in Deep Network Designer App?
Please contact our Technical Support and report the issue. We will follow up with you directly.

yang li

I am quantizing a model of my own. The network model comes from the ONNX import layers.
I created a new imds for training and used the same imds for calibration.
During calibration, it reported an error: "Cannot perform assignment with 0 elements on the right".

Are there any errors in the above steps?
Thank you.

Dor Rubin

Vaidehi Venkatesan

Frequently Asked Questions (April 2020)
Q: After I installed the support package, running the command dlquantizer(net) errors out in "coder.internal.getSupportedLayerTypes".
A: The quantization function requires a GPU and the GPU Coder Interface for Deep Learning Libraries:
Successful installation of both packages will eliminate the errors.

Q: What is INT8 Quantization and what is it for?
A: A technical article on the concepts can be found here:

Q: How can I start with a simple example?
A: You can try quantizing the squeezenet neural network after retraining it to classify new images, following the Train Deep Learning Network to Classify New Images example. The memory required for the network can be reduced by approximately 75% while the accuracy of the network stays almost the same.
Type >> help dlquantizer and find the documented examples to get started.

Q: My network object does not seem to be supported. Why?
A: Leave a comment here and we will contact you.


MATLAB Release Compatibility
Created with R2020a
Compatible with R2020a to R2021a
Platform Compatibility
Windows macOS Linux
