Reinforcement Learning Toolbox

Design and train policies using reinforcement learning


Reinforcement Learning Toolbox™ provides functions and blocks for training policies using reinforcement learning algorithms including DQN, A2C, and DDPG. You can use these policies to implement controllers and decision-making algorithms for complex systems such as robots and autonomous systems. You can implement the policies using deep neural networks, polynomials, or look-up tables.

The toolbox lets you train policies by enabling them to interact with environments represented by MATLAB® or Simulink® models. You can evaluate algorithms, experiment with hyperparameter settings, and monitor training progress. To improve training performance, you can run simulations in parallel on the cloud, computer clusters, and GPUs (with Parallel Computing Toolbox™ and MATLAB Parallel Server™).

You can import existing policies from deep learning frameworks such as TensorFlow™ Keras and PyTorch through the ONNX™ model format (with Deep Learning Toolbox™). You can also generate optimized C, C++, and CUDA code to deploy trained policies on microcontrollers and GPUs.

The toolbox includes reference examples for using reinforcement learning to design controllers for robotics and automated driving applications.

Reinforcement Learning Agents

Implement MATLAB and Simulink agents to train policies represented by deep neural networks. Use built-in and custom reinforcement learning algorithms.

Reinforcement Learning Algorithms

Implement agents using Deep Q-Network (DQN), Advantage Actor-Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and other built-in algorithms. Use templates to implement custom agents for training policies.

Agents comprise a policy and an algorithm.
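
To make the split between policy and algorithm concrete, here is a minimal conceptual sketch in Python. It is not toolbox code: the class `QTableAgent` and its methods are hypothetical names chosen for illustration, the policy is a look-up table, and the algorithm is a plain tabular Q-learning update rather than any of the built-in agents.

```python
import random


class QTableAgent:
    """Hypothetical agent: a look-up-table policy plus a Q-learning algorithm."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]  # the look-up table
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def policy(self, state):
        """Policy: map an observation to the action with the highest Q-value."""
        row = self.q[state]
        return row.index(max(row))

    def act(self, state):
        """Exploration during training: random action with probability epsilon."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        return self.policy(state)

    def learn(self, state, action, reward, next_state, done):
        """Algorithm: one Q-learning update of the table entry for (state, action)."""
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


agent = QTableAgent(n_states=2, n_actions=2)
# One made-up transition: in state 0, action 1 earns reward 1.0 and ends the episode.
agent.learn(state=0, action=1, reward=1.0, next_state=1, done=True)
print(agent.policy(0))  # prints 1, because the update raised Q[0][1] to 0.5
```

Only `policy` is needed once training is finished; `act` and `learn` belong to the training algorithm, which is why a trained policy can be deployed on its own.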

Policy and Value Function Representation Using Deep Neural Networks

Use deep neural network policies for complex systems with large state-action spaces. Define policies using networks and architectures from Deep Learning Toolbox. Import ONNX models for interoperability with other deep learning frameworks.

Simulink Blocks for Agents

Implement and train reinforcement learning agents in Simulink.

Reinforcement Learning Agent block for Simulink.

Environment Modeling

Create MATLAB and Simulink environment models. Describe system dynamics and provide observation and reward signals for training agents.

Simulink and Simscape Environments

Use Simulink and Simscape™ models to represent an environment. Specify the observation, action, and reward signals within the model.

Simulink environment model for an inverted pendulum.

MATLAB Environments

Use MATLAB functions and classes to represent an environment. Specify observation, action, and reward variables within the MATLAB file.

MATLAB environment for cart-pole system.
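
The idea of an environment that accepts actions and returns observations and rewards can be sketched as follows. This is a conceptual Python illustration, not the toolbox interface: `CorridorEnv`, `reset`, and `step` are hypothetical names, and the toy corridor is deliberately simpler than a cart-pole system.

```python
class CorridorEnv:
    """Hypothetical toy environment: walk along a corridor to reach the last cell."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right); return (observation, reward, done)."""
        if action not in (0, 1):
            raise ValueError("action must be 0 (left) or 1 (right)")
        move = 1 if action == 1 else -1
        # Walls at both ends: the position is clamped to the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        reward = 1.0 if reached_goal else -0.1  # small cost per step, bonus at the goal
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done


env = CorridorEnv(length=3)
obs = env.reset()
trajectory = [env.step(1), env.step(1)]
print(trajectory)  # [(1, -0.1, False), (2, 1.0, True)]
```

The observation here is the cell index, the action is a left or right move, and the reward is computed inside `step`, mirroring the three quantities an environment has to specify.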

Accelerating Training

Speed up training using GPU, cloud, and distributed computing resources.

Distributed Computing and Multicore Acceleration

Speed up training by running parallel simulations on multicore computers, cloud resources, or compute clusters using Parallel Computing Toolbox and MATLAB Parallel Server.

Speed up training using parallel computing.
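
The pattern behind parallel simulation is that episodes are independent, so they can be handed to a pool of workers and the results collected afterwards. The Python sketch below illustrates only that pattern and is not toolbox code: `run_episode` is a hypothetical stand-in for one simulation, and a thread pool is used to keep the example short and portable, whereas CPU-bound simulations would normally run on separate processes or cluster workers.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_episode(seed, length=6, max_steps=50):
    """Simulate one episode of a toy random walk and return its total reward."""
    rng = random.Random(seed)        # per-episode generator keeps runs independent
    position, total_reward = 0, 0.0
    for _ in range(max_steps):
        move = rng.choice((-1, 1))   # stand-in for a policy choosing left or right
        position = min(max(position + move, 0), length - 1)
        if position == length - 1:   # goal reached: bonus reward, episode ends
            return total_reward + 1.0
        total_reward -= 0.1          # small cost for every non-terminal step
    return total_reward


seeds = range(8)
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in the same order as the seeds.
    parallel_returns = list(pool.map(run_episode, seeds))

sequential_returns = [run_episode(s) for s in seeds]
print(parallel_returns == sequential_returns)  # True: same episodes, different scheduling
```

Seeding each episode separately is the design choice that makes the parallel and sequential runs agree, which helps when comparing training results across different hardware setups.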

GPU Acceleration

Speed up deep neural network training and inference with high-performance NVIDIA® GPUs. Use MATLAB with Parallel Computing Toolbox and most CUDA®-enabled NVIDIA GPUs that have compute capability 3.0 or higher.

Accelerate training using GPUs.

Code Generation and Deployment

Deploy trained policies to embedded devices or integrate them with a wide range of production systems.

Code Generation

Use GPU Coder™ to generate optimized CUDA code from MATLAB code representing trained policies. Use MATLAB Coder™ to generate C/C++ code to deploy policies.

Generate CUDA code using GPU Coder.

MATLAB Compiler Support

Use MATLAB Compiler™ and MATLAB Compiler SDK™ to deploy trained policies as C/C++ shared libraries, Microsoft® .NET assemblies, Java® classes, and Python® packages.

Package and share policies as standalone programs.

Reference Examples

Design controllers using reinforcement learning for robots, self-driving cars, and other systems.

Getting Started

Implement controllers based on reinforcement learning for problems such as balancing an inverted pendulum, navigating a grid world, and balancing a cart-pole system.

Solving a grid world maze.

Automated Driving Applications

Design controllers for adaptive cruise control and lane keeping assist systems.

Training a lane keeping assist system.

Latest Features

Multi-Agent Reinforcement Learning

Train multiple agents simultaneously in a Simulink environment

Soft Actor-Critic Agent

Train sample-efficient policies for environments with continuous action spaces using increased exploration

Default Agents

Avoid manually formulating policies by creating agents with default neural network structure

See the release notes for details on any of these features and corresponding functions.

Reinforcement Learning Video Series

Watch the videos in this series to learn about reinforcement learning.