Deep Learning Network Composition

To create a custom layer that itself defines a neural network, you can declare a dlnetwork object as a learnable parameter in the properties (Learnable) section of the layer definition. This method is known as network composition. You can use network composition to:

Create a network with control flow, for example, a network with a section that can dynamically change depending on the input data.
Create a network with loops, for example, a network with sections that feed the output back into itself.
Implement weight sharing, for example, in networks where different data needs to pass through the same layers such as twin neural networks or generative adversarial networks (GANs).

For nested networks that have both learnable and state parameters, for example, networks with batch normalization or LSTM layers, declare the network in the properties (Learnable, State) section of the layer definition.

For an example showing how to define a custom layer containing a learnable dlnetwork object, see Define Nested Deep Learning Layer Using Network Composition.

For an example showing how to train a network with nested layers, see Train Network with Custom Nested Layers.

To create a single layer that represents a block of layers, for example, a residual block, use a networkLayer. Network layers simplify building and editing large networks or networks with repeating components. For more information, see Create and Train Network with Nested Layers.

Automatically Initialize Learnable `dlnetwork` Objects for Training

You can create a custom layer and allow the software to automatically initialize the learnable parameters of any nested dlnetwork objects after the parent network is fully constructed. Automatic initialization of the nested network means that you do not need to keep track of the size and shape of the inputs passed to each custom layer containing a nested dlnetwork object.

To use the predict and forward functions for dlnetwork objects, the input data must be formatted dlarray objects. To ensure that the software passes formatted dlarray objects to the layer functions, include the Formattable class in the class definition.

classdef myLayer < nnet.layer.Layer & nnet.layer.Formattable
    ...
end

To take advantage of automatic initialization, you must specify that the constructor function creates an uninitialized dlnetwork object. The dlnetwork function with no input arguments creates an uninitialized, empty neural network. When you add layers to an uninitialized dlnetwork object, the network remains uninitialized.

function layer = myLayer

    % Initialize layer properties.
    ...

    % Define network.
    net = dlnetwork;

    % Add and connect network layers.
    ...

    layer.Network = net;
end
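As a minimal sketch, the class declaration and constructor outlined above could be combined into one class file as follows. The layer description, the nested architecture, and the hidden size of 16 are hypothetical choices for illustration. The sketch assumes a release in which addLayers accepts an uninitialized dlnetwork object.

```matlab
classdef myLayer < nnet.layer.Layer & nnet.layer.Formattable
    % Sketch of a custom layer that wraps a nested network.

    properties (Learnable)
        % Nested network. It stays uninitialized until the parent
        % network is initialized.
        Network
    end

    methods
        function layer = myLayer
            % Initialize layer properties (hypothetical description).
            layer.Description = "Nested fully connected block";

            % Hypothetical nested architecture with 16 hidden units.
            layers = [
                fullyConnectedLayer(16)
                reluLayer
                fullyConnectedLayer(16)];

            % dlnetwork with no input arguments creates an
            % uninitialized, empty network.
            net = dlnetwork;

            % Layers in an array are connected sequentially.
            net = addLayers(net,layers);

            layer.Network = net;
        end

        function Y = predict(layer,X)
            % Forward the formatted dlarray through the nested network.
            Y = predict(layer.Network,X);
        end
    end
end
```

Because no input size appears anywhere in the constructor, the same layer can be placed after inputs of different sizes.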

When the parent network is initialized, the learnable parameters of any nested dlnetwork objects are initialized at the same time. The size of the learnable parameters depends on the size of the input data of the custom layer. The software propagates the data through the nested network and automatically initializes the parameters according to the propagated sizes and the initialization properties of the layers of the nested network.

If the parent network is trained using the trainnet function, then any nested dlnetwork objects are initialized when you call the function. If the parent network is a dlnetwork, then any nested dlnetwork objects are initialized when the parent network is constructed (if the parent dlnetwork is initialized at construction) or when you use the initialize function with the parent network (if the parent dlnetwork is not initialized at construction).
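The two initialization paths for a dlnetwork parent can be sketched as follows. This sketch assumes that a myLayer class with a no-argument constructor is on the MATLAB path. The input size and the surrounding layers are hypothetical.

```matlab
% Hypothetical parent architecture that contains the custom layer.
layers = [
    featureInputLayer(10)
    myLayer
    fullyConnectedLayer(3)
    softmaxLayer];

% Path 1: the parent is initialized at construction, so the nested
% network is initialized at the same time.
netAtConstruction = dlnetwork(layers);

% Path 2: the parent is created uninitialized, and the nested network
% is initialized when initialize is called on the parent.
netDeferred = dlnetwork(layers,Initialize=false);
wasInitialized = netDeferred.Initialized;
netDeferred = initialize(netDeferred);
```

The Initialized property of each dlnetwork object reports which state the network is in.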

If you do not want to use automatic initialization, you can construct the custom layer with the nested network already initialized. In this case, the nested network is initialized before the parent network. To initialize the nested network at construction, you must manually specify the size of any inputs to the nested network. You can do so either by using input layers or by providing example inputs to the dlnetwork constructor function. Because you must specify the sizes of any inputs to the dlnetwork object, you might need to specify input sizes when you create the layer. For help determining the size of the inputs to the layer, you can use the analyzeNetwork function and check the size of the activations of the previous layers.
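The two ways of specifying input sizes at construction might look like the following sketch. The input size of 10, the layer sizes, and the variable names are hypothetical.

```matlab
% Hypothetical input: 10 features ("C") and a batch of 1 ("B").
inputSize = 10;
X = dlarray(zeros(inputSize,1,"single"),"CB");

% Hypothetical nested architecture without an input layer.
layers = [
    fullyConnectedLayer(16)
    reluLayer];

% Option 1: provide an example input to the dlnetwork constructor.
netFromExample = dlnetwork(layers,X);

% Option 2: prepend an input layer that fixes the input size.
netFromInputLayer = dlnetwork([featureInputLayer(inputSize); layers]);
```

In both cases the network is initialized as soon as it is constructed, so the layer constructor needs to know inputSize in advance.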

Predict and Forward Functions

Some layers behave differently during training and during prediction. For example, a dropout layer performs dropout only during training and has no effect during prediction. A layer uses one of two functions to perform a forward pass: predict or forward. If the forward pass is at prediction time, then the layer uses the predict function. If the forward pass is at training time, then the layer uses the forward function. If you do not require two different functions for prediction time and training time, then you can omit the forward function. When you do so, the layer uses predict at training time.

When implementing the predict and the forward functions of the custom layer, to ensure that the layers in the dlnetwork object behave in the correct way, use the predict and forward functions for dlnetwork objects, respectively.

Custom layers with learnable dlnetwork objects do not support custom backward functions.

This example code shows how to use the predict and forward functions with a nested dlnetwork object that has no state parameters.

function Y = predict(layer,X)
    % Predict using network.
    net = layer.Network;
    Y = predict(net,X);
end

function Y = forward(layer,X)
    % Forward pass using network.
    net = layer.Network;
    Y = forward(net,X);
end

This example code shows how to use the predict and forward functions with dlnetwork objects that have state parameters.

function [Y,state] = predict(layer,X)
    % Predict using network.
    net = layer.Network;
    [Y,state] = predict(net,X);
end

function [Y,state] = forward(layer,X)
    % Forward pass using network.
    net = layer.Network;
    [Y,state] = forward(net,X);
end

If the dlnetwork object does not behave differently during training and prediction, then you can omit the forward function. In this case, the software uses the predict function during training.

GPU Compatibility

If the layer forward functions fully support dlarray objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type gpuArray (Parallel Computing Toolbox).

Many MATLAB® built-in functions support gpuArray (Parallel Computing Toolbox) and dlarray input arguments. For a list of functions that support dlarray objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).
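One informal way to check that a layer with a nested network runs on a GPU is to pass GPU data through a small parent network, as in this sketch. It assumes that a myLayer class with a no-argument constructor is on the MATLAB path. The input size and batch size are hypothetical.

```matlab
% Hypothetical parent network that contains only the custom layer.
net = dlnetwork([featureInputLayer(10); myLayer]);

% Formatted input: 10 features ("C") and 4 observations ("B").
X = dlarray(rand(10,4,"single"),"CB");

% Move the data to the GPU only if a supported device is available.
if canUseGPU
    X = gpuArray(X);
end

Y = predict(net,X);

% True if the output data is stored on the GPU.
onGPU = isgpuarray(extractdata(Y));
```

If the layer functions use only operations that support dlarray objects, onGPU is expected to match the result of canUseGPU.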