This example shows how to create a custom training experiment to train a Siamese network that identifies similar images of handwritten characters. For a custom training experiment, you explicitly define the training procedure used by Experiment Manager. In this example, you implement a custom training loop to train a Siamese network, a type of deep learning network that uses two or more identical subnetworks that have the same architecture and share the same parameters and weights. Some common applications for Siamese networks include facial recognition, signature verification, and paraphrase identification.
This diagram illustrates the Siamese network architecture in this example.
To compare two images, you pass each image through one of two identical subnetworks that share weights. The subnetworks convert each 105-by-105-by-1 image to a 4096-dimensional feature vector. Images of the same class have similar 4096-dimensional representations. The output feature vectors from the two subnetworks are combined through subtraction, and the result is passed through a fullyconnect operation with a single output. A sigmoid operation converts this value to a probability indicating that the images are similar (when the probability is close to 1) or dissimilar (when the probability is close to 0). During training, the network is updated using the binary cross-entropy loss between the network prediction and the true label. For more information, see Train a Siamese Network to Compare Images.
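The comparison step described above can be sketched in a few lines. This is a minimal illustration, not the code used by the experiment: the random feature vectors stand in for the subnetwork outputs, the variable names are hypothetical, and taking the absolute value of the difference is an assumption about how the subtraction is applied.

```matlab
% Hypothetical stand-ins for the 4096-dimensional feature vectors
% produced by the two subnetworks for a batch of 8 image pairs.
numPairs = 8;
F1 = dlarray(rand(4096, numPairs, 'single'), 'CB');
F2 = dlarray(rand(4096, numPairs, 'single'), 'CB');

% Parameters of the final fullyconnect operation (single output).
fcWeights = dlarray(0.01 * randn(1, 4096, 'single'));
fcBias = dlarray(zeros(1, 1, 'single'));

% Combine the feature vectors through subtraction.
% Assumption: the absolute difference is used.
D = abs(F1 - F2);

% Map the difference to one value per pair, then to a probability.
Z = fullyconnect(D, fcWeights, fcBias);
P = sigmoid(Z);

% Binary cross-entropy between predictions and labels
% (1 = similar pair, 0 = dissimilar pair).
T = dlarray(single(randi([0 1], 1, numPairs)), 'CB');
loss = crossentropy(P, T, 'TargetCategories', 'independent');
```

Each entry of P is a probability between 0 and 1 for the corresponding pair, and loss is a scalar that a training loop can differentiate.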
First, open the example. Experiment Manager loads a project with a preconfigured experiment that you can inspect and run. To open the experiment, in the Experiment Browser pane, double-click the name of the experiment.
Custom training experiments consist of a description, a table of hyperparameters, and a training function. For more information, see Configure Custom Training Experiment.
The Description field contains a textual description of the experiment. For this example, the description is:
Train a Siamese network to identify similar and dissimilar images of handwritten characters. Try different weight and bias initializers for the convolution and fully connected layers in the network.
The Hyperparameters section specifies the hyperparameter values to use for the experiment. When you run the experiment, Experiment Manager trains the network using every combination of hyperparameter values specified in the hyperparameter table. The hyperparameters in this example, including BiasInitializer, specify the weight and bias initializers for the convolution and fully connected layers in each subnetwork. For more information about these initializers, see
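To make "every combination of hyperparameter values" concrete, the following sketch enumerates the trials of an exhaustive sweep over two hyperparameters. The value lists and the name WeightsInitializer are assumptions made for illustration; the actual entries come from the hyperparameter table in the experiment.

```matlab
% Hypothetical hyperparameter values (not taken from the experiment).
weightsInitializers = ["glorot" "he" "narrow-normal"];
biasInitializers = ["zeros" "ones" "narrow-normal"];

% Build every (weights, bias) index combination.
[w, b] = ndgrid(1:numel(weightsInitializers), 1:numel(biasInitializers));

% One row per trial.
trials = table( ...
    reshape(weightsInitializers(w), [], 1), ...
    reshape(biasInitializers(b), [], 1), ...
    'VariableNames', {'WeightsInitializer', 'BiasInitializer'});

disp(trials)
```

With three values for each hyperparameter, the sweep contains nine trials.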
The Training Function specifies the training data, network architecture, training options, and training procedure used by the experiment. To inspect the training function, under Training Function, click Edit. The training function opens in MATLAB® Editor.
The input to the training function is a structure with fields from the hyperparameter table and an object that you can use to track the progress of the training, record values of the metrics used by the training, and produce training plots. The training function returns a structure that contains the trained network, the weights for the final fullyconnect operation for the network, and the execution environment used for training. Experiment Manager saves this output, so you can export it to the MATLAB workspace when the training is complete. The training function has five sections.
Initialize Output sets the initial value of the network and fullyconnect weights to empty arrays to indicate that the training has not started. The experiment sets the execution environment to "auto", so it trains and validates the network on a GPU if one is available. Using a GPU requires Parallel Computing Toolbox™ and a supported GPU device. For more information, see GPU Support by Release (Parallel Computing Toolbox).
Load and Preprocess Training and Test Data defines the training and test data for the experiment as imageDatastore objects. The experiment uses the Omniglot data set, which consists of character sets for 50 alphabets, divided into 30 sets for training and 20 sets for testing. For more information on this data set, see Image Data Sets.
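One way to prepare such a datastore is to label each image by its alphabet and character folders. The sketch below assumes that layout and a hypothetical download location. If the folder does not exist, it falls back to a few placeholder file paths so that the label logic can still be followed.

```matlab
% Hypothetical location of the Omniglot training images.
dataFolder = fullfile(tempdir, "images_background");

if isfolder(dataFolder)
    % Real data: collect every image under the alphabet/character folders.
    imds = imageDatastore(dataFolder, ...
        'IncludeSubfolders', true, 'LabelSource', 'none');
    files = string(imds.Files);
else
    % Placeholder paths with the same alphabet/character/file layout.
    imds = [];
    files = [ ...
        fullfile(dataFolder, "Latin", "character01", "a.png")
        fullfile(dataFolder, "Latin", "character01", "b.png")
        fullfile(dataFolder, "Greek", "character02", "c.png")];
end

% Label each image as "<alphabet>-<character>" using its two parent folders.
parts = split(files, filesep);
labels = categorical(join(parts(:, end-2:end-1), "-", 2));

if ~isempty(imds)
    imds.Labels = labels;
end
```

Two images with the same label then form a similar pair, and two images with different labels form a dissimilar pair.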
Define Network Architecture defines the architecture for two identical subnetworks that accept 105-by-105-by-1 images and output a feature vector. The convolution and fully connected layers use the weights and bias initializers specified in the hyperparameter table. To train the network with a custom training loop and enable automatic differentiation, the training function converts the layer graph to a dlnetwork object. The weights for the final fullyconnect operation are initialized by sampling from a narrow normal distribution with a standard deviation of 0.01.
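A subnetwork of this kind might be defined as follows. Only the 105-by-105-by-1 input, the 4096-dimensional output, the use of the initializers, the dlnetwork conversion, and the 0.01 standard deviation come from the description above. The filter sizes, filter counts, layer names, initializer values, and the fcParams structure are assumptions chosen for illustration.

```matlab
% Hypothetical hyperparameter values for one trial.
weightsInitializer = 'he';
biasInitializer = 'zeros';

% Helper that applies the chosen initializers to a convolution layer.
convLayer = @(filterSize, numFilters, name) convolution2dLayer( ...
    filterSize, numFilters, 'Name', name, ...
    'WeightsInitializer', weightsInitializer, ...
    'BiasInitializer', biasInitializer);

% Assumed layer sizes; the final layer outputs a 4096-element feature vector.
layers = [
    imageInputLayer([105 105 1], 'Name', 'input', 'Normalization', 'none')
    convLayer(10, 64, 'conv1')
    reluLayer('Name', 'relu1')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'pool1')
    convLayer(7, 128, 'conv2')
    reluLayer('Name', 'relu2')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'pool2')
    convLayer(4, 128, 'conv3')
    reluLayer('Name', 'relu3')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'pool3')
    convLayer(5, 256, 'conv4')
    reluLayer('Name', 'relu4')
    fullyConnectedLayer(4096, 'Name', 'fc1', ...
        'WeightsInitializer', weightsInitializer, ...
        'BiasInitializer', biasInitializer)];

% Convert the layer graph to a dlnetwork for automatic differentiation.
lgraph = layerGraph(layers);
net = dlnetwork(lgraph);

% Parameters of the final fullyconnect operation, sampled from a
% narrow normal distribution with standard deviation 0.01.
fcParams.FcWeights = dlarray(0.01 * randn(1, 4096));
fcParams.FcBias = dlarray(0.01 * randn(1, 1));
```

Because both images in a pair pass through the same net object, the two subnetworks share their weights by construction.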
Specify Training Options defines the training options used by the experiment. In this example, Experiment Manager trains the network with a mini-batch size of 180 for 1000 iterations, computing the accuracy of the network every 100 iterations. Training can take some time to run. For better results, consider increasing the training to 10,000 iterations.
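In a training function, these options are typically plain variables. The sketch below uses the values stated above. The variable names and the learning rate are assumptions, because the text does not specify the optimizer settings.

```matlab
% Values stated for this experiment.
numIterations = 1000;
miniBatchSize = 180;
validationFrequency = 100;

% Assumed optimizer setting (not specified in the text above).
learnRate = 6e-5;

% Iterations at which the validation accuracy is computed.
validationIterations = validationFrequency:validationFrequency:numIterations;

% Total number of image pairs processed during training.
numPairsSeen = numIterations * miniBatchSize;
```

With these values, validation runs 10 times and training processes 180,000 image pairs. Increasing numIterations to 10000 multiplies both numbers by ten.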
Train Model defines the custom training loop used by the experiment. For each iteration, the custom training loop extracts a batch of image pairs and labels, converts the data to dlarray objects with underlying type single, and specifies the dimension labels 'SSCB' (spatial, spatial, channel, batch) for the image data and 'CB' (channel, batch) for the labels. If you train on a GPU, the data is converted to gpuArray objects. Then, the training function evaluates the model gradients and updates the network parameters. To validate, the training function creates a set of five random mini-batches of test pairs, evaluates the network predictions, and calculates the average accuracy over the mini-batches. After each iteration of the custom training loop, the training function saves the trained network and the weights for the fullyconnect operation, records the training loss, and updates the training progress.
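The following script is a reduced sketch of such a loop, under several assumptions: a tiny stand-in subnetwork, random image pairs instead of Omniglot data, the Adam optimizer, and hypothetical helper names (randomPairBatch, pairProbability, modelLoss). It omits the Experiment Manager progress and metric updates and only stores the loss in an array. Save it as a script file so that the local functions at the end are available.

```matlab
% Tiny stand-in subnetwork (assumption) so that the sketch runs quickly.
layers = [
    imageInputLayer([105 105 1], 'Name', 'input', 'Normalization', 'none')
    convolution2dLayer(10, 8, 'Stride', 5, 'Name', 'conv')
    reluLayer('Name', 'relu')
    fullyConnectedLayer(16, 'Name', 'fc')];
net = dlnetwork(layerGraph(layers));
fcParams.FcWeights = dlarray(0.01 * randn(1, 16, 'single'));
fcParams.FcBias = dlarray(0.01 * randn(1, 1, 'single'));

numIterations = 3;   % the experiment uses 1000
miniBatchSize = 4;   % the experiment uses 180
learnRate = 1e-3;    % assumed value
avgSubnet = []; avgSqSubnet = []; avgParams = []; avgSqParams = [];
lossHistory = zeros(1, numIterations);

for iteration = 1:numIterations
    [X1, X2, T] = randomPairBatch(miniBatchSize);
    % Evaluate the loss and gradients with automatic differentiation.
    [loss, gradSubnet, gradParams] = dlfeval(@modelLoss, net, fcParams, X1, X2, T);
    % Update the shared subnetwork and the fullyconnect parameters (Adam).
    [net, avgSubnet, avgSqSubnet] = adamupdate(net, gradSubnet, ...
        avgSubnet, avgSqSubnet, iteration, learnRate);
    [fcParams, avgParams, avgSqParams] = adamupdate(fcParams, gradParams, ...
        avgParams, avgSqParams, iteration, learnRate);
    lossHistory(iteration) = gather(extractdata(loss));
end

% Validation: average accuracy over five random mini-batches.
accuracy = zeros(1, 5);
for k = 1:5
    [X1, X2, T] = randomPairBatch(miniBatchSize);
    Y = pairProbability(net, fcParams, X1, X2, false);
    predicted = gather(extractdata(Y)) >= 0.5;
    actual = gather(extractdata(T)) == 1;
    accuracy(k) = mean(predicted == actual);
end
validationAccuracy = mean(accuracy);

function [X1, X2, T] = randomPairBatch(miniBatchSize)
    % Random stand-in data: single-precision images labeled 'SSCB'
    % and pair labels labeled 'CB'.
    X1 = dlarray(rand(105, 105, 1, miniBatchSize, 'single'), 'SSCB');
    X2 = dlarray(rand(105, 105, 1, miniBatchSize, 'single'), 'SSCB');
    T = dlarray(single(randi([0 1], 1, miniBatchSize)), 'CB');
    if canUseGPU
        X1 = gpuArray(X1); X2 = gpuArray(X2); T = gpuArray(T);
    end
end

function Y = pairProbability(net, fcParams, X1, X2, isTraining)
    % Pass both images through the same subnetwork (shared weights).
    if isTraining
        F1 = forward(net, X1); F2 = forward(net, X2);
    else
        F1 = predict(net, X1); F2 = predict(net, X2);
    end
    % Assumption: the absolute difference combines the feature vectors.
    Y = sigmoid(fullyconnect(abs(F1 - F2), fcParams.FcWeights, fcParams.FcBias));
end

function [loss, gradSubnet, gradParams] = modelLoss(net, fcParams, X1, X2, T)
    Y = pairProbability(net, fcParams, X1, X2, true);
    loss = crossentropy(Y, T, 'TargetCategories', 'independent');
    [gradSubnet, gradParams] = dlgradient(loss, net.Learnables, fcParams);
end
```

On random data the accuracy stays near chance, so the sketch only demonstrates the mechanics of the loop: the gradient evaluation through dlfeval, the two parameter updates, and the averaged validation accuracy.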
When you run the experiment, Experiment Manager trains the network defined by the training function multiple times. Each trial uses a different combination of hyperparameter values. By default, Experiment Manager runs one trial at a time. If you have Parallel Computing Toolbox, you can run multiple trials at the same time. For best results, before you run your experiment, start a parallel pool with as many workers as GPUs. For more information, see Use Experiment Manager to Train Networks in Parallel.
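A parallel pool sized to the number of GPUs can be started from the MATLAB Command Window before you run the experiment. This is a minimal sketch that assumes Parallel Computing Toolbox is installed and uses the default cluster profile. The guard conditions are an addition for illustration.

```matlab
% Count the GPUs that are available for computation.
numGPUs = gpuDeviceCount("available");

% Start one worker per GPU unless a pool is already running.
if numGPUs > 0 && isempty(gcp('nocreate'))
    parpool(numGPUs);
end
```

If no GPU is available, the sketch leaves the pool untouched and Experiment Manager starts a default pool when you select Use Parallel.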
To run one trial of the experiment at a time, on the Experiment Manager toolstrip, click Run.
To run multiple trials at the same time, click Use Parallel and then Run. If there is no current parallel pool, Experiment Manager starts one using the default cluster profile. Experiment Manager then executes multiple simultaneous trials, depending on the number of parallel workers available.
A table of results displays the training loss and validation accuracy for each trial.
While the experiment is running, click Training Plot to display the training plot and track the progress of each trial.
To find the best result for your experiment, sort the table of results by validation accuracy.
Point to the ValidationAccuracy column.
Click the triangle icon.
Select Sort in Descending Order.
The trial with the highest validation accuracy appears at the top of the results table.
To visually check if the network correctly identifies similar and dissimilar pairs:
Select the trial with the highest accuracy.
On the Experiment Manager toolstrip, click Export.
In the dialog window, enter the name of a workspace variable for the exported training output. The default name is
Test the network on a small batch of image pairs by calling the displayTestSet function. Use the exported training output as the input to the function. For instance, in the MATLAB Command Window, enter:
The function displays 10 randomly selected pairs of test images with the prediction from the trained network, the probability score, and a label indicating whether the prediction is correct or incorrect.
To record observations about the results of your experiment, add an annotation.
In the results table, right-click the ValidationAccuracy cell of the best trial.
Select Add Annotation.
In the Annotations pane, enter your observations in the text box.
For more information, see Sort, Filter, and Annotate Experiment Results.
In the Experiment Browser pane, right-click the name of the project and select Close Project. Experiment Manager closes all of the experiments and results contained in the project.