Example of using the self-attention layer in MATLAB R2023a

In MATLAB R2023a, the self-attention layer was introduced.
Can an example be provided of using it in image classification tasks?

2 Comments

Same question. Could there also be an example for time series forecasting? Thanks!


 Accepted Answer

Hi Mahmoud,
I understand that you want to use "selfAttentionLayer" for an image classification task in MATLAB.
A self-attention layer computes single-head or multihead self-attention of its input. For the following example, we will be using the "DigitDataset" in MATLAB.
% Load the digit dataset that ships with Deep Learning Toolbox
digitDatasetPath = fullfile(matlabroot, 'toolbox', 'nnet', 'nndemos', 'nndatasets', 'DigitDataset');
imds = imageDatastore(digitDatasetPath, ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
% Use 70% of the images in each class for training and the rest for validation
[imdsTrain, imdsValidation] = splitEachLabel(imds, 0.7, 'randomized');

% Define the network architecture
layers = [
    imageInputLayer([28 28 1], 'Name', 'input')
    convolution2dLayer(3, 32, 'Padding', 'same', 'Name', 'conv1')
    batchNormalizationLayer('Name', 'bn1')
    reluLayer('Name', 'relu1')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'maxpool1')
    convolution2dLayer(3, 64, 'Padding', 'same', 'Name', 'conv2')
    batchNormalizationLayer('Name', 'bn2')
    reluLayer('Name', 'relu2')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'maxpool2')
    flattenLayer('Name', 'flatten')
    selfAttentionLayer(8, 64, 'Name', 'self_attention')   % 8 heads, 64 key channels
    fullyConnectedLayer(10, 'Name', 'fc')                 % 10 digit classes
    softmaxLayer('Name', 'softmax')
    classificationLayer('Name', 'output')];

% Set the training options
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 0.01, ...
    'MaxEpochs', 5, ...
    'Shuffle', 'every-epoch', ...
    'ValidationData', imdsValidation, ...
    'ValidationFrequency', 30, ...
    'Verbose', false, ...
    'Plots', 'training-progress');

% Train the network
net = trainNetwork(imdsTrain, layers, options);
Training output: [training-progress plot]
In this code, the selfAttentionLayer is applied to features extracted from 28x28 grayscale images. Self-attention helps a model capture long-range dependencies in its input, meaning it can learn to relate different parts of the input to each other. Here the layer is placed after a series of convolutional and pooling layers to enrich the feature representation. Note that the flattenLayer collapses the spatial dimensions into the channel dimension, so in this architecture the attention layer receives a single feature vector per image; to attend across different regions of an image, the input to selfAttentionLayer would need to keep a spatial or sequence dimension.
To understand more about creating and training a simple convolutional neural network for deep learning classification, you can refer to the MATLAB documentation.

7 Comments

Dear Himanshu,
Many thanks for your help. Could we use self-attention with a 1-D input signal (sequenceInputLayer)?
Regards,
Mohamed
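A minimal sketch of one way this could look, assuming a sequence classification task. The channel count, class count, and layer sizes are hypothetical placeholders, and the network is wrapped in a dlnetwork only to check output sizes on random data:

```matlab
% Hypothetical sizes, chosen only for illustration
numChannels = 3;    % features per time step of the 1-D signal
numClasses  = 4;    % number of output classes

layers = [
    sequenceInputLayer(numChannels, 'Name', 'input')
    selfAttentionLayer(4, 32, 'Name', 'self_attention')   % 4 heads, 32 key channels
    globalAveragePooling1dLayer('Name', 'gap')            % average over time steps
    fullyConnectedLayer(numClasses, 'Name', 'fc')
    softmaxLayer('Name', 'softmax')];

% Quick shape check on random data: 8 sequences of 20 time steps each
net = dlnetwork(layers);
X = dlarray(rand(numChannels, 8, 20, 'single'), 'CBT');
Y = predict(net, X);
disp(size(Y))
```

To train with trainNetwork as in the accepted answer, classificationLayer would be appended to layers instead of creating a dlnetwork.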
A self-attention layer computes single-head or multihead self-attention of its input.
The layer:
  1. Computes the queries, keys, and values from the input
  2. Computes the scaled dot-product attention across heads using the queries, keys, and values
  3. Merges the results from the heads
  4. Performs a linear transformation on the merged result
I wonder whether the layer also applies the scaling (i.e., dividing Q*K' by sqrt(dim)) and the softmax. My understanding is that both should happen within step 2.
Please clarify this for me and for other users.
Thanks.
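For reference, scaled dot-product attention as it is usually defined includes both the scaling and the softmax: softmax(Q*K'/sqrt(d))*V. The following is a sketch of that textbook computation for a single head on plain matrices. It is not the internal implementation of selfAttentionLayer, and all sizes and weight matrices are made up for illustration:

```matlab
% Single-head scaled dot-product self-attention on plain matrices.
% All sizes and weights are hypothetical; rows of X are tokens.
rng(0);
numTokens = 5;      % sequence length
dModel    = 8;      % input channels per token
dKey      = 4;      % key/query/value channels

X  = randn(numTokens, dModel);
Wq = randn(dModel, dKey);
Wk = randn(dModel, dKey);
Wv = randn(dModel, dKey);
Wo = randn(dKey, dModel);

% Step 1: queries, keys, and values from the input
Q = X * Wq;
K = X * Wk;
V = X * Wv;

% Step 2: scale the dot products, then apply a row-wise softmax
scores  = (Q * K.') / sqrt(dKey);
scores  = scores - max(scores, [], 2);            % for numerical stability
weights = exp(scores) ./ sum(exp(scores), 2);     % each row sums to 1
A = weights * V;

% Steps 3 and 4: with a single head there is nothing to merge,
% so only the final linear transformation remains
Y = A * Wo;
disp(size(Y))
```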
Can an example be provided of using it with time-sequence data, that is, to select the more valuable features from the input? Also, how can it be used with an LSTM?
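One possible arrangement, sketched under assumptions: an LSTM that outputs the full sequence, followed by a self-attention layer, for a sequence-to-one regression. The feature count, hidden size, and response count are hypothetical, and the attention here weights time steps of the LSTM output rather than selecting input features directly:

```matlab
% Hypothetical sizes, chosen only for illustration
numFeatures  = 6;    % input features per time step
numHidden    = 64;   % LSTM hidden units
numResponses = 1;    % one predicted value per sequence

layers = [
    sequenceInputLayer(numFeatures, 'Name', 'input')
    lstmLayer(numHidden, 'OutputMode', 'sequence', 'Name', 'lstm')
    selfAttentionLayer(4, numHidden, 'Name', 'self_attention')   % 4 heads
    globalAveragePooling1dLayer('Name', 'gap')                   % average over time steps
    fullyConnectedLayer(numResponses, 'Name', 'fc')];

% Quick shape check on random data: 8 sequences of 30 time steps each
net = dlnetwork(layers);
X = dlarray(rand(numFeatures, 8, 30, 'single'), 'CBT');
Y = predict(net, X);
disp(size(Y))
```

To train with trainNetwork, regressionLayer would be appended to layers instead of creating a dlnetwork.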
@Muhammad Shoaib, @Himanshu I have tried to use selfAttentionLayer with time-sequence data in R2023b, but it failed. Please see the following link. Are there any ideas?
Posted as a comment-as-flag by chang gao:
Useful answer.
For time series data, you could take a look at this recent blog post and GitHub repo. That uses a transformer network containing selfAttentionLayer for time series prediction. The use case there is finance, but the DL techniques would be generally applicable.
This answer is very helpful to me, but if it is an RGB image, how should I adjust the program? Can you give me some guidance?
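A sketch of the likely adjustment, assuming the rest of the pipeline stays as in the accepted answer: the third element of the input size becomes 3 for the color channels. The image size and class count below are hypothetical, and the network is shortened to one convolution block to keep the example brief:

```matlab
% Hypothetical sizes, chosen only for illustration
inputSize  = [64 64 3];   % height, width, and 3 color channels
numClasses = 10;          % number of classes in the data set

layers = [
    imageInputLayer(inputSize, 'Name', 'input')
    convolution2dLayer(3, 32, 'Padding', 'same', 'Name', 'conv1')
    batchNormalizationLayer('Name', 'bn1')
    reluLayer('Name', 'relu1')
    maxPooling2dLayer(2, 'Stride', 2, 'Name', 'maxpool1')
    flattenLayer('Name', 'flatten')
    selfAttentionLayer(8, 64, 'Name', 'self_attention')
    fullyConnectedLayer(numClasses, 'Name', 'fc')
    softmaxLayer('Name', 'softmax')
    classificationLayer('Name', 'output')];

% If the images on disk have other sizes, they can be resized on the fly, e.g.:
% augTrain = augmentedImageDatastore(inputSize(1:2), imdsTrain);
```

The convolution layers infer their input channel count from the input layer, so only the input size and the number of classes need to match the data.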


Asked: 21 Mar 2023

Commented: 25 May 2024
