LSTM network combined with BiLSTM
Hi everyone,
I'm working on an LSTM network for sequence regression. I have a large set of time sequences with 24 features as inputs and 3 as outputs. The input is entirely numeric, and the sequences vary in length from 10 steps to more than 1000. Right now, the architecture is pretty simple:
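For reference, a minimal sketch of such a baseline (the hidden-layer sizes are the same as in the code further below; 'Normalization','zscore' is an assumption):

```matlab
% Baseline: plain sequence-to-sequence LSTM regression
numInput  = 24;   % number of input features
numOutput = 3;    % number of responses
layers = [
    sequenceInputLayer(numInput,'Normalization','zscore','Name','input')
    lstmLayer(128,'OutputMode','sequence','Name','lstm')
    fullyConnectedLayer(32,'Name','fc1')
    fullyConnectedLayer(numOutput,'Name','fcOut')
    regressionLayer('Name','output')
    ];
```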

To improve performance, and given how the system works, it would make sense for one of the inputs to be processed by a BiLSTM layer, since that input can be known in advance, as shown in the example below:

However, MATLAB does not allow two sequenceInputLayer objects in the same network. Browsing the forum, I found the suggestion to use imageInputLayer instead, since two of those are allowed. I failed in this case as well, when implementing the following architecture:

Here is the code I'm using:
% Main branch: the first 23 features through an LSTM
inputLay = imageInputLayer([1 numInput-1 1],'Normalization','zscore','NormalizationDimension','channel','Name','input Layer');
lstm1 = lstmLayer(128,'OutputMode','sequence','Name','LSTM Layer 1');
fullyConn1 = fullyConnectedLayer(32,'Name','Fully Conn 1');
fullyConnOut = fullyConnectedLayer(numOutput,'Name','Fully Conn Out');
outputLay = regressionLayer('Name','Output Layer');
concat = concatenationLayer(1,numInput,'Name','concat');
% Side branch: the known-in-advance input through a BiLSTM
inputLayCurv = imageInputLayer([1 1 1],'Normalization','none','Name','curvature');
blstm1 = bilstmLayer(8,'OutputMode','sequence','Name','BiLSTM Layer 1');
flatten1 = flattenLayer('Name','flatten1');
flatten2 = flattenLayer('Name','flatten2');
% Architecture
layers=[
inputLay
flatten1
lstm1
concat
fullyConn1
fullyConnOut
outputLay
];
lgraph = layerGraph(layers);
lgraph = addLayers(lgraph,inputLayCurv);
lgraph = addLayers(lgraph,flatten2);
lgraph = addLayers(lgraph,blstm1);
lgraph = connectLayers(lgraph,'curvature','flatten2');
lgraph = connectLayers(lgraph,'flatten2','BiLSTM Layer 1');
lgraph = connectLayers(lgraph,'BiLSTM Layer 1','concat/in2');
It is worth noting that the training, validation, and test sets are combinedDatastore objects built from two arrayDatastore objects; in the first one, the first 23 inputs should go to 'input Layer', while the last input should go to 'curvature'. My question is whether I'm doing something wrong with the architecture, or whether my idea simply can't be implemented in MATLAB. Thanks in advance to everyone for the help.
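For completeness, a minimal sketch of how the combined datastore is built (the variable names seqs and resps are placeholders, not my actual code):

```matlab
% seqs:  N-by-1 cell array, each cell a numInput-by-T sequence (placeholder name)
% resps: N-by-1 cell array of numOutput-by-T target sequences (placeholder name)
dsInputs = arrayDatastore(seqs,  'OutputType','same');  % all 24 channels together
dsResp   = arrayDatastore(resps, 'OutputType','same');
dsTrain  = combine(dsInputs, dsResp);
% Within dsInputs, channels 1:23 are meant for 'input Layer'
% and the last channel (curvature) for 'curvature'.
```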