Create and Train NARX Network for Time Series Forecasting
This example shows how to create and train a NARX network for time series forecasting and test its predictions in MATLAB® and Simulink®.
A Nonlinear Autoregressive Exogenous (NARX) model is a type of neural network architecture designed to handle nonlinear relationships in time series data. In this example, you:
Create a NARX network to predict the position of a magnet in a magnetic levitation system.
Prepare the training data and use it to train the network.
Evaluate the performance of the NARX network on a test data set.
Export the trained network to Simulink.
Load Data
The MagLev data set describes the position of a magnet suspended above an electromagnet and the current flowing through the electromagnet, as shown in the following figure. The magnet is constrained so that it can only move in the vertical direction.
The equation of motion for this system is

$$\frac{d^2 y(t)}{dt^2} = -g + \frac{\alpha}{M}\,\frac{i^2(t)}{y(t)} - \frac{\beta}{M}\,\frac{dy(t)}{dt}$$

where $y(t)$ is the distance of the magnet above the electromagnet, $i(t)$ is the current flowing in the electromagnet, $M$ is the mass of the magnet, and $g$ is the gravitational constant. The parameter $\beta$ is a viscous friction coefficient that is determined by the material in which the magnet moves, and $\alpha$ is a field strength constant that is determined by the number of turns of wire on the electromagnet and the strength of the magnet.
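For intuition about these dynamics, the following sketch integrates the equation of motion with ode45 for a constant current. The parameter values, the current level, and the initial conditions are illustrative assumptions only; they are not the values used to generate the data set.

% Minimal simulation sketch of the magnetic levitation dynamics.
% All parameter values below are assumed for illustration only.
M = 3;              % mass of the magnet (assumed)
g = 9.8;            % gravitational constant
alpha = 15;         % field strength constant (assumed)
beta = 8;           % viscous friction coefficient (assumed)
i = @(t) 1;         % constant current input (assumed)

% State z = [y; dy/dt], where y is the height of the magnet.
odefun = @(t,z) [z(2); -g + (alpha/M)*i(t)^2/z(1) - (beta/M)*z(2)];
[tSim,z] = ode45(odefun,[0 5],[1; 0]);

figure
plot(tSim,z(:,1))
xlabel("Time (s)")
ylabel("Magnet Height")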
The data set contains two time series, the current and the magnet position. Load the data set and extract the data from cell arrays.
[current,position] = maglev_dataset;
current = [current{:}]';
position = [position{:}]';
Plot the data.
figure
tiledlayout(2,1)
nexttile
plot(current)
xlabel("Time Step")
ylabel("Current")
nexttile
plot(position)
xlabel("Time Step")
ylabel("Magnet Position")
Define NARX Network Architecture
NARX networks are a type of neural network architecture designed to handle nonlinear relationships in time series data. They extend traditional autoregressive models by incorporating exogenous inputs, which are external factors that can influence the system. Mathematically, a NARX model can be expressed as

$$y_t = f\left(y_{t-1}, y_{t-2}, \ldots, y_{t-d_y},\, x_{t-1}, x_{t-2}, \ldots, x_{t-d_x}\right)$$

where $y_t$ is the output of the sequence at time step $t$, $x_t$ is the exogenous input, and $d_y$ and $d_x$ are the number of delay steps considered for the sequence and the exogenous input, respectively, that is, the total number of past values considered in every prediction. When applied to the magnetic levitation system, the output $y_t$ corresponds to the position of the magnet and the exogenous input $x_t$ corresponds to the current.
In this example, you use a delay of two for the magnet position and the current. Therefore, the model becomes

$$y_t = f\left(y_{t-1}, y_{t-2}, x_{t-1}, x_{t-2}\right).$$
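As a toy illustration of this recursion (not the trained network), the following sketch rolls out a NARX model with delays of two using an arbitrary, hand-picked function f; each new output depends on the two previous outputs and the two previous exogenous inputs.

% Toy NARX rollout with delays of two. The function f is an arbitrary
% hand-picked example, not the mapping learned by the network.
f = @(y1,y2,x1,x2) 0.5*y1 + 0.3*y2 + 0.1*x1 - 0.05*x2;

x = sin(0.1*(1:20));    % example exogenous input sequence
y = [0 0];              % seed values for y(1) and y(2)
for t = 3:20
    y(t) = f(y(t-1),y(t-2),x(t-1),x(t-2));
end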
This diagram illustrates the architecture of a simple NARX network, showing the inputs and outputs for this problem.
To create this NARX network, create a layer array containing a sequence input layer and a 1-D convolution layer. This layer array corresponds to the branch that receives the previous values in the sequence (the magnet position).
Set the size of the sequence input layer to the number of channels of the input data. Specify the size of the convolution filters as equal to the delay.
inputSizeSequence = 1;
numSequenceDelays = 2;
numFilters = 15;

sequenceLayers = [
    sequenceInputLayer(inputSizeSequence,Name="sequenceInput",MinLength=numSequenceDelays)
    convolution1dLayer(numSequenceDelays,numFilters,Name="conv1")];

NARXNet = dlnetwork;
NARXNet = addLayers(NARXNet,sequenceLayers);
Create a layer array containing a sequence input layer, a 1-D convolution layer, a concatenation layer, a tanh layer, and a fully connected layer. This layer array corresponds to the branch that receives the exogenous inputs (the current) and the final three layers.
Set the size of the sequence input layer to the number of channels of the input data. Specify the size of the convolution filters as equal to the delay.
inputSizeExogenous = 1;
numExogenousDelays = 2;
outputSize = 1;

exogenousLayers = [
    sequenceInputLayer(inputSizeExogenous,Name="exogenousInput",MinLength=numExogenousDelays)
    convolution1dLayer(numExogenousDelays,numFilters,Name="conv2")
    concatenationLayer(1,2,Name="cat")
    tanhLayer
    fullyConnectedLayer(outputSize)];
Add these layers to the network and connect the 1-D convolution layer to the second input of the concatenation layer.
NARXNet = addLayers(NARXNet,exogenousLayers);
NARXNet = connectLayers(NARXNet,"conv1","cat/in2");
Training a NARX network can result in better accuracy when the learnable parameters have data type double. Initialize the learnable parameters and convert them to double using the dlupdate function.
NARXNet = initialize(NARXNet);
NARXNet = dlupdate(@double,NARXNet);
Split Training, Validation, and Test Data
Partition the data into training, validation, and test data sets. Use the first 60% of the data for training, the next 20% of the data for testing, and the last 20% of the data for validation.
trainSplit = 0.6;
testSplit = 0.2;
inputSeqLength = numel(current)
inputSeqLength = 4001
numTrain = floor(inputSeqLength * trainSplit)
numTrain = 2400
numTest = floor(inputSeqLength * testSplit)
numTest = 800
numVal = inputSeqLength - numTrain - numTest
numVal = 801
For the training data, the predictors positionTrain and currentTrain contain the 1st to 2399th data points of the magnet position and current, respectively, while the targets TTrain contain the 3rd to 2400th data points of the magnet position.
numDelays = max(numExogenousDelays,numSequenceDelays);
positionTrain = position(1+numDelays-numSequenceDelays:numTrain-1);
currentTrain = current(1+numDelays-numExogenousDelays:numTrain-1);
TTrain = position(1+numDelays:numTrain);
When training neural networks, it often helps to make sure that your data is normalized in all stages of the network. Normalization helps stabilize and speed up network training using gradient descent.
Determine the mean and standard deviation of the position and current training data.
muPosition = mean(positionTrain);
sigmaPosition = std(positionTrain);
muCurrent = mean(currentTrain);
sigmaCurrent = std(currentTrain);
Normalize the training predictors and targets by subtracting the mean and scaling by the standard deviation.
currentTrain = (currentTrain - muCurrent) / sigmaCurrent;
positionTrain = (positionTrain - muPosition) / sigmaPosition;
TTrain = (TTrain - muPosition) / sigmaPosition;
To train a network with multiple inputs using the trainnet function, create a single datastore that contains the training predictors and targets. To convert numeric arrays to datastores, use arrayDatastore. Then, use the combine function to combine them into a single datastore.
NARXTrain = combine( ...
    arrayDatastore(positionTrain,IterationDimension=2), ...
    arrayDatastore(currentTrain,IterationDimension=2), ...
    arrayDatastore(TTrain,IterationDimension=2));
For the validation data predictors positionVal and currentVal, use the 3201st to 4000th data points of the magnet position and current, respectively. For the validation targets TVal, use the 3203rd to 4001st data points of the magnet position.
positionVal = position(numTrain+numTest+1+numDelays-numSequenceDelays:end-1);
currentVal = current(numTrain+numTest+1+numDelays-numExogenousDelays:end-1);
TVal = position(numTrain+numTest+1+numDelays:end);
Normalize the validation predictors and targets by subtracting the mean and scaling by the standard deviation.
positionVal = (positionVal - muPosition) / sigmaPosition;
currentVal = (currentVal - muCurrent) / sigmaCurrent;
TVal = (TVal - muPosition) / sigmaPosition;
Create a single datastore that contains the validation predictors and targets.
NARXVal = combine( ...
    arrayDatastore(positionVal,IterationDimension=2), ...
    arrayDatastore(currentVal,IterationDimension=2), ...
    arrayDatastore(TVal,IterationDimension=2));
For the test data, the inputs positionTest and currentTest contain the 2401st to 3199th data points, while the targets TTest contain the 2403rd to 3200th data points.
positionTest = position(numTrain+1+numDelays-numSequenceDelays:numTrain+numTest-1);
currentTest = current(numTrain+1+numDelays-numExogenousDelays:numTrain+numTest-1);
TTest = position(numTrain+1+numDelays:numTrain+numTest);
Normalize the test predictors and targets by subtracting the mean and scaling by the standard deviation.
positionTest = (positionTest - muPosition) / sigmaPosition;
currentTest = (currentTest - muCurrent) / sigmaCurrent;
TTest = (TTest - muPosition) / sigmaPosition;
Create a single datastore that contains the test predictors and responses.
NARXTest = combine( ...
    arrayDatastore(positionTest,IterationDimension=2), ...
    arrayDatastore(currentTest,IterationDimension=2), ...
    arrayDatastore(TTest,IterationDimension=2));
Train Network
Specify the training options.
Use the Levenberg–Marquardt (LM) solver. LM is a suitable solver for small networks and data sets that can be processed in a single batch.
Plot the training progress and disable verbose output.
Specify the validation data.
Set the maximum number of iterations to 500, and set the relative gradient tolerance and step size tolerance to 1e-8 to give the network sufficient time to train.
options = trainingOptions("lm", ...
    Plots="training-progress", ...
    Verbose=false, ...
    ValidationData=NARXVal, ...
    MaxIterations=500, ...
    GradientTolerance=1e-8, ...
    StepTolerance=1e-8);
Train the network using the trainnet function. By default, trainnet uses a GPU if one is available; otherwise, it uses a CPU. Training on a GPU requires Parallel Computing Toolbox™ and a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). You can also specify the execution environment by using the ExecutionEnvironment name-value argument of trainingOptions.
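For example, to force training on the CPU, you could create a variant of the training options with the execution environment set explicitly. This is an optional alternative to the options defined above.

optionsCPU = trainingOptions("lm", ...
    Plots="training-progress", ...
    Verbose=false, ...
    ValidationData=NARXVal, ...
    MaxIterations=500, ...
    GradientTolerance=1e-8, ...
    StepTolerance=1e-8, ...
    ExecutionEnvironment="cpu");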
trainedNARXNet = trainnet(NARXTrain,NARXNet,"mse",options);
Test Network
Forecast the magnet position using the trained network. In this example, you perform both open-loop and closed-loop forecasting. Open-loop forecasting uses actual observations as inputs to predict future values. Closed-loop forecasting, in contrast, uses the model's own predictions as inputs for future forecasts.
Make open-loop predictions using the predict function and provide the entire test data set as input.
YNARXOpen = predict(trainedNARXNet,positionTest,currentTest);
To test the network's ability to make closed-loop predictions:
Create an array containing only the first two elements of the magnet position test data, and use that array to make an initial prediction.
Add the prediction to the end of the array of magnet positions.
Iteratively make further predictions using the last two elements of the array.
Remove the first two elements from the array.
YNARXClosed = positionTest(1:numSequenceDelays);
for idx = 1:numTest-numDelays
    YNARXClosed(end+1) = predict(trainedNARXNet, ...
        YNARXClosed(end-numSequenceDelays+1:end), ...
        currentTest(idx:idx+numExogenousDelays-1));
end
YNARXClosed = YNARXClosed(numDelays+1:end);
Compare the predictions to the targets.
positionTrainTest = (position(1:numTrain+numTest) - muPosition) / sigmaPosition;

figure
plot(positionTrainTest)
hold on
plot((1:numel(YNARXOpen))+numTrain+numDelays,YNARXOpen,"--")
plot((1:numel(YNARXClosed))+numTrain,YNARXClosed,"--")
xlabel("Time Step")
ylabel("Magnet Position (normalized)")
title("NARX Network Predictions")
legend("Targets","Open-Loop Predictions","Closed-Loop Predictions",Location="northwest")
Calculate the root mean squared error (RMSE) of the predictions.
rmseNARXOpen = rmse(YNARXOpen,TTest)
rmseNARXOpen = 0.0030
rmseNARXClosed = rmse(YNARXClosed,TTest)
rmseNARXClosed = 0.9888
Save the trained network.
save("trainedNARXNet.mat","trainedNARXNet")
Test Network in Simulink
Test the performance of the trained network in Simulink using the NARXNetModel model. The model makes closed-loop predictions of the magnet position by taking the exogenous current signal as input and using delayed predictions as input to the network. The model incorporates the trained network as a Predict block. You can add a Predict block in Simulink and specify the trained network to use, or export a trained network using the Deep Network Designer app.
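For example, one way to begin the export workflow is to open the trained network in Deep Network Designer from the command line; the export step itself is interactive. This is an optional alternative to opening the prebuilt model below.

% Optional: open the trained network in Deep Network Designer,
% then export it to Simulink from the app.
deepNetworkDesigner(trainedNARXNet)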
Open the Simulink model.
mdl = "NARXNetModel";
open_system(mdl)
The model uses Unit Delay (Simulink) blocks to delay both inputs to the network. The output of a Unit Delay block for the first sampling period is zero by default.
Set the initial condition of the Unit Delay blocks for the sequence input using the first two elements of the test position data.
set_param(mdl+"/Unit Delay","InitialCondition",string(positionTest(2)))
set_param(mdl+"/Unit Delay1","InitialCondition",string(positionTest(2)))
set_param(mdl+"/Unit Delay2","InitialCondition",string(positionTest(1)))
Set the initial condition of the Unit Delay block for the exogenous input using the first element of the test current data.
set_param(mdl+"/Unit Delay3","InitialCondition",string(currentTest(1)))
Remove the first element from the test current data array (it is already provided by the initial condition of the Unit Delay block) and create a corresponding vector of time steps. The model takes the variables currentTest and t as input.
currentTest = currentTest(2:end);
t = (0:1:numel(currentTest)-1)';
Run the model.
simOut = sim(mdl);
Extract the results of the simulation. Discard the last element because there is no corresponding value in the test target data set to compare it against.
YNARXSimClosed = simOut.yout{1}.Values.Data(1:end-1);
Calculate the RMSE of the predictions. The RMSE matches the RMSE of the closed-loop predictions made in MATLAB.
rmseNARXSimClosed = rmse(YNARXSimClosed,TTest)
rmseNARXSimClosed = 0.9888
See Also
trainnet | trainingOptions | dlnetwork | convolution1dLayer | sequenceInputLayer