Neural ODE Modeling of a Single-Link Robot with External Input

I want to construct a neural ODE model of a single-link robot, which can be described by the following ODEs with an external control input:

ẋ₁ = x₂
ẋ₂ = (1/M)·u − (m·g·L)/(2·M)·sin(x₁)

where x₁ and x₂ represent the angular position and angular velocity of the robot, respectively, and u is the external input.
However, when I train the neural ODE model with the external input, the loss curve does not reduce to zero. I want to know how to verify that the training results are effective even though the loss does not reach 0.
% Data generation of 1DoF Robot
dt = 1e-3;
tspan = 0:dt:5;
x0 = [0;0]; % initial state
u = sin(2*pi*tspan); % external input
Params = struct;
Params.M = 1;
Params.m = 2;
Params.g = 9.8;
Params.L = 0.5;
[T,X] = ode45(@(t,x) robotOdes(t,x,Params),tspan,x0);
tiledlayout("flow");
ax1 = nexttile;
plot(ax1,T,X);
legend(ax1,["Position x_1","Velocity x_2"]);
function dX = robotOdes(t,X,P)
dX = zeros(2,1);
dX(1) = X(2);
dX(2) = 1/P.M * sin(2*pi*t) - 1/(2*P.M) * P.m * P.g * P.L * sin(X(1)); % u(t) = sin(2*pi*t) is the external input
end
% Define and Initialize Model Parameters
xTrain = X.'; % training data
neuralOdeTimesteps = 50;
timesteps = (0:neuralOdeTimesteps)*dt;
% Initialize the parameters structure.
neuralOdeParameters = struct;
stateSize = size(xTrain,1);
hiddenSize = 30;
neuralOdeParameters.fc1 = struct;
neuralOdeParameters.fc1.Weights = dlarray(randn([hiddenSize stateSize],"single"));
neuralOdeParameters.fc1.Bias = dlarray(zeros([hiddenSize 1],"single"));
neuralOdeParameters.fc2 = struct;
neuralOdeParameters.fc2.Weights = dlarray(randn([stateSize hiddenSize],"single"));
neuralOdeParameters.fc2.Bias = dlarray(zeros([stateSize 1],"single"));
neuralOdeParameters.fc3 = struct;
neuralOdeParameters.fc3.Weights = dlarray(randn([hiddenSize stateSize],"single"));
neuralOdeParameters.fc3.Bias = dlarray(zeros([hiddenSize 1],"single"));
neuralOdeParameters.fc4 = struct;
neuralOdeParameters.fc4.Weights = dlarray(randn([stateSize hiddenSize],"single"));
neuralOdeParameters.fc4.Bias = dlarray(zeros([stateSize 1],"single"));
% Specify training options
gradDecay = 0.9;
sqGradDecay = 0.99;
learnRate = 0.005;
numIter = 1200;
miniBatchSize = 100;
averageGrad = [];
averageSqGrad = [];
% monitor = trainingProgressMonitor(Metrics="Loss",Info=["Iteration","LearnRate"],XLabel="Iteration");
numTrainingTimesteps = size(xTrain,2);
trainingTimesteps = 1:numTrainingTimesteps;
% Training loop
iteration = 0;
ax2 = nexttile;
an = animatedline(ax2);
while iteration < numIter
iteration = iteration + 1;
% Create batch
[X0,Targets,uVal,uTime] = createMiniBatch(numTrainingTimesteps,neuralOdeTimesteps,miniBatchSize,xTrain,u,tspan);
% Evaluate network and compute loss and gradients
[loss,gradients] = dlfeval(@modelLoss,timesteps,X0,neuralOdeParameters,Targets,uVal,uTime);
an.addpoints(iteration,extractdata(loss));
% Update network
[neuralOdeParameters,averageGrad,averageSqGrad] = adamupdate(neuralOdeParameters,gradients,averageGrad,averageSqGrad,iteration,...
learnRate,gradDecay,sqGradDecay);
end
% Helper functions
function X = model(tspan,X0,neuralOdeParameters,U,T)
X = dlode45(@(t,y,p) odeModel(t,y,p,U,T),tspan,X0,neuralOdeParameters,DataFormat="CB");
end
function [loss,gradients] = modelLoss(tspan,X0,neuralOdeParameters,targets,U,T)
% Compute predictions.
X = model(tspan,X0,neuralOdeParameters,U,T);
% Compute L1 loss.
loss = l1loss(X,targets,DataFormat="CBT",NormalizationFactor="all-elements");
% Compute gradients.
gradients = dlgradient(loss,neuralOdeParameters);
end
function dy = odeModel(t,y,theta,uVal,uTime)
f = tanh(theta.fc1.Weights*y + theta.fc1.Bias);
f = tanh(theta.fc2.Weights*f + theta.fc2.Bias);
g = tanh(theta.fc3.Weights*y + theta.fc3.Bias);
g = tanh(theta.fc4.Weights*g + theta.fc4.Bias);
u = zeros(size(uVal,[1 2])); % "CB" format
for batchIdx = 1:size(u,2)
% The size of control input is one
u(:,batchIdx) = interp1(squeeze(uTime(:,batchIdx,:)),squeeze(uVal(:,batchIdx,:)),t,"linear","extrap");
end
dy = f + g.*u; % neuralODE with external input dy == f_(y) + g_(y)*u affine form
end
function [x0,targets,u,t] = createMiniBatch(numTimesteps,numTimesPerObs,miniBatchSize,X,U,T)
% Create batches of trajectories.
s = randperm(numTimesteps - numTimesPerObs, miniBatchSize);
x0 = dlarray(X(:, s));
targets = zeros([size(X,1) miniBatchSize numTimesPerObs]);
u = zeros([size(U,1) miniBatchSize numTimesPerObs]);
t = zeros([1 miniBatchSize numTimesPerObs]);
for i = 1:miniBatchSize
targets(:, i, 1:numTimesPerObs) = X(:, s(i) + 1:(s(i) + numTimesPerObs));
u(:,i,1:numTimesPerObs) = U(:,s(i) + 1:(s(i) + numTimesPerObs));
t(:,i,1:numTimesPerObs) = T(:,s(i) + 1:(s(i) + numTimesPerObs));
end
end

 Accepted Answer

I don’t think there’s anything obviously wrong with your setup. For Neural ODEs with inputs, it’s quite common that the loss doesn’t go all the way to zero.
One reason is that both sides of the comparison are numerical. Your training data comes from ode45, and during training you’re solving another ODE with dlode45. Since both are adaptive solvers with tolerances, there’s usually a small mismatch that never completely disappears.
This shows up more when you have an external input. The solver evaluates the dynamics at time points it chooses internally, so the input has to be interpolated, and that adds a bit more approximation. In practice, that alone can keep the loss from going to zero even if the learned dynamics are reasonable.
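As a rough sanity check on how much the input interpolation alone contributes, you can compare the linearly interpolated samples of u against the true continuous signal at off-grid query times, the kind of time points an adaptive solver picks internally (a standalone sketch; `dt` and the sine input are taken from your script):

```matlab
% Worst-case linear-interpolation error of the sampled input
dt = 1e-3;
tGrid = 0:dt:5;
uGrid = sin(2*pi*tGrid);               % sampled input, as in the training data
tQuery = (0:dt:5-dt) + dt/2;           % off-grid, solver-like query times
uInterp = interp1(tGrid,uGrid,tQuery,"linear");
uTrue = sin(2*pi*tQuery);
maxErr = max(abs(uInterp - uTrue));
fprintf("Max input interpolation error: %.2e\n",maxErr);
```

For a 1 Hz sine sampled at dt = 1e-3 this error is tiny, but it scales with the square of dt times the input's curvature, so faster inputs or coarser sampling make it a real contributor to the loss floor.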
Rather than focusing on the final loss value, I’d check a few practical things:
  • Roll the learned model forward from a state that wasn’t used for training and see how well the trajectory matches over time.
  • Try a different initial condition or a slightly different input signal and see if the behavior still makes sense.
  • For this system, it’s also worth checking simple structure (for example, whether the learned model roughly satisfies ẋ₁ ≈ x₂).
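For the first check, a minimal validation sketch looks like the following. It assumes `model`, `neuralOdeParameters`, `robotOdes`, and `Params` from your script are in scope; the initial condition `x0val` is just an example value not seen during training:

```matlab
% Roll the trained neural ODE forward from an unseen initial condition
x0val = [0.3;-0.1];                        % hypothetical unseen initial state
tVal = 0:1e-3:5;
uValSig = sin(2*pi*tVal);                  % same input signal as training

% Reference trajectory from the true dynamics
[~,Xref] = ode45(@(t,x) robotOdes(t,x,Params),tVal,x0val);

% Neural ODE prediction; note dlode45 returns states at tVal(2:end)
X0dl = dlarray(single(x0val));             % stateSize-by-1, "CB" format
Xpred = model(tVal,X0dl,neuralOdeParameters, ...
    reshape(uValSig,1,1,[]),reshape(tVal,1,1,[]));
Xpred = squeeze(extractdata(Xpred));       % 2-by-(numel(tVal)-1)

figure; plot(tVal,Xref,"-",tVal(2:end),Xpred.',"--");
legend(["x_1 ref","x_2 ref","x_1 model","x_2 model"]);
```

If the dashed curves track the solid ones over the full horizon, the model has learned usable dynamics even with a non-zero training loss; if they diverge quickly, the residual loss really does reflect a modeling problem.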
If you really want to push the loss lower, tightening the solver tolerances when generating the training data and when calling dlode45 can help, but I wouldn’t treat a non‑zero loss here as a failure by itself.
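Concretely, tightening both solvers looks like this (a sketch; the `opts` values are examples, and the `dlode45` call mirrors the one in your `model` helper, so `U` and `T` are the same input and time arrays passed there):

```matlab
% Tighter tolerances when generating the training data
opts = odeset("RelTol",1e-8,"AbsTol",1e-10);
[T,X] = ode45(@(t,x) robotOdes(t,x,Params),tspan,x0,opts);

% ...and when solving the neural ODE during training
Xpred = dlode45(@(t,y,p) odeModel(t,y,p,U,T),tspan,X0,neuralOdeParameters, ...
    DataFormat="CB",RelativeTolerance=1e-6,AbsoluteTolerance=1e-8);
```

Tighter tolerances mean more solver steps per call, so expect training to slow down; it is usually worth tightening the data-generation side first, since that fixes the target trajectories once rather than on every iteration.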
The documentation for dlode45 and for training neural ODE networks should also be helpful.

Release: R2025a
Asked: 18 Mar 2026 at 13:25
Commented: 1 Apr 2026 at 11:29