RL designer toolbox | PPO agent | NaN output as policy

Atusa on 16 Apr 2025

https://uk.mathworks.com/matlabcentral/answers/2176378-rl-designer-toolbox-ppo-agent-nan-output-as-policy

Commented: praguna manvi on 22 Apr 2025

Dear all,

I'm using a PPO agent in a Simulink environment, and I'm training it with the RL Designer toolbox.

Partway through training, at varying episodes, I get the following error:

Error:Block '.../agent Obj/Evaluate Policy/Execute Policy/Enabled Policy Evaluator/Policy Evaluator/Policy Process Experience Internal' outputs 'NaN' for element 1 of output port 1 at major time step 0.

My agent's action is a single numeric value bounded to [-1, 1], so I would not expect any NaN values.

I'm using an rlContinuousGaussianActor. The standard deviation path ends in a softplusLayer, and the mean path ends in a tanhLayer followed by a scalingLayer with Scale=1. I defined the action specification with the following command:

actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);
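
For anyone trying to reproduce this setup, an actor matching the description above could be built roughly as in the sketch below. The observation size, hidden-layer width, and all layer names are assumptions for illustration only; they are not taken from the original model.

```matlab
% Sketch only: the observation size (4) and hidden width (64) are assumed,
% as are all layer names. Only the output layers follow the description.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);

% Shared feature path
commonPath = [
    featureInputLayer(prod(obsInfo.Dimension), Name="obs")
    fullyConnectedLayer(64, Name="commonFC")
    reluLayer(Name="commonRelu")];

% Mean path: tanh bounds the mean to [-1, 1], then scalingLayer with Scale=1
meanPath = [
    fullyConnectedLayer(1, Name="meanFC")
    tanhLayer(Name="meanTanh")
    scalingLayer(Name="meanOut", Scale=1)];

% Standard deviation path: softplus keeps the output positive
stdPath = [
    fullyConnectedLayer(1, Name="stdFC")
    softplusLayer(Name="stdOut")];

% Assemble the two-headed network
net = layerGraph(commonPath);
net = addLayers(net, meanPath);
net = addLayers(net, stdPath);
net = connectLayers(net, "commonRelu", "meanFC");
net = connectLayers(net, "commonRelu", "stdFC");
net = dlnetwork(net);

actor = rlContinuousGaussianActor(net, obsInfo, actInfo, ...
    ObservationInputNames="obs", ...
    ActionMeanOutputNames="meanOut", ...
    ActionStandardDeviationOutputNames="stdOut");
```

Note that the tanhLayer bounds only the mean of the Gaussian; the action itself is sampled from the distribution defined by the mean and standard deviation outputs.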

I'm resetting my environment at each episode using the following command:

simEnv.ResetFcn = @(in) setVariable(in,"q",0,"Workspace",mdl);

I'm confident there is no singularity in my model, because the simulation runs on its own without errors. The error occurs only during training with the toolbox, and even then not on every run.

I'd appreciate it if you could help me solve this issue.

Thank you very much.

1 Comment

praguna manvi on 22 Apr 2025

Hi @Atusa,

Can you share the Simulink model so that this error can be reproduced?

Answers (0)

Products

Reinforcement Learning Toolbox

Release

R2024a
