RL Water Tank example by MATLAB does not converge
I am following the RL water tank control tutorial by MATLAB: https://www.mathworks.com/help/reinforcement-learning/ug/control-water-level-using-ddpg-agent.html (MATLAB R2025b)
However, even though the model is learning at the beginning, towards the end of training the Q0 value explodes and the reward drops from almost the maximum to below zero. I need stable, good results from the official DDPG water tank example to use as a baseline in my research, so I prefer not to modify the network hyperparameters, the reward function, or the stopping criteria.
Is anyone able to reproduce good results with the RL water tank example as given? Or is it expected that it is not stable in its default configuration?
Here are my results:
[training progress plot]
And this is the start of the training, before the Q value explodes:
[training progress plot]
Thank you.
Accepted Answer
sneha
on 12 Nov 2025 at 9:52
Hello,
Yes, this behaviour is normal. The official DDPG water tank example is designed mainly to demonstrate the workflow, not to guarantee long-term stability. The Q-value explosion and reward drop arise from stochastic exploration, critic overestimation, and the limits of function approximation in standard DDPG, so even with the official setup, results can vary across runs. You can treat the provided pretrained agent (WaterTankDDPG.mat) as the validated baseline for comparison. It is acceptable for the model to become unstable in its default configuration, as consistent stability was not the primary goal of the example.
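If you need a fixed reference for your research, one option is to evaluate the shipped pretrained agent directly instead of retraining. Here is a minimal sketch, assuming the environment env has already been created exactly as in the tutorial and that the MAT-file stores the agent under the variable name agent (check the file contents if loading fails):

% Load the pretrained agent shipped with the example.
load("WaterTankDDPG.mat", "agent")
% Simulate against the Simulink environment created earlier in the
% tutorial (env = rlSimulinkEnv(...)). No training takes place here.
simOpts = rlSimulationOptions(MaxSteps=200);
experience = sim(env, agent, simOpts);
% Quick sanity check: cumulative reward over the episode.
totalReward = sum(experience.Reward.Data)

Since the DDPG agent option UseExplorationPolicy is false by default, sim uses the deterministic policy, so this evaluation does not depend on exploration noise.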
You can refer to https://www.mathworks.com/help/reinforcement-learning/ug/ddpg-agents.html to learn more about the DDPG training algorithm and the actor and critic used by DDPG agents.
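If run-to-run variability itself is the concern, fixing the random seed before creating and training the agent makes weight initialization and exploration noise reproducible without touching any hyperparameters. A sketch, assuming agent, env, and trainOpts are set up as in the tutorial (the file name at the end is just an example):

% Fix the random number generator before creating and training the
% agent so that initialization and exploration noise repeat across runs.
rng(0)
% Training is otherwise unchanged from the tutorial.
trainingStats = train(agent, env, trainOpts);
% Save the agent once a satisfactory run is obtained, so the baseline
% is frozen and does not depend on rerunning training.
save("myWaterTankBaseline.mat", "agent")

With a fixed seed you can rerun the same training and get the same learning curve, which makes a single good run usable as a stable baseline.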