The reward gets stuck on a single value during training or randomly fluctuates (Reinforcement Learning)
Show older comments
I train the reinforcement learning system, and on the reward plot I have some failures during which the reward does not change. This doesn’t look normal, especially when compared with examples (Biped Robot, etc.) I believe that some rlDDPGAgentOptions settings are responsible for this, but it seems that I changed all the possible settings, but even after several thousand episodes, the system does not learn. What can be the reason for this behavior of this graph during training?

Accepted Answer
More Answers (0)
Categories
Find more on Reinforcement Learning in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!