Is it possible to use the reinforcement learning toolbox in a Simulink/Adams co-simulation?
chengye he
on 18 Dec 2020
I gave my Adams model a step input in a Simulink co-simulation. The co-simulation turned out fine, with the animation behaving just as expected. I then tried to implement reinforcement learning on my model following this example: https://www.mathworks.com/help/reinforcement-learning/ug/quadruped-robot-locomotion-using-ddpg-agent.html, and got the error in the picture below.
So is the problem in my code, or is it simply not possible to use the Reinforcement Learning Toolbox in a co-simulation? Thank you!
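For context, my environment setup follows the pattern from that example, roughly like this (the model name, block path, and signal dimensions are placeholders, not my actual model):
mdl      = 'adams_cosim_rl';          % Simulink model containing the Adams plant S-Function
agentBlk = [mdl '/RL Agent'];         % path to the RL Agent block
% Observation and action specifications (dimensions are placeholders)
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);
% Wrap the co-simulation model as a reinforcement learning environment
env = rlSimulinkEnv(mdl, agentBlk, obsInfo, actInfo);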
1 Comment
xw x
on 3 Nov 2022
Hello, I have the same question. May I ask if you have solved your problem? Although the delay block lets training continue, the reward from the environment is delayed before it is input to the agent; is that correct? Also, my training results are very unsatisfactory. Could the delay be the cause?
Accepted Answer
Emmanouil Tzorakoleftherakis
on 18 Dec 2020
Hello,
You should be able to use Reinforcement Learning Toolbox for co-simulation. It looks like closing the loop with the observation and reward signals creates an algebraic loop somewhere. Since the ADAMS plant is inside an S-Function, I would check the connections between that S-Function and the RL Agent block (i.e., observations, actions, and reward). You should be able to get rid of the error by adding a delay block.
Please take a look at the documentation on algebraic loops and how to remove them.
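As a starting point, you can also list the algebraic loops programmatically (this assumes R2019a or later, and the model name below is a placeholder for your own model):
mdl = 'adams_cosim_rl';   % placeholder: your co-simulation model
load_system(mdl);
loops = Simulink.BlockDiagram.getAlgebraicLoops(mdl)  % reports the blocks involved in each loop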
3 Comments
Emmanouil Tzorakoleftherakis
on 21 Dec 2020
Edited: Emmanouil Tzorakoleftherakis
on 21 Dec 2020
Hello,
I would say the delay block should go on the reward signal, right before it enters the RL Agent block (and possibly on the observation and IsDone signals as well). If you delay the actions, it will likely disrupt training.
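As an illustration, here is a sketch of inserting the delays programmatically on the reward and IsDone paths while leaving the action path direct. The block names ('Reward Calc', 'Stop Check') are hypothetical, and the RL Agent port order is assumed to be 1 = observation, 2 = reward, 3 = isdone; match these to your model.
mdl = 'adams_cosim_rl';   % placeholder model name
load_system(mdl);
% Add Unit Delay blocks to break the loops on the feedback signals
add_block('simulink/Discrete/Unit Delay', [mdl '/Reward Delay']);
add_block('simulink/Discrete/Unit Delay', [mdl '/IsDone Delay']);
% Rewire the reward path through its delay
delete_line(mdl, 'Reward Calc/1',  'RL Agent/2');
add_line(mdl,    'Reward Calc/1',  'Reward Delay/1');
add_line(mdl,    'Reward Delay/1', 'RL Agent/2');
% Rewire the IsDone path through its delay
delete_line(mdl, 'Stop Check/1',   'RL Agent/3');
add_line(mdl,    'Stop Check/1',   'IsDone Delay/1');
add_line(mdl,    'IsDone Delay/1', 'RL Agent/3');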