Reinforcement learning agent simulation is not the same as during training

Hi, I trained a reinforcement learning agent in Simulink for a humanoid robot. Watching the robot configuration output during learning, the result for 'Agent 7' looked good, so I saved that agent and ran a simulation with the code below.
agent = load('Agent7.mat');
simOptions = rlSimulationOptions('MaxSteps', 50);
experience = sim(env,agent.saved_agent,simOptions);
However, the result was different from Agent 7's configuration output during the learning process, so I observed the action signal with a Scope. The graphs during learning and during simulation were different.
The first picture is the learning configuration output (viewing the robot from the left side) and the action.
The second picture is the simulation configuration output and the action.
During learning, the action took discrete values, and this result was fairly satisfactory. However, when the saved agent was simulated, the action values were only constants.
Could you please tell me how to solve this problem?
(I use MATLAB R2020b.)

Answers (1)

Emmanouil Tzorakoleftherakis
Hello,
Please see this post that explains why simulation results may differ during training and after training.
One thing to consider as well in your case is how long you ran the training for. For example, I see you mention 'Agent7' - does this agent correspond to the policy after 7 episodes? If so, it does not matter what the episode simulation showed; that is far too soon to converge to an acceptable policy. As I mention in the link above, what you see during training is a function of many things, including the agent's exploration.
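As a sketch of the "train longer and keep the good agents" idea: you can let training run for many episodes and have MATLAB save candidate agents automatically whenever an episode clears a reward threshold, then simulate the saved (greedy) policy afterwards. The episode counts, reward thresholds, and directory name below are illustrative placeholders; `env` and `agent` are assumed to come from your existing setup.

```matlab
% Train for many episodes and auto-save agents that clear a reward threshold.
% Threshold values here are placeholders - tune them to your reward scale.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes', 2000, ...
    'MaxStepsPerEpisode', 500, ...
    'StopTrainingCriteria', 'AverageReward', ...
    'StopTrainingValue', 300, ...
    'SaveAgentCriteria', 'EpisodeReward', ...
    'SaveAgentValue', 250, ...
    'SaveAgentDirectory', 'savedAgents');

trainingStats = train(agent, env, trainOpts);

% After training, simulate the learned policy (no exploration noise is
% added here, which is why sim can look different from training episodes).
simOpts = rlSimulationOptions('MaxSteps', 500);
experience = sim(env, agent, simOpts);
```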
Hope that helps
