Importing pre-trained recurrent network to reinforcement learning agent
Hello,
Are pre-trained recurrent networks re-initialized when used in agents for reinforcement learning? If so, how can this be avoided?
I am importing an LSTM network, trained with supervised learning, as the actor for a PPO agent. When I simulate the agent without training, the reward is fine. However, once the agent is trained, the reward falls as if no pre-trained network had been used. I would expect the reward after training to be similar or higher, so presumably the network is being re-initialized. Is there a way around this?
Thanks
% Load actor (the MAT-file at netDir is expected to define a variable named net)
load(netDir);
actorNetwork = net.Layers;
actorOpts = rlRepresentationOptions('LearnRate',learnRate);
actor = rlStochasticActorRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'input'},actorOpts);
% Create critic
criticNetwork = [sequenceInputLayer(numObs,"Name","input")
    lstmLayer(numObs)
    softplusLayer()
    fullyConnectedLayer(1)];
criticOpts = rlRepresentationOptions('LearnRate',learnRate);
critic = rlValueRepresentation(criticNetwork,obsInfo,'Observation',{'input'},criticOpts);
% Create agent
agentOpts = rlPPOAgentOptions('ExperienceHorizon',expHorizon, 'MiniBatchSize',miniBatchSz, 'NumEpoch',nEpoch, 'ClipFactor', 0.1);
agent = rlPPOAgent(actor,critic,agentOpts);
% Train agent
trainOpts = rlTrainingOptions('MaxEpisodes',episodes, 'MaxStepsPerEpisode',episodeSteps, ...
    'Verbose',false, 'Plots','training-progress', ...
    'StopTrainingCriteria', 'AverageReward', ...
    'StopTrainingValue',10);
% Run training
trainingStats = train(agent,env,trainOpts);
% Simulate
simOptions = rlSimulationOptions('MaxSteps',2000);
experience = sim(env,agent,simOptions);