How should I fix the error 'There are too many output arguments'?

I wrote the code below using the Reinforcement Learning Toolbox.
train_agent
Error: reset
There are too many output arguments.
Error: rl.env.MATLABEnvironment/simLoop (line 235)
observation = reset(env);
Error: rl.env.MATLABEnvironment/simWithPolicyImpl (line 106)
[expcell{simCount}, epinfo, siminfos{simCount}] = simLoop(env, policy, opts, simCount, usePCT);
Error: rl.env.AbstractEnv/simWithPolicy (line 83)
[experiences, varargout{1:(nargout-1)}] = simWithPolicyImpl(this, policy, opts, varargin{:});
Error: rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1}, varargout{2}] = simWithPolicy(this.Env, this.Agent, simOpts);
Error: rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error: rl.task.TaskSpec/internal_run (line 166)
[varargout{1:nargout}] = run(task);
Error: rl.task.TaskSpec/runDirect (line 170)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error: rl.task.TaskSpec/runScalarTask (line 194)
runDirect(this);
Error: rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error: rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error: rl.train.TrainingManager/train (line 424)
run(trainer);
Error: rl.train.TrainingManager/run (line 215)
train(this);
Error: rl.agent.AbstractAgent/train (line 77)
TrainingStatistics = run(trainMgr);
Error: train_agent (line 90)
trainingStats = train(agent, env, trainingOptions);
The above error occurred when I ran the script. How should I fix it? Also, please tell me how to check a function's output arguments and their number.
% Train a DDPG agent
% Set up the environment
env = Environment;
obsInfo = env.getObservationInfo;
actInfo = env.getActionInfo;
numObs = obsInfo.Dimension(1); % 2
numAct = numel(actInfo); % 1
% CRITIC
statePath = [
    featureInputLayer(numObs,'Normalization','none','Name','observation')
    fullyConnectedLayer(128,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(200,'Name','CriticStateFC2')];
actionPath = [
    featureInputLayer(numAct,'Normalization','none','Name','action')
    fullyConnectedLayer(200,'Name','CriticActionFC1','BiasLearnRateFactor',0)];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
criticOptions = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,'Observation',{'observation'},'Action',{'action'},criticOptions);
% ACTOR
actorNetwork = [
    featureInputLayer(numObs,'Normalization','none','Name','observation')
    fullyConnectedLayer(128,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1')
    fullyConnectedLayer(200,'Name','ActorFC2')
    reluLayer('Name','ActorRelu2')
    fullyConnectedLayer(1,'Name','ActorFC3')
    tanhLayer('Name','ActorTanh1')
    scalingLayer('Name','ActorScaling','Scale',max(actInfo.UpperLimit))];
actorOptions = rlRepresentationOptions('LearnRate',5e-04,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'observation'},'Action',{'ActorScaling'},actorOptions);
% Agent options
agentOptions = rlDDPGAgentOptions(...
    'SampleTime',env.Ts,...
    'TargetSmoothFactor',1e-3,...
    'ExperienceBufferLength',1e6,...
    'MiniBatchSize',128);
% Exploration noise
agentOptions.NoiseOptions.Variance = 0.4;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOptions);
% Training options
maxepisodes = 20000;
maxsteps = 1e8;
trainingOptions = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes,...
    'MaxStepsPerEpisode',maxsteps,...
    'Verbose',false,...
    'Plots','training-progress',...
    'StopOnError','on',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',Inf,...
    'ScoreAveragingWindowLength',10);
% Plot the environment
%plot(env);
% Train the agent
trainingStats = train(agent,env,trainingOptions); % <-- the error happens here
% Simulate the trained agent
simOptions = rlSimulationOptions('MaxSteps',maxsteps);
experience = sim(env,agent,simOptions);

Answers (1)

Ronit on 7 Nov 2024 at 6:34
Edited: Ronit on 7 Nov 2024 at 7:23
The error seems to originate from the "reset" function within your reinforcement learning environment. The "reset" function must return two outputs: "InitialObservation" and "Info". This is necessary for the "sim" and "train" functions to properly initialize the environment at the start of each simulation or training episode.
Ensure that the calling code captures both outputs. The function call should be structured as follows:
[InitialObservation, Info] = reset(env);
Note: Ensure your "reset" function is implemented to return both "InitialObservation" and "Info" as specified.
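For example, here is a minimal sketch of what the "reset" method could look like inside a custom environment class that subclasses rl.env.MATLABEnvironment. The class layout and the "State" property are placeholders for illustration, not your actual environment code:

% Sketch only: placeholder reset method for a classdef that subclasses rl.env.MATLABEnvironment
function [InitialObservation, Info] = reset(this)
    % Reinitialize the internal state at the start of every episode
    this.State = zeros(2,1);            % placeholder initial state (2 observations)
    % First output: the initial observation returned to train/sim
    InitialObservation = this.State;
    % Second output: optional logged information (can be empty)
    Info = [];
end

To check how many output arguments a function declares, look at its declaration line (the names inside the square brackets before the equals sign). Inside the function body, nargout tells you how many outputs the caller actually requested, which is useful when debugging this kind of mismatch.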
Refer to the following MATLAB documentation link for more details:
I hope it helps resolve your query!
