
How to set boundaries for actions in reinforcement learning?

My environment has 3 actions, with lower bounds [1; 1; 0] and upper bounds [5; 5; 1]. The constructor code is as follows:
function this = myEnvClass()
    % Initialize observation settings
    ObservationInfo = rlNumericSpec([9 1]);
    ObservationInfo.Name = 'ASV States';
    %ObservationInfo.Description = 'x, dx, theta, dtheta';
    ObservationInfo.Description = 'dx, dy, dz,dl,vx,vy,vz,phi,theta';
    % Initialize action settings
    ActionInfo = rlNumericSpec([3 1 1], 'LowerLimit',[1;1;0], 'UpperLimit',[5;5;1]);
    ActionInfo.Name = 'ASV Action';
    ActionInfo.Description = 'rho,sigma,theta';
    % The following line implements built-in functions of the RL environment
    this = this@rl.env.MATLABEnvironment(ObservationInfo,ActionInfo);
    % Initialize property values and pre-compute necessary values
    updateActionInfo(this);
    % this.State = [400 400 -50 0 0 0 0 0 0]';
end
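For reference, the registered limits can be checked by constructing the environment and querying its action specification; a minimal sketch (it assumes myEnvClass.m is on the MATLAB path):
env = myEnvClass();
actInfo = getActionInfo(env);   % the rlNumericSpec created in the constructor
disp(actInfo.LowerLimit)        % expected: [1; 1; 0]
disp(actInfo.UpperLimit)        % expected: [5; 5; 1]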
The code of the updateActionInfo function is as follows:
function updateActionInfo(this)
    % this.ActionInfo.Elements = this.MaxAngle*[-1 1];
    this.ActionInfo = rlNumericSpec([3 1 1], 'LowerLimit',[1;1;0], 'UpperLimit',[5;5;1]);
    this.ActionInfo.Name = 'ASV Action';
    this.ActionInfo.Description = 'rho,sigma,theta';
end
But when I train the agent (PPO), the actions passed to the step function are always far greater or far less than the boundary values, for example action = [144, 152, -63] or action = [1608, -1463, -598].
I have attached my myEnvClass.m. Could someone please help me?
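Note: with a continuous Gaussian actor, a PPO agent can sample actions that are not necessarily clipped to the rlNumericSpec limits, so one commonly suggested workaround is to saturate the incoming action at the top of the step method before using it. A minimal sketch reusing the limits stored in this.ActionInfo (the rest of the dynamics is left as a placeholder):
function [Observation, Reward, IsDone, LoggedSignals] = step(this, Action)
    % Saturate the raw agent output to the declared action limits.
    Action = min(max(Action, this.ActionInfo.LowerLimit), ...
                 this.ActionInfo.UpperLimit);
    LoggedSignals = [];
    % ... existing environment dynamics using the saturated Action ...
end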

Answers (0)

Release: R2020b
