Specify custom reinforcement learning environment dynamics using functions
Use rlFunctionEnv to define a custom reinforcement learning
environment. You provide MATLAB® functions that define the step and reset behavior for the environment. This
object is useful when you want to customize your environment beyond the available predefined
environments.
obsInfo — Observation specification
rlNumericSpec object | array
StepFcn — Step behavior for the environment
Step behavior for the environment, specified as a function name, function handle, or handle to an anonymous function.
StepFcn is a function that you provide to describe how the
environment advances to the next state from a given action. When you use a function name
or function handle, this function must have two inputs and four outputs, as illustrated
by the following signature.
[Observation,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
To use additional input arguments beyond the required set, specify
StepFcn using an anonymous function handle.
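For instance, if your step function takes an extra parameter, you can bind that parameter with an anonymous function when you construct the environment. The following is a sketch; myStepFunction, myResetFunction, obsInfo, and actInfo are assumed to be defined in your workspace, and the extra sample-time argument ts is hypothetical.

```matlab
% Sketch: bind an extra argument (ts) to the step function.
ts = 0.02;  % hypothetical sample time
env = rlFunctionEnv(obsInfo,actInfo, ...
    @(action,loggedSignals) myStepFunction(action,loggedSignals,ts), ...
    @myResetFunction);
```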
The step function computes the values of the observation and reward for the given action in the environment. The required input and output arguments are as follows.
Action — Current action, which must match the dimensions and
data type specified in the action specification.
Observation — Returned observation, which must match the
dimensions and data types specified in the observation specification (obsInfo).
Reward — Reward for the current step, returned as a scalar.
IsDone — Logical value indicating whether to end the
simulation episode. The step function that you define can include logic to decide
whether to end the simulation based on the observation, reward, or any other values.
LoggedSignals — Any data that you want to pass from one step
to the next, specified as a structure.
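As an illustration (not the toolbox implementation), a minimal step function for a simple double-integrator system might look like the following sketch. The state carried between steps lives in LoggedSignals.State, and the sample time, reward, and termination condition shown here are assumptions.

```matlab
function [Observation,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)
% Sketch of a step function for a double integrator (illustrative dynamics).
ts = 0.1;                               % assumed sample time
state = LoggedSignals.State;            % [position; velocity] from the previous step
state = state + ts*[state(2); Action];  % Euler step of the dynamics
LoggedSignals.State = state;            % pass the state to the next step
Observation = state;                    % observation matching a [2 1] numeric spec
Reward = -abs(state(1));                % penalize distance from the origin
IsDone = abs(state(1)) > 10;            % end the episode if the mass drifts too far
end
```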
For an example showing multiple ways to define a step function, see Create MATLAB Environment Using Custom Functions.
ResetFcn — Reset behavior for the environment
Reset behavior for the environment, specified as a function name, function handle, or handle to an anonymous function.
The reset function that you provide must have no inputs and two outputs, as illustrated by the following signature.
[InitialObservation,LoggedSignals] = myResetFunction
To use input arguments with your reset function, specify
ResetFcn using an anonymous function handle.
The reset function sets the environment to an initial state and computes the initial values of the observation signals. For example, you can create a reset function that randomizes certain state values, such that each training episode begins from different initial conditions.
The InitialObservation output must match the dimensions and data
types specified in the observation specification.
To pass information from the reset condition into the first step, specify that
information in the reset function as the output structure LoggedSignals.
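As an illustration, a reset function that randomizes the initial position of a simple two-state system might look like the following sketch; the state layout and the randomization range are assumptions.

```matlab
function [InitialObservation,LoggedSignals] = myResetFunction
% Sketch of a reset function: randomize the initial position so each
% training episode starts from different initial conditions.
state = [0.1*(2*rand-1); 0];   % position in [-0.1, 0.1], zero velocity (assumed)
LoggedSignals.State = state;   % information passed to the first step
InitialObservation = state;
end
```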
For an example showing multiple ways to define a reset function, see Create MATLAB Environment Using Custom Functions.
LoggedSignals — Information to pass to next step
Information to pass to the next step, specified as a structure. When you create the
environment, whatever you define as the
LoggedSignals output of
ResetFcn initializes this property. When a step occurs, the
software populates this property with data to pass to the next step, as defined in
StepFcn.
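For instance, assuming env is an rlFunctionEnv environment whose functions store their state in a LoggedSignals structure field, you can inspect the property around a reset and a step (a sketch; the action value 10 assumes it is valid for the action specification):

```matlab
% Sketch: inspect LoggedSignals around a reset and a step.
InitialObs = reset(env);   % LoggedSignals now holds whatever ResetFcn returned
env.LoggedSignals
[NextObs,Reward,IsDone,Logged] = step(env,10);  % apply a valid action
env.LoggedSignals          % updated with the data StepFcn passed along
```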
getActionInfo — Obtain action data specifications from reinforcement learning environment or agent
getObservationInfo — Obtain observation data specifications from reinforcement learning environment or agent
train — Train reinforcement learning agents within a specified environment
sim — Simulate trained reinforcement learning agents within specified environment
validateEnvironment — Validate custom reinforcement learning environment
Create a reinforcement learning environment by supplying custom dynamic functions in MATLAB®. Using
rlFunctionEnv, you can create a MATLAB reinforcement learning environment from an observation specification, action specification, and step and
reset functions that you define.
For this example, create an environment that represents a system for balancing a cart on a pole. The observations from the environment are the cart position, cart velocity, pendulum angle, and pendulum angle derivative. (For additional details about this environment, see Create MATLAB Environment Using Custom Functions.) Create an observation specification for those signals.
oinfo = rlNumericSpec([4 1]);
oinfo.Name = 'CartPole States';
oinfo.Description = 'x, dx, theta, dtheta';
The environment has a discrete action space where the agent can apply one of two possible force values to the cart, –10 N or 10 N. Create the action specification for those actions.
ActionInfo = rlFiniteSetSpec([-10 10]);
ActionInfo.Name = 'CartPole Action';
Next, specify the custom step and
reset functions. For this example, use the supplied functions
myResetFunction.m and myStepFunction.m. For details about these functions and how they are constructed, see Create MATLAB Environment Using Custom Functions.
Construct the custom environment using the defined observation specification, action specification, and function names.
env = rlFunctionEnv(oinfo,ActionInfo,'myStepFunction','myResetFunction');
You can create agents for
env and train them within the environment as you would for any other reinforcement learning environment.
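Before training, you can also exercise the environment manually and check it with validateEnvironment. The following is a sketch; the action value 10 assumes the discrete action specification defined above.

```matlab
% Sketch: manually exercise and validate the custom environment.
InitialObs = reset(env)
[NextObs,Reward,IsDone,LoggedSignals] = step(env,10);
validateEnvironment(env)   % checks the environment against its specifications
```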
As an alternative to using function names, you can specify the functions as function handles. For more details and an example, see Create MATLAB Environment Using Custom Functions.