rlValueRepresentation
(Not recommended) Value function critic representation for reinforcement learning agents
Since R2020a
rlValueRepresentation is not recommended. Use rlValueFunction instead. For more information, see rlValueRepresentation is not recommended.
Description
This object implements a value function approximator to be used as a critic within a reinforcement learning agent. A value function maps an observation to a scalar value. The output represents the expected total long-term reward when the agent starts from the given observation and follows a given policy thereafter. Value function critics therefore need only observations (not actions) as inputs. After you create an rlValueRepresentation critic, use it to create an agent that relies on a value function critic, such as an rlACAgent, rlPGAgent, or rlPPOAgent. For an example of this workflow, see Create Actor and Critic Representations. For more information on creating representations, see Create Policies and Value Functions.
Creation
Syntax
Description
critic = rlValueRepresentation(net,observationInfo,'Observation',obsName) creates the value function based critic critic from the deep neural network net. This syntax sets the ObservationInfo property of critic to the input observationInfo. obsName must contain the names of the input layers of net.
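For example, the following is a minimal sketch of this syntax. The observation specification, network layers, and the input layer name 'state' are illustrative assumptions, not values prescribed by this page.

obsInfo = rlNumericSpec([4 1]);                         % 4-dimensional observation space
net = [imageInputLayer([4 1 1],'Normalization','none','Name','state')
       fullyConnectedLayer(8,'Name','fc')
       reluLayer('Name','relu')
       fullyConnectedLayer(1,'Name','value')];          % single scalar output (state value)
critic = rlValueRepresentation(net,obsInfo,'Observation',{'state'});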
critic = rlValueRepresentation(tab,observationInfo) creates the value function based critic critic with a discrete observation space, from the value table tab, which is an rlTable object containing a column array with as many elements as the number of possible observations. This syntax sets the ObservationInfo property of critic to the input observationInfo.
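For example, the following is a minimal sketch of this syntax, assuming a discrete observation space with four possible observations; the specification and initial table values are illustrative.

obsInfo = rlFiniteSetSpec([1 2 3 4]);   % four possible observations
vTable = rlTable(obsInfo);              % column array with one entry per observation
vTable.Table = [3; 1; 0; -2];           % optionally initialize the state values
critic = rlValueRepresentation(vTable,obsInfo);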
critic = rlValueRepresentation({basisFcn,W0},observationInfo) creates the value function based critic critic using a custom basis function as the underlying approximator. The first input argument is a two-element cell array in which the first element contains the handle basisFcn to a custom basis function, and the second element contains the initial weight vector W0. This syntax sets the ObservationInfo property of critic to the input observationInfo.
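For example, the following is a minimal sketch of this syntax; the observation specification, basis function, and initial weights are illustrative. The critic returns the scalar product of the learned weight vector and the feature vector returned by basisFcn.

obsInfo = rlNumericSpec([3 1]);
basisFcn = @(obs) [obs; obs(1)*obs(2)]; % maps a 3-element observation to 4 features
W0 = zeros(4,1);                        % one initial weight per feature
critic = rlValueRepresentation({basisFcn,W0},obsInfo);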
critic = rlValueRepresentation(___,options) creates the value function based critic critic using the additional option set options, which is an rlRepresentationOptions object. This syntax sets the Options property of critic to the options input argument. You can use this syntax with any of the previous input-argument combinations.
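For example, the following is a minimal sketch that combines this syntax with the custom basis function form; the option values and the other inputs are illustrative.

obsInfo = rlNumericSpec([3 1]);
basisFcn = @(obs) [obs; obs(1)*obs(2)];
W0 = zeros(4,1);
opts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlValueRepresentation({basisFcn,W0},obsInfo,opts);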
Input Arguments
Properties
Object Functions
rlACAgent | Actor-critic (AC) reinforcement learning agent
rlPGAgent | Policy gradient (PG) reinforcement learning agent
rlPPOAgent | Proximal policy optimization (PPO) reinforcement learning agent
getValue | Obtain estimated value from a critic given environment observations and actions
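For example, the following is a minimal sketch of querying a critic with getValue; the table-based critic and the queried observation are illustrative.

obsInfo = rlFiniteSetSpec([1 2 3 4]);
vTable = rlTable(obsInfo);
vTable.Table = [3; 1; 0; -2];
critic = rlValueRepresentation(vTable,obsInfo);
v = getValue(critic,{2})                % estimated value of observation 2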