Answered
Epsilon-greedy algorithm and environment reset do not work during DQN agent training
Hello, Here are some comments: 1. The reset function should not produce the same output. You should first double-check the reset...
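
A minimal sketch of a reset function that randomizes the initial state, assuming a custom MATLAB-function environment (the state layout and ranges are placeholders):

    function [initialObs, loggedSignals] = myResetFunction()
        % Randomize the initial state so consecutive episodes differ
        theta = -0.05 + 0.1*rand;              % small random initial angle
        loggedSignals.State = [0; 0; theta; 0];
        initialObs = loggedSignals.State;
    end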

5 years ago | 0

| accepted

Answered
Mix of static and dynamic actions for a Reinforcement Learning episode
Hello, I am not sure the approach you mention would work, since even if you constrain the constant action, the agent will still...

5 years ago | 0

Answered
Problems in using Reinforcement Learning Agent
Hello, I am assuming you have seen this example already? Seems similar. I don't see the script where you set up DDPG but there ...

5 years ago | 0

Answered
How to simulate saved agents?
Hello, Right-click on the 'Agents' folder from within MATLAB and add it to path (or use addpath). Then load Agent1.mat xpr = s...
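
A hedged sketch of that workflow; the variable name stored inside Agent1.mat is an assumption:

    addpath('Agents');                    % or right-click the folder and add to path
    load('Agent1.mat','saved_agent');     % variable name inside the MAT-file assumed
    simOpts = rlSimulationOptions('MaxSteps',500);
    xpr = sim(env,saved_agent,simOpts);   % 'env' must already exist in the workspace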

5 years ago | 0

Answered
Simulating environment while Training rlAgent
I can interpret your question in 3 ways, so I will put my thoughts here and hopefully they will be sufficient. 1) Depending on t...

5 years ago | 0

Answered
How do I specify multiple, heterogeneous actions for the rl.env.MATLABEnvironment of the Reinforcement Learning Toolbox or another way, if there is one?
Hello, I probably don't understand your objective but the two actions you mention above (distance and speed) are still scalars....
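
One way to combine the two scalars into a single continuous action channel, with placeholder limits:

    % [distance; speed] stacked into one 2-by-1 continuous action (limits assumed)
    actInfo = rlNumericSpec([2 1], ...
        'LowerLimit',[0; 0], ...
        'UpperLimit',[10; 5]);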

5 years ago | 1

| accepted

Answered
How can I create a simulation environment with reinforcement learning?
rlFunctionEnv can be used to create an environment where the dynamics are in a MATLAB function. You could also create an environ...
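
A minimal sketch; 'myStepFunction' and 'myResetFunction' are placeholder names for the user's own dynamics:

    obsInfo = rlNumericSpec([4 1]);        % observation spec (size assumed)
    actInfo = rlFiniteSetSpec([-1 1]);     % discrete action spec (assumed)
    env = rlFunctionEnv(obsInfo,actInfo,'myStepFunction','myResetFunction');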

5 years ago | 0

| accepted

Answered
Reinforcement Learning Grid World multi-figures
Hello, I wouldn't worry about the spikes as long as the average reward has converged. Could be the agent exploring something. ...

5 years ago | 0

| accepted

Answered
Reinforcement Learning Toolbox Example Ball Balancing
Hello, This example was created as part of a tutorial for IROS 2020. The example files are here, in folder #3. Hope this helps...

5 years ago | 1

| accepted

Answered
How can I use the different Target Smooth Factor in actor and critic network? (Reinforcement Learning Toolbox)
Hello, This is not currently possible, but I have let the development team know and they will look into it. Thanks for bringin...
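
For reference, a sketch of the single smoothing factor that both target networks currently share, assuming a DDPG agent (the value is an example):

    agentOpts = rlDDPGAgentOptions('TargetSmoothFactor',1e-3);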

5 years ago | 0

| accepted

Answered
Episode Q0 increases exponentially
Hello, Please take a look at this answer for some suggestions. Normalizing observations, rewards, and actions can also help avo...

5 years ago | 0

Answered
How is GAE calculated in Reinforcement Learning Toolbox (PPO)?
Hello, Thank you for catching this typo - it should be Gt = Dt+V. I have let the documentation team know.
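
For context, a sketch of the standard GAE quantities behind the corrected formula, in the usual notation (V is the critic's value estimate):

    \delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)               % one-step TD error
    D_t = \sum_{k=0}^{\infty} (\gamma \lambda)^k \delta_{t+k} % GAE advantage
    G_t = D_t + V(s_t)                                        % corrected return estimate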

5 years ago | 0

| accepted

Answered
Get observation of final episode of RL agent
Do you want to save the observations in the last time step of the final episode? Or all the observations shown in the final epis...

5 years ago | 1

| accepted

Answered
Problem using scalingLayer for shifting actor outputs to desired range
If you remove ',...' from the 'tanh' row, the error goes away. The way you have it now, you are adding the scaling layer in the s...
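
A sketch of the corrected ordering, with assumed Scale/Bias values that map tanh's [-1,1] output to [1,5]:

    actorLayers = [
        featureInputLayer(4,'Name','obs')                 % observation size assumed
        fullyConnectedLayer(1,'Name','fc')
        tanhLayer('Name','tanh')                          % no trailing ',...' here
        scalingLayer('Name','scale','Scale',2,'Bias',3)   % y = 2*x + 3
        ];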

5 years ago | 0

| accepted

Answered
Prediction of NOx emissions by using the cylinderpressure curves of an internal combustion engine
This sounds like a supervised learning problem with time dependencies in which case I would recommend working with LSTMs and Dee...
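
A hedged sequence-to-one regression sketch with Deep Learning Toolbox (layer sizes and training options are placeholders):

    layers = [
        sequenceInputLayer(numFeatures)       % one feature per pressure channel
        lstmLayer(64,'OutputMode','last')     % summarize the pressure curve
        fullyConnectedLayer(1)                % predicted NOx value
        regressionLayer
        ];
    opts = trainingOptions('adam','MaxEpochs',50);
    net = trainNetwork(XTrain,YTrain,layers,opts);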

5 years ago | 0

Answered
RL toolbox train on continuous simulation with delay between episodes
Hi Joe, I believe the setup you mention may be possible but it will require some work. Essentially, you need to set up training ...

5 years ago | 0

Answered
Hybrid reinforcement learning and traditional control environment
Hello, You can put the RL Agent block in an enabled subsystem and use the desired condition to indicate when to use RL and when...

5 years ago | 0

Answered
Reinforcement Learning : MaxSumWordLength is 65535 bits and a minimum word length of 65536 bits is necessary so that this sum or difference can be computed with no loss of precision - ( 'rl.simulink.blocks.AgentWrapper' )
It's a bit hard to find the cause of the error without a reproduction model, but based on the error you are seeing, I would chec...

5 years ago | 1

Answered
RL Toolbox: DQN epsilon greedy exploration with epsilon=1 does not act randomly
Hello, Maybe I misread the question, but you are saying "when starting the Simulation and watching the output of the episodes.....

5 years ago | 0

| accepted

Answered
How to get the history action value of reinforcement learning agent
Not exactly sure what you mean. During training the RL algorithms are already doing inference. You can use getAction and getValu...
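
A hedged sketch of querying a trained agent manually; 'agent' and the observation values are assumed:

    obs = {[0.1; 0; -0.2; 0]};        % observation wrapped in a cell array
    act = getAction(agent,obs);       % action the current policy would pick
    critic = getCritic(agent);        % e.g., for a DQN agent
    q = getValue(critic,obs);         % Q-values for that observation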

5 years ago | 0

| accepted

Answered
How to train a Reinforcement Learning agent from 0.02s of simulation
I believe you can put the RL Agent block in an enabled subsystem and set the enable time to be 0.02 seconds. Hope that helps

5 years ago | 0

| accepted

Answered
Why my RL training abruptly stopping before the total EpisodeCount?
Please take a look at this doc page. While you are selecting "episodecount" as the termination criterion, you don't set the stop...
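
A sketch of aligning the stop criterion with the episode budget (the numbers are placeholders):

    trainOpts = rlTrainingOptions( ...
        'MaxEpisodes',1000, ...
        'StopTrainingCriteria','EpisodeCount', ...
        'StopTrainingValue',1000);    % without this, training may stop earlier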

5 years ago | 0

| accepted

Answered
Saving simulation data during training process of RL agents
Have you tried logging the data with Simulation Data Inspector? Make sure to pick only signals you actually need since depending...

5 years ago | 0

Answered
How to use RNN+DDPG together?
The ability to create LSTM policies with DDPG is available starting in R2021a. Hope that helps

5 years ago | 1

| accepted

Answered
Customized Action Selection in RL DQN
Hello, I believe this is not possible yet. A potential workaround (although not state-dependent) would be to emulate a pdf by p...

5 years ago | 0

Answered
How to save and use the pre-trained DQN agent in the reinforcement learning tool box
Hello, Take a look at this example, and specifically the code snippet below: if doTraining % Train the agent. tr...
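
The pattern that snippet comes from, completed as a sketch (the file name is a placeholder):

    doTraining = false;
    if doTraining
        trainingStats = train(agent,env,trainOpts);  % train from scratch
        save('preTrainedAgent.mat','agent');         % keep the trained agent
    else
        load('preTrainedAgent.mat','agent');         % reuse the saved agent
    end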

5 years ago | 1

Answered
Save data in Deep RL in the Simulink environment for all episodes
Hello, You can always select the signals you want to log and view them later in Simulation Data Inspector. Same goes for the re...

5 years ago | 1

| accepted

Answered
How to compute the gradient of deep Actor network in DRL (with regard to all of its parameters)?
In the link you provide above, the gradients are calculated with the "gradient" function that uses automatic differentiation. So...
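
A hedged sketch of that pattern with dlfeval/dlgradient; the objective here is a placeholder, and dlObs is assumed to be a formatted dlarray observation batch:

    function [loss,grad] = modelGradients(net,dlObs)
        dlAct = forward(net,dlObs);              % actor forward pass
        loss = -mean(dlAct,'all');               % placeholder scalar objective
        grad = dlgradient(loss,net.Learnables);  % gradient w.r.t. all parameters
    end
    % invoked inside the automatic-differentiation trace:
    % [loss,grad] = dlfeval(@modelGradients,net,dlObs);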

5 years ago | 0

Answered
MPC: step response starts with unwanted negative swing when using previewing
It appears that the optimization thinks that moving in the opposite direction first is "optimal". You can change that by adding ...

5 years ago | 0

| accepted

Answered
RL: Continuous action space, but within a desired range
Hello, There are two ways to enforce this: 1) Using the upper and lower limits in rlNumericSpec when you are creating the acti...
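
A sketch of option (1), with example limits:

    actInfo = rlNumericSpec([1 1],'LowerLimit',-2,'UpperLimit',2);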

5 years ago | 0
