Deep Q-Network Rewards incorporation?
2 views (last 30 days)
Show older comments
Zonghao zou
on 19 Sep 2020
Answered: Sabiya Hussain
on 29 Aug 2022
I have read through most of the current documentations on the Deep Q-Network in Matlab, but it is still not very clear to me how to construct a Deep Q-Network in my case.
I previously wrote my own code for implementing a simple Q-learning, for which, I constructed a Q-matrix with corresponding states and actions. I am now trying to explore how to do the same with Deep Q-Network.
The overall goal is to trying to work out a best policy for an object to move from location A to location B (assuming it is in 2-D)
I have a specific function that has all the necessary physical relationship which will return the corresponding rewards given the current state and action. (lets' say it is called the function F).
I see on the documentations: https://www.mathworks.com/help/reinforcement-learning/ref/rldqnagent.html#d122e15363, to create an agent I must create an observation and an action sets.
In my case, since I can return the specific rewards per action given current state, what should I put down as my observation? (How should I incorporate my function F into the agent?)
Also, in the documentations, I don't see anywhere it takes rewards or calculate rewards for certain actions.
Could somone help me please?
Thanks
0 Comments
Accepted Answer
Emmanouil Tzorakoleftherakis
on 24 Sep 2020
Hello,
If you have a look at this page, it shows where the reward is incorporated in a custom MATLAB environment. As you can see, the reward is included in the 'step' method, which plays the same role as your F function, so you do not have to do anything different thatn what you are doing already - you just need to create an environment object.
0 Comments
More Answers (3)
Madhav Thakker
on 23 Sep 2020
Edited: Madhav Thakker
on 23 Sep 2020
Hi Zonghao,
I understand you want to construct a Deep Q-Network. The observationInfo tells you the behaviour for your observations. In your case, you want to move an object on a grid. The observations can be the position of the object on the grid. So, your observationInfo will be rlFiniteSetSpec.
obsInfo = rlNumericSpec([2 1])
This creates observation of dimension [2,1]. If required, you can also specify upper and lower limits for your observations.
Hopet this helps.
Sabiya Hussain
on 29 Aug 2022
Hello there! I'm working on a project based on Q-learning i really need some help regarding a Markov decision process matlab program it is an example of Recycling robot i need your help
0 Comments
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!