Deep Q-Network Rewards incorporation?
Zonghao zou
on 19 Sep 2020
Answered: Sabiya Hussain
on 29 Aug 2022
I have read through most of the current documentation on the Deep Q-Network in MATLAB, but it is still not very clear to me how to construct a Deep Q-Network in my case.
I previously wrote my own code implementing simple Q-learning, for which I constructed a Q-matrix with the corresponding states and actions. I am now trying to explore how to do the same with a Deep Q-Network.
The overall goal is to work out the best policy for an object to move from location A to location B (assuming it is in 2-D).
I have a specific function that encodes all the necessary physical relationships and returns the corresponding reward given the current state and action (let's say it is called the function F).
I see in the documentation, https://www.mathworks.com/help/reinforcement-learning/ref/rldqnagent.html#d122e15363, that to create an agent I must create an observation set and an action set.
In my case, since I can return the specific reward per action given the current state, what should I put down as my observation? (How should I incorporate my function F into the agent?)
Also, in the documentation, I don't see anywhere that rewards are taken in or calculated for certain actions.
Could someone help me please?
Thanks
0 Comments
Accepted Answer
Emmanouil Tzorakoleftherakis
on 24 Sep 2020
Hello,
If you have a look at this page, it shows where the reward is incorporated in a custom MATLAB environment. As you can see, the reward is included in the 'step' method, which plays the same role as your F function, so you do not have to do anything different than what you are doing already - you just need to create an environment object.
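As a minimal sketch of what that environment object could look like (the function and variable names here, such as myRewardF, are my own placeholders standing in for your F, not anything from the toolbox), the reward lives inside the step function you hand to rlFunctionEnv:

```matlab
% Observation: the object's [x; y] position. Action: a discrete move.
obsInfo = rlNumericSpec([2 1]);
actInfo = rlFiniteSetSpec(1:4);   % e.g. 1=up, 2=down, 3=left, 4=right

% Create the environment from step/reset function handles.
env = rlFunctionEnv(obsInfo, actInfo, @myStepFcn, @myResetFcn);

function [nextObs, reward, isDone, loggedSignals] = myStepFcn(action, loggedSignals)
    % LoggedSignals carries the state between steps.
    state = loggedSignals.State;
    % Your function F: returns the next state and the reward for
    % taking 'action' in 'state' (placeholder name).
    [nextState, reward] = myRewardF(state, action);
    loggedSignals.State = nextState;
    nextObs = nextState;
    % Episode ends when the object is close enough to location B
    % (goal position and tolerance are assumptions for this sketch).
    goal = [10; 10];
    isDone = norm(nextState - goal) < 0.1;
end

function [initialObs, loggedSignals] = myResetFcn()
    loggedSignals.State = [0; 0];   % start each episode at location A
    initialObs = loggedSignals.State;
end
```

Once the environment exists, training with train(agent, env, ...) calls your step function automatically, so the reward from F flows into the DQN update without any extra wiring on your part.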
0 Comments
More Answers (3)
Madhav Thakker
on 23 Sep 2020
Edited: Madhav Thakker
on 23 Sep 2020
Hi Zonghao,
I understand you want to construct a Deep Q-Network. The observationInfo describes the observations your agent receives. In your case, you want to move an object on a grid, so the observations can be the position of the object on the grid, and your observationInfo can be an rlNumericSpec:
obsInfo = rlNumericSpec([2 1])
This creates an observation of dimension [2 1]. If required, you can also specify upper and lower limits for your observations.
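To round out the picture (the limits and action values below are illustrative assumptions for a bounded 2-D grid, not anything specific to your problem), the observation spec pairs with an action spec to create a default DQN agent:

```matlab
% 2-D position bounded to a 10x10 region (limits are an assumption).
obsInfo = rlNumericSpec([2 1], ...
    'LowerLimit', [0; 0], ...
    'UpperLimit', [10; 10]);

% Four discrete moves, e.g. up/down/left/right.
actInfo = rlFiniteSetSpec(1:4);

% A default DQN agent; the toolbox builds a default critic network
% from the observation and action specs.
agent = rlDQNAgent(obsInfo, actInfo);
```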
Hope this helps.
Sabiya Hussain
on 29 Aug 2022
Hello there! I'm working on a project based on Q-learning, and I really need some help with a MATLAB program for a Markov decision process. It is an example of a recycling robot; I'd appreciate your help.
0 Comments
