Tic-Tac-Toe Reinforcement Learning against adversary agent
The adversary's policy is simplistic in iteration 1: it picks at random any cell not yet marked on the Tic-Tac-Toe (TTT) board. The RL agent learns against this policy in iteration 1 by performing Q-Learning, and then hands off its learned deterministic policy to the adversary, so that in iteration 2 the adversary plays 'better' than it did under its earlier random policy. This process repeats until the policy that the RL agent hands off to the adversary no longer changes between iterations, which occurs at iteration 7. The program was written as a demonstration for the Artificial Intelligence and Machine Learning eMDP program 2020-21 at IIM Kozhikode, with assistance from Afsal Najeeb <afsaln.india@gmail.com>.
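The loop described above can be sketched roughly as follows. This is a minimal Python illustration, not the submission's MATLAB code. The board encoding, the reward values (+1 win, -1 loss, 0 draw), the hyperparameters, the coin flip that decides who opens, and all function names (`train`, `greedy_policy`, `self_improve`, and so on) are assumptions made for the example. Because learning is stochastic, this sketch is not expected to reproduce the reported stop at iteration 7 and may simply run until `max_iters`.

```python
import random
from collections import defaultdict

# Board: tuple of 9 cells; 0 = empty, 1 = RL agent, 2 = adversary (assumed encoding).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(b):
    """Return 1 or 2 if that mark has three in a row, else 0."""
    for i, j, k in LINES:
        if b[i] != 0 and b[i] == b[j] == b[k]:
            return b[i]
    return 0

def legal(b):
    """Cells not yet marked."""
    return [i for i in range(9) if b[i] == 0]

def place(b, cell, mark):
    return b[:cell] + (mark,) + b[cell + 1:]

def swap(b):
    """View the board from the other player's side (marks 1 and 2 exchanged)."""
    return tuple({0: 0, 1: 2, 2: 1}[m] for m in b)

def adversary_move(b, policy, rng):
    """Follow the handed-off policy; fall back to a random legal cell
    when there is no policy yet or the state was never seen."""
    cell = policy.get(swap(b))
    return cell if cell is not None else rng.choice(legal(b))

def train(policy, episodes, rng, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning for the RL agent against a fixed adversary policy."""
    Q = defaultdict(float)
    for _ in range(episodes):
        b = (0,) * 9
        if rng.random() < 0.5:  # assumption: the adversary opens half the games
            b = place(b, adversary_move(b, policy, rng), 2)
        while True:
            moves = legal(b)
            if rng.random() < eps:  # epsilon-greedy exploration
                a = rng.choice(moves)
            else:
                a = max(moves, key=lambda m: Q[(b, m)])
            nb = place(b, a, 1)
            if winner(nb) == 1:
                reward, done = 1.0, True
            else:
                reward, done = 0.0, not legal(nb)
            if not done:  # adversary replies before the agent sees the next state
                nb = place(nb, adversary_move(nb, policy, rng), 2)
                if winner(nb) == 2:
                    reward, done = -1.0, True
                elif not legal(nb):
                    done = True
            future = 0.0 if done else gamma * max(Q[(nb, m)] for m in legal(nb))
            Q[(b, a)] += alpha * (reward + future - Q[(b, a)])
            if done:
                break
            b = nb
    return Q

def greedy_policy(Q):
    """Deterministic policy: the highest-valued action in each visited state."""
    best = {}
    for (b, a), q in Q.items():
        if b not in best or q > Q.get((b, best[b]), 0.0):
            best[b] = a
    return best

def self_improve(max_iters=10, episodes=20000, seed=0):
    """Learn, hand the policy to the adversary, repeat until it stops changing."""
    rng = random.Random(seed)
    policy = {}  # empty policy: the adversary plays at random in iteration 1
    for it in range(1, max_iters + 1):
        new_policy = greedy_policy(train(policy, episodes, rng))
        if new_policy == policy:
            return policy, it
        policy = new_policy
    return policy, max_iters
```

Calling `self_improve()` returns the last handed-off policy and the iteration at which the loop stopped. Exact equality of two successive policies is a strict stopping test; comparing them only on states visited in both iterations would be a looser alternative.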
Cite As
Shahid Abdulla (2025). Tic-Tac-Toe Reinforcement Learning against adversary agent (https://uk.mathworks.com/matlabcentral/fileexchange/84672-tic-tac-toe-reinforcement-learning-against-adversary-agent), MATLAB Central File Exchange. Retrieved .
Platform Compatibility
Windows macOS Linux
| Version | Published | Release Notes |
|---|---|---|
| 1.0.0 | | |
