Tic-Tac-Toe Reinforcement Learning against adversary agent

Reinforcement learning (RL) agent that learns Tic-Tac-Toe (TTT) against an adversary agent; in each round the adversary plays the optimal policy that the RL agent learnt in the previous round.
131 Downloads
Updated 23 Dec 2020

The adversary's policy is simplistic at iteration 1: it picks at random any cell not yet marked on the TTT board. The RL agent learns against this policy in iteration 1 (by performing Q-Learning) and then hands off its learnt deterministic policy to the adversary, so that the adversary plays 'better' in iteration 2 than it did with its earlier random policy. This process repeats until the policy handed off to the adversary no longer changes between iterations, which occurs at iteration 7. The program was written as a demonstration for the Artificial Intelligence and Machine Learning eMDP program 2020-21 at IIM Kozhikode, with assistance from Afsal Najeeb <afsaln.india@gmail.com>.
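The hand-off loop described above can be illustrated with a minimal Python sketch. This is not the author's MATLAB program; every name here (`train`, `greedy_policy`, `iterate_handoff`), the hyperparameters, and the reward scheme (+1 win, -1 loss, 0 draw) are assumptions made for illustration. It also assumes the adversary opens half of the games, so that the learner's Q-table covers the positions the adversary later looks up by sign-flipping the board, and that the adversary falls back to a random move in any position the handed-off policy has not seen.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 1 or -1 if that player has three in a row, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0

def legal_moves(board):
    return [i for i, v in enumerate(board) if v == 0]

def place(board, cell, player):
    return board[:cell] + (player,) + board[cell + 1:]

def finished(board):
    return winner(board) != 0 or not legal_moves(board)

def adversary_move(board, policy, rng):
    """Play the handed-off policy on the sign-flipped board, else at random."""
    flipped = tuple(-v for v in board)
    if policy is not None and flipped in policy:
        return policy[flipped]
    return rng.choice(legal_moves(board))

def train(policy_adv, episodes, rng, alpha=0.5, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning for the learner (mark 1) against a fixed adversary (-1)."""
    q = {}  # state -> {action: value}
    for _ in range(episodes):
        board = (0,) * 9
        if rng.random() < 0.5:  # assumption: adversary opens half the games
            board = place(board, adversary_move(board, policy_adv, rng), -1)
        while True:
            acts = q.setdefault(board, {m: 0.0 for m in legal_moves(board)})
            if rng.random() < epsilon:
                action = rng.choice(list(acts))      # explore
            else:
                action = max(acts, key=acts.get)     # exploit
            nxt = place(board, action, 1)
            if not finished(nxt):                    # adversary replies
                nxt = place(nxt, adversary_move(nxt, policy_adv, rng), -1)
            reward = float(winner(nxt))              # +1 win, -1 loss, 0 otherwise
            future = 0.0
            if not finished(nxt):
                nxt_acts = q.setdefault(nxt, {m: 0.0 for m in legal_moves(nxt)})
                future = max(nxt_acts.values())
            acts[action] += alpha * (reward + gamma * future - acts[action])
            if finished(nxt):
                break
            board = nxt
    return q

def greedy_policy(q):
    """Freeze a Q-table into a deterministic state -> cell mapping."""
    return {state: max(acts, key=acts.get) for state, acts in q.items()}

def iterate_handoff(max_iterations=10, episodes=20000, seed=0):
    """Retrain against the previous greedy policy until the hand-off stops changing."""
    rng = random.Random(seed)
    policy_adv = None                                # iteration 1: random adversary
    for iteration in range(1, max_iterations + 1):
        new_policy = greedy_policy(train(policy_adv, episodes, rng))
        if new_policy == policy_adv:
            return iteration, new_policy
        policy_adv = new_policy
    return max_iterations, policy_adv
```

Calling `iterate_handoff()` returns the iteration at which the loop stopped together with the last policy. The stopping iteration in this sketch depends on the seed, the episode count, and the other assumed settings, so it should not be expected to match the iteration reported in the description; with sampled training, exact policy equality may not be reached before `max_iterations`.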

Cite As

Shahid Abdulla (2025). Tic-Tac-Toe Reinforcement Learning against adversary agent (https://uk.mathworks.com/matlabcentral/fileexchange/84672-tic-tac-toe-reinforcement-learning-against-adversary-agent), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2020b
Compatible with any release
Platform Compatibility
Windows macOS Linux

Version Published Release Notes
1.0.0