This is a fun project inspired by Richard Sutton's talk, "Tutorial: Introduction to Reinforcement Learning with Function Approximation".
To play the game, run `python3 learn_mdp.py`.
Here the user plays the role of a reinforcement learning agent and tries to find the optimal policy, the one that collects the most reward. The environment has two states, A and B, and the user can take two actions, 1 and 2. Depending on the action taken in a given state, the user receives a positive or negative reward as feedback.
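To make the setup concrete, here is a minimal sketch of a two-state, two-action environment. It is not the code in `learn_mdp.py`: the names `REWARDS` and `step`, the reward values, and the uniform random transitions are all assumptions made for illustration. The reward signs were chosen only so that they agree with the optimal policy listed in this README.

```python
import random

STATES = ("A", "B")
ACTIONS = (1, 2)

# Hypothetical reward table. The real values live in learn_mdp.py and may differ.
REWARDS = {
    ("A", 1): -10, ("A", 2): 10,
    ("B", 1): 10, ("B", 2): -10,
}


def step(state, action, rng=random):
    """Return (reward, next_state) for one interaction with the environment.

    Assumption: the next state is drawn uniformly at random. The original
    game may use different dynamics.
    """
    if state not in STATES or action not in ACTIONS:
        raise ValueError(f"unknown state/action: {state!r}, {action!r}")
    reward = REWARDS[(state, action)]
    next_state = rng.choice(STATES)
    return reward, next_state


if __name__ == "__main__":
    reward, next_state = step("A", 2)
    print(f"reward={reward}, next_state={next_state}")
```

An agent only ever sees the reward and the next state returned by `step`, never the table itself, which is what makes finding the policy a learning problem.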
If you decide to play the game, the optimal policy is as follows:
State | Action |
---|---|
A | 2 |
B | 1 |
This repository is intended for educational use. It can help explain the following reinforcement learning concepts:
- Markov decision processes (MDPs)
- The exploration vs. exploitation dilemma
- Introduction to RL
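The exploration vs. exploitation dilemma can be illustrated with an epsilon-greedy agent. The sketch below is an assumption-laden example, not the logic of `learn_mdp.py`: `toy_reward`, `learn`, and `epsilon_greedy` are hypothetical names, the rewards of +1 and -1 are made up to match the policy table in this README, and the update only estimates immediate rewards (it is simpler than full Q-learning, which would also account for future rewards).

```python
import random


def epsilon_greedy(q_values, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore
    return max(actions, key=lambda a: q_values[(state, a)])  # exploit


def learn(reward_fn, states, actions, episodes=2000, epsilon=0.1, alpha=0.1, seed=0):
    """Estimate the immediate reward of every (state, action) pair by sampling."""
    rng = random.Random(seed)
    q_values = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        state = rng.choice(states)
        action = epsilon_greedy(q_values, state, actions, epsilon, rng)
        reward = reward_fn(state, action)
        # Move the estimate a fraction alpha toward the observed reward.
        q_values[(state, action)] += alpha * (reward - q_values[(state, action)])
    return q_values


def toy_reward(state, action):
    """Hypothetical rewards: +1 for the action in the README's policy table, else -1."""
    good_action = {"A": 2, "B": 1}
    return 1.0 if good_action[state] == action else -1.0


states, actions = ("A", "B"), (1, 2)
q = learn(toy_reward, states, actions)
policy = {s: max(actions, key=lambda a: q[(s, a)]) for s in states}
print(policy)
```

With `epsilon=0` the agent never explores and simply repeats whichever action looks best from its earliest samples; with `epsilon=1` it never uses what it has learned. Values in between trade one off against the other.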
Feel free to improve this project. Pull requests are welcome.