Solving OpenAI Gym Cartpole with Q-learning

A Q-learning agent learns to balance a pole attached to a cart

January 28, 2017 - 1 minute read
machine-learning reinforcement-learning

The goal of the CartPole task is to move the cart so that the pole attached to it stays balanced and does not fall over. It is one of the most widely used environments in OpenAI Gym. My implementation of Q-learning solved CartPole in 1598 training steps. I am happy that it worked even though I haven’t tuned the hyperparameters much :)

Demo

Training

The plot below shows the reward averaged over 100 episodes during on-policy training. The x-axis is the training episode and the y-axis is the reward.
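For readers who want to reproduce this kind of curve, here is a minimal sketch of a 100-episode running average. It is not taken from my repo; the function name `rolling_average` and the `window` parameter are made up for illustration, and it assumes you already have a plain list with one total reward per episode.

```python
from collections import deque


def rolling_average(rewards, window=100):
    """Return, for each episode, the mean of the last `window` episode rewards.

    Early episodes are averaged over however many rewards exist so far.
    """
    recent = deque(maxlen=window)  # holds at most `window` latest rewards
    total = 0.0                    # running sum of the values in `recent`
    averages = []
    for reward in rewards:
        if len(recent) == window:
            total -= recent[0]     # oldest value is about to be evicted
        recent.append(reward)
        total += reward
        averages.append(total / len(recent))
    return averages


if __name__ == "__main__":
    # Toy data, not real training results.
    print(rolling_average([1, 2, 3, 4], window=2))
```

The 100-episode window is the same one Gym used to define "solved" for the classic CartPole-v0: an average reward of at least 195 over 100 consecutive episodes.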

rewards

Code

Algorithm

See my post, Learning RL by Coding

  • OpenAI Gym Submission Page
  • Github repo
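
As a quick reminder of what the Q-learning update itself looks like, here is a generic tabular sketch. It is not the code from my repo: the helper names (`make_q_table`, `choose_action`, `q_update`) and the default `alpha` and `gamma` values are assumptions chosen for illustration. CartPole observations are continuous, so a table like this would need the state to be discretized first (or replaced with a function approximator); that step is left out here.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    """Q-table mapping a hashable state to a list of action values (all 0.0)."""
    return defaultdict(lambda: [0.0] * n_actions)


def choose_action(q, state, n_actions, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q[state]
    return max(range(n_actions), key=lambda a: values[a])


def q_update(q, state, action, reward, next_state, done,
             alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    if done:
        target = reward  # no bootstrapping from a terminal state
    else:
        target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


if __name__ == "__main__":
    # Hypothetical two-action problem with string states, just to show usage.
    q = make_q_table(n_actions=2)
    q_update(q, "s0", 0, reward=1.0, next_state="s1", done=False)
    print(q["s0"], choose_action(q, "s0", 2, epsilon=0.0))
```

Taking the max over next-state values in the target, regardless of which action the agent actually takes next, is what distinguishes Q-learning from SARSA.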