
Reinforcement Learning


bwomack

Presentation Transcript


  1. Reinforcement Learning: Developing a self-learning snake game using Reinforcement Learning and pygame.

  2. About me
     • Student, pursuing my Bachelor’s in Software Engineering
     • Freelance Software Developer
     • A FOSS enthusiast, currently contributing to coala
     • Pythonista who loves to develop automation projects and Machine Learning projects, and occasionally writes blog posts about Python.
     GitHub: https://github.com/satwikkansal
     LinkedIn: https://linkedin.com/in/satwikkansal
     Website: http://www.satwikkansal.xyz
     Blog: https://satwikkansal.wordpress.com

  3. Do you remember these?

  4. Contents
     • Quick intro to game development: common concepts
     • Designing the gameplay
     • Events and control, implementing game logic
     • Some RL concepts: Agent, State, Reward, Policy, MDP and a few more
     • Q-Learning to the rescue
     • Other Reinforcement Learning techniques
     • Self-driving car in action
     • Current applications and future scope of RL
     • Available open source frameworks and libraries
     The code for the workshop is available at https://github.com/satwikkansal/snakepy

  5. Some Game Development concepts
     • Coordinates: The screen is a 2D grid plane with (0,0) in the top left
     • Colors: RGB and alpha values
     • Drawing: Plotting pixels, Surface object, blitting
     • Rendering: Animation, frame/refresh rate
     • The game loop: handle events, update the game state, render, repeat
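A minimal sketch of the game loop and the top-left coordinate convention, written in plain Python so it runs without a window. The function and variable names here are hypothetical and not taken from the workshop code. In pygame the three steps would typically map to `pygame.event.get()`, your own state update, and `pygame.display.flip()`, with `pygame.time.Clock().tick(fps)` capping the frame rate.

```python
def run_game_loop(handle_events, update, render, max_frames):
    """Run the classic loop: handle input, update state, draw, repeat."""
    frames = 0
    while frames < max_frames:
        if not handle_events():  # a False return means "quit"
            break
        update()
        render()
        frames += 1
    return frames


# Toy game state: one object that falls down the screen.
state = {"pos": (0, 0), "drawn": []}


def handle_events():
    return True  # no quit event in this toy run


def update():
    x, y = state["pos"]
    # (0, 0) is the top-left corner, so moving down increases y.
    state["pos"] = (x, y + 1)


def render():
    # Stand-in for drawing: record what would be blitted this frame.
    state["drawn"].append(state["pos"])


frames = run_game_loop(handle_events, update, render, max_frames=3)
print(frames, state["pos"])
```

The loop is capped with `max_frames` only so the sketch terminates; a real game usually runs until a quit event arrives.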

  6. Designing the Gameplay
     Objects: A snake, apples, walls.
     The snake eats an apple and grows 1 unit longer.
     The snake dies when it hits a wall or runs over itself.
     Objective: Eat as many apples as possible without dying.
     • What happens when the snake gets killed?
     • How to start the game?
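The rules above can be sketched as a single step function. This is an illustration under assumptions, not the workshop's implementation: the board size, the `step` name, and treating the board edges as the walls are all choices made here for the example.

```python
from collections import deque

GRID_W, GRID_H = 10, 10  # assumed board size in cells


def step(snake, direction, apple):
    """Advance the snake by one cell.

    snake: deque of (x, y) cells, head first; modified in place.
    direction: (dx, dy) unit step.
    Returns (alive, ate_apple).
    """
    head_x, head_y = snake[0]
    dx, dy = direction
    new_head = (head_x + dx, head_y + dy)

    hit_wall = not (0 <= new_head[0] < GRID_W and 0 <= new_head[1] < GRID_H)
    # Simplification: the current tail cell still counts as occupied.
    hit_self = new_head in snake
    if hit_wall or hit_self:
        return False, False

    snake.appendleft(new_head)
    ate_apple = new_head == apple
    if not ate_apple:
        snake.pop()  # no apple: drop the tail so the length stays the same
    return True, ate_apple


snake = deque([(1, 0), (0, 0)])
print(step(snake, (1, 0), apple=(2, 0)), list(snake))
```

Keeping the snake as a head-first deque makes both growth (skip the `pop`) and movement (push head, pop tail) constant-time operations.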

  7. Code Implementation: Drawing, Displaying and Moving the game objects.

  8. User Interaction & Game Logic
     • Arrow keys to move the head.
     • Do we want our snake to keep moving?
     • Detecting overlaps and collisions of the snake's head with other objects: boundaries, apples and its own body.
     • Scoring

  9. Code Implementation: Adding the controls and the score to make a fully functional snake game.

  10. Okay, let’s make our dumb computer control the snake.

  11. Code Implementation: Wait, let’s add some intelligence to our agent. (Provide vision to the CPU, i.e. the game rules.)
      Next Section: Or better, let’s make the CPU discover knowledge. (Make our snake learn from experience.)
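One way to "provide vision" through hand-written rules is a greedy agent: among the moves that do not kill the snake, pick the one that ends closest to the apple. The sketch below is an assumption about what such rules could look like; the function name and the Manhattan-distance heuristic are not from the slides.

```python
DIRECTIONS = {"UP": (0, -1), "DOWN": (0, 1), "LEFT": (-1, 0), "RIGHT": (1, 0)}


def rule_based_action(head, body, apple, width, height):
    """Return the safe move that gets closest to the apple, or None."""
    best_name, best_dist = None, None
    for name, (dx, dy) in DIRECTIONS.items():
        nxt = (head[0] + dx, head[1] + dy)
        inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
        if not inside or nxt in body:
            continue  # this move would hit a wall or the snake itself
        dist = abs(nxt[0] - apple[0]) + abs(nxt[1] - apple[1])  # Manhattan
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name  # None means every move is fatal


print(rule_based_action((0, 0), set(), (3, 0), width=5, height=5))
```

A greedy rule like this can still trap the snake inside its own body, which is part of the motivation for letting the agent learn from experience instead.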

  12. Time to introduce Reinforcement Learning!

  13. A few things to know
      • State, History and Episode
      • Action
      • Reward
      • Policy, value function, and model
      • Environment
      • Agent
      • Markov states and MDP
      Long story short: The environment is everything that surrounds the agent. A state represents the situation of the agent at a particular time in the environment. The agent performs an action to transition from one state to another and may receive a reward in return. The policy is the strategy for choosing an action given a state, and the agent tries to choose a policy that maximizes the expected cumulative reward.
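To make "cumulative reward" concrete, here is a small sketch that sums the rewards of one episode. The discount factor `gamma` is an assumption added for the example (a common convention that weights later rewards less); the slide itself only speaks of cumulative reward.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma**2 * r_2 + ... for one episode."""
    total = 0.0
    # Work backwards so each step is: total = r_t + gamma * (return after t).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


episode_rewards = [1.0, 0.0, 2.0]  # hypothetical rewards from three steps
print(discounted_return(episode_rewards, gamma=0.5))  # 1 + 0 + 0.25 * 2 = 1.5
```

With `gamma=1.0` this is the plain sum of rewards; smaller values make the agent care more about rewards that arrive soon.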

  14. Implementation: Refactoring the game’s code

  15. Q-learning to the rescue!
      • Popular, simple, model-free RL technique (the environment’s model is not required)
      • Can find an optimal action-selection policy for any finite MDP, given enough exploration.
      • Learns the action-value function
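A minimal tabular sketch of Q-learning with epsilon-greedy action selection. The class name, the hyperparameter defaults, and the string-valued states are assumptions for illustration; the workshop repository may structure this differently.

```python
import random
from collections import defaultdict

ACTIONS = ["UP", "DOWN", "LEFT", "RIGHT"]


class QLearner:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, action) -> value, default 0.0
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # probability of a random (exploratory) action
        self.rng = random.Random(seed)

    def choose_action(self, state):
        """Epsilon-greedy: explore with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, done):
        """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        if done:
            best_next = 0.0  # no future value after a terminal state
        else:
            best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


learner = QLearner(alpha=0.5, gamma=0.5, epsilon=0.0)
learner.update("s", "RIGHT", reward=1.0, next_state="t", done=True)
print(learner.q[("s", "RIGHT")], learner.choose_action("s"))
```

No model of the environment appears anywhere in the update: the learner only needs the observed transition `(state, action, reward, next_state)`, which is what "model-free" refers to.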

  16. Code Implementation: Using Q-learning to choose actions for the agent.

  17. Our agent in action
      Note: Currently our rules don’t penalize the snake for running over itself.

  18. Possible Improvements to our agent
      • Optimizing the state space
      • Adding time-based rewards
      • Balancing the exploration vs. exploitation tradeoff
      • Optimizing the hyperparameters using techniques like Grid Search and Genetic Algorithms
      • Using state-of-the-art RL techniques
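One common way to handle the exploration vs. exploitation tradeoff is to start with mostly random actions and decay the exploration rate over episodes. The schedule below is a sketch; the function name and all three constants are assumptions, and they are exactly the kind of hyperparameters a grid search could tune.

```python
def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.99):
    """Exponentially decay the exploration rate, never going below `end`."""
    return max(end, start * decay ** episode)


# Exploration rate at a few points during training.
for episode in (0, 100, 1000):
    print(episode, decayed_epsilon(episode))
```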

  19. Other interesting techniques
      SARSA: An on-policy relative of Q-learning. Its update uses the action the agent actually takes next under its current policy (for example, a random action with a predefined probability) instead of the best next action, so the update skips the maximization over actions, which can help when the number of actions is high.
      Deep Q-Networks: Combine RL with deep neural networks such as CNNs. The network approximates the non-linear action-value function and is trained using experience replay.
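The difference between the Q-learning and SARSA updates comes down to the target each one bootstraps from. The sketch below shows both targets side by side; the function names and the dictionary-based Q-table are assumptions for illustration.

```python
def q_learning_target(q, reward, next_state, actions, gamma):
    """Off-policy: bootstrap from the best action available in next_state."""
    return reward + gamma * max(q.get((next_state, a), 0.0) for a in actions)


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy: bootstrap from the action the agent will actually take."""
    return reward + gamma * q.get((next_state, next_action), 0.0)


q = {("t", "LEFT"): 1.0, ("t", "RIGHT"): 3.0}  # hypothetical Q-values
print(q_learning_target(q, 0.0, "t", ["LEFT", "RIGHT"], gamma=0.5))  # 1.5
print(sarsa_target(q, 0.0, "t", "LEFT", gamma=0.5))                  # 0.5
```

When the policy happens to pick the greedy action, both targets coincide; they differ only on exploratory steps.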

  20. The self-driving car simulation design
      State:
      • Car on left, right, ahead?
      • Traffic light green or red?
      • Next waypoint (from GPS)
      Actions:
      • Steer left, steer right
      • Accelerate, brake
      Rewards:
      • Violating the traffic laws
      • Hitting the obstacles
      • Reaching the destination
      • Time taken to reach the destination (any thoughts on this?)
      Code sample available at: https://github.com/satwikkansal/smartcab
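The design above could be encoded as a hashable state plus a reward function. This is a sketch under assumptions: the field names and every reward magnitude below are made up for illustration and are not taken from the smartcab repository. A small per-step cost is one possible answer to the "time taken" question.

```python
from typing import NamedTuple


class CabState(NamedTuple):
    car_left: bool
    car_right: bool
    car_ahead: bool
    light_green: bool
    next_waypoint: str  # "left", "right" or "forward"


def reward(violated_law, hit_obstacle, reached_destination):
    """Combine the reward signals from the slide (magnitudes are assumed)."""
    total = -0.1  # small cost per time step, so faster trips score higher
    if violated_law:
        total -= 5.0
    if hit_obstacle:
        total -= 10.0
    if reached_destination:
        total += 20.0
    return total


state = CabState(False, False, True, light_green=True, next_waypoint="forward")
q_table = {(state, "brake"): 0.0}  # named tuples are hashable, so they can key a Q-table
print(reward(violated_law=True, hit_obstacle=False, reached_destination=False))
```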

  21. Applications of Reinforcement Learning
      • Playing games like chess (reward is not instantaneous, delayed feedback)
      • Managing portfolios and finances (the reward here is the money)
      • Robotics (humanoid robots)
      • Manufacturing and inventory management
      • General AI agents: agents that can perform multiple tasks with a single algorithm. For example, an agent playing all the Atari games.

  22. Open source frameworks and libraries for RL
      • OpenAI Gym: A toolkit for developing and comparing reinforcement learning algorithms.
      • OpenAI Universe: A software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
      • DeepMind Lab: A customisable 3D platform for agent-based AI research.

  23. Some nice links
      YouTube lectures and tutorials:
      • UCL course on RL by D. Silver - http://bit.ly/RL-UCL
      • Sentdex pygame tutorial - http://bit.ly/sentdex-pygame
      Python code samples:
      • Reinforcement Learning, an introduction - http://bit.ly/RL-intro-Python
      Online demo:
      • ConvNetJS - http://bit.ly/convnetjs
