360 likes | 422 Views
This presentation on Reinforcement Learning will help you understand the basics of reinforcement learning. You will learn the concepts like reward and states. You will also understand about actions in reinforcement learning. Finally, you'll look at a demo on Tic Tac Toe using Reinforcement Learning in Python.<br><br>Post Graduate Program in AI and Machine Learning:<br>Ranked #1 AI and Machine Learning course by TechGig<br>Fast track your career with our comprehensive Post Graduate Program in AI and Machine Learning, in partnership with Purdue University and in collaboration with IBM. This AI and machine learning certification program will prepare you for one of the worldu2019s most exciting technology frontiers. This Post Graduate Program in AI and Machine Learning covers statistics, Python, machine learning, deep learning networks, NLP, and reinforcement learning. You will build and deploy deep learning models on the cloud using AWS SageMaker, work on voice assistance devices, build Alexa skills, and gain access to GPU-enabled labs.<br><br>Key Features:<br>u2705 Purdue Alumni Association Membership<br>u2705 Industry-recognized IBM certificates for IBM courses<br>u2705 Enrollment in Simplilearnu2019s JobAssist<br>u2705 25 hands-on Projects on GPU enabled Labs<br>u2705 450 hours of Applied learning<br>u2705 Capstone Project in 3 Domains<br>u2705 Purdue Post Graduate Program Certification<br>u2705 Masterclasses from Purdue<br>u2705Get noticed by the top hiring companies<br><br>ud83dudc49Learn more at: http://bit.ly/38uySSS
E N D
What’s in it for you? • Why Reinforcement Learning? • What is Reinforcement Learning? • Supervised vs Unsupervised vs Reinforcement Learning • Important Terms • Reinforcement Learning Example • Markov’s Decision Process
Why Reinforcement Learning? Training a Machine Learning model requires a lot of data which might not always be available to us
Why Reinforcement Learning? Training a Machine Learning model requires a lot of data which might not always be available to us. Further, the data provided might not be reliable
Why Reinforcement Learning? Learning from a small subset of actions will not help expand the vast realm of solutions that may work for a particular problem Learn: To Walk
Why Reinforcement Learning? Learning from a small subset of actions will not help expand the vast realm of solutions that may work for a particular problem Learn: To Walk
Why Reinforcement Learning? This is going slow the growth that technology is capable of. Machines need to learn to perform actions by themselves and not just learn off humans Objective: Climb Mountain
What is Reinforcement Learning?
What is Reinforcement Learning? Reinforcement learning is a sub-branch of Machine Learning that trains a model to return an optimum solution for a problem by taking a sequence of decisions by itself. Consider a robot learning to go from one place to another
What is Reinforcement Learning? The robot is given a scenario and must arrive at a solution by itself. The robot can take different paths to reach the destination
What is Reinforcement Learning? It will know the best path by the time taken on each path. It might even come up a unique solution all by itself
What is Reinforcement Learning? It will know the best path by the time taken on each path. It might even come up a unique solution all by itself
Supervised vs Unsupervised vs Reinforcement Learning Data provided is labeled data, with output values specified The machine learns from its environment using rewards and errors Data provided is unlabeled data, the outputs are not specified, machine makes its own prediction Used to solve Association and clustering problems Used to solve Regression and classification problems Used to solve Reward based problems Reinforcement Learning Unsupervised Learning Supervised Learning Unlabeled data is used No predefined data is used Labeled data is used External Supervision No supervision No supervision Solves problems by understanding patterns and discovering output Solves problems by mapping labeled input to known output Follows Trail and Error problem solving approach
Important Terms in Reinforcement Learning Agent Agent is the model that is being trained via reinforcement learning
Important Terms in Reinforcement Learning Environment The training situation that the model must optimize to is called its environment Output
Important Terms in Reinforcement Learning Action All possible steps that can be taken by the model
Important Terms in Reinforcement Learning State The current position/ condition returned by the model
Important Terms in Reinforcement Learning Reward To help the model move in the right direction, it is rewarded/ points are given to it to appraise some action
Important Terms in Reinforcement Learning Policy Policy determines how an agent will behave at anytime. It acts as a mapping between Action and present State
Reinforcement Learning Example
Reinforcement Learning Example Consider a dog that we must house train Agent Environment
Reinforcement Learning Example We can get the dog to perform various actions by offering incentives such as dog biscuits as reward Action = Fetching
Reinforcement Learning Example We can get the dog to perform various actions by offering incentives such as dog biscuits as reward Reward Action = Fetching
Reinforcement Learning Example The dog will follow a policy to maximize its reward and hence, will follow every command and might even learn a new action, like begging, by itself Begging Fetching Handshake
Reinforcement Learning Example The dog will also want to run around and play and explore its environment. This quality of a model is called Exploration
Reinforcement Learning Example The dog will also want to run around and play and explore its environment. This quality of a model is called Exploration Exploring new parts of house
Reinforcement Learning Example The tendency of the dog to maximize rewards is called Exploitation. There is always a tradeoff between exploration and exploitation as exploration actions may lead to lesser rewards Action: Climbing on the sofa Reward: None
Markov’s Decision Process Markov’s Decision Process is a Reinforcement Learning policy used to map a current state to an action where the agent continuously interacts with the environment to produce new solutions and receive rewards Environment Agent Action Reward State
Markov’s Decision Process In the diagram shown, we need to find the shortest path between node A and D. Each path has a reward associated with it and the path with maximum reward is what we want to choose The nodes; A, B, C, D; denote the nodes. To travel from node to node( A to B) is an action. Reward is the cost at each path and policy is each path taken A 15 B 5 10 0 -20 C D 25
Markov’s Decision Process In the diagram shown, we need to find the shortest path between node A and D. Each path has a reward associated with it and the path with maximum reward is what we want to choose The process will maximize the output based on the reward at each step and will traverse the path with the highest reward. This process does not explore, but maximizes reward A B 5 C D 25
Join us to learn more! simplilearn.com UNITED STATES Simplilearn Solutions Pvt. Limited 201 Spear Street, Suite 1100 San Francisco, CA 94105 Phone: (415) 741-3319 INDIA Simplilearn Solutions Pvt. Limited #53/1C, 24th Main, 2nd Sector HSR Layout, Bangalore 560102 Phone: +91 8069999471 UNITED STATES Simplilearn Solutions Pvt. Limited 801 Corporate Center Drive, Suite 138 Raleigh, NC 27607 Phone: (919) 205-5565