Robotica
Presentation Transcript


  1. Lecture 10: Robotica

  2. Lecture Outline • Neural networks • Classical conditioning • AHC with NNs • Genetic Algorithms • Classifier Systems • Fuzzy learning • Case-based learning • Memory-based learning • Explanation-based learning

  3. Q Learning Algorithm • Q(x,a) ← Q(x,a) + β·(r + γ·E(y) − Q(x,a)) • x is the state, a is the action • β is the learning rate • r is the reward • γ is the discount factor, γ ∈ (0,1) • E(y) is the utility of the state y, computed as E(y) = max_a Q(y,a) over all actions a • Guaranteed to converge to the optimal policy, given infinite trials
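
A minimal tabular sketch of the update above. The environment interface (reset/step/actions), the ε-greedy exploration scheme, and all parameter values are illustrative assumptions, not part of the slide.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=1000, b=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning; env is assumed to expose .actions, .reset(), .step()."""
    Q = defaultdict(float)                        # Q[(x, a)] defaults to 0
    for _ in range(episodes):
        x = env.reset()
        done = False
        while not done:
            # epsilon-greedy: explore occasionally, otherwise act greedily
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[(x, act)])
            y, r, done = env.step(a)              # assumed step interface
            E_y = max(Q[(y, ap)] for ap in env.actions)   # E(y) = max_a' Q(y,a')
            # Q(x,a) <- Q(x,a) + b * (r + gamma * E(y) - Q(x,a))
            Q[(x, a)] += b * (r + gamma * E_y - Q[(x, a)])
            x = y
    return Q
```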

  4. Supervised Learning • Supervised learning requires the user to give the exact solution to the robot in the form of the error direction and magnitude. • The user must know the exact desired behavior for each situation. • Supervised learning involves training, which can be very slow; the user must supervise the system with numerous examples.

  5. Neural Networks • NNs are the most widely used example of supervised learning • There are numerous types of NNs (feed-forward, recurrent) and NN learning algorithms • Hebbian learning: increase synaptic strength along pathways associated with stimulus and correct response • Perceptron learning: delta rule or back-propagation

  6. Perceptron Learning • Algorithm: (shown as a figure in the original slides; a sketch follows below)
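
Since the algorithm appears only as a figure in the original slides, here is a minimal stand-in sketch of the delta rule named on the previous slide, with logical AND as a toy training set:

```python
import numpy as np

def train_perceptron(X, t, lr=0.1, epochs=100):
    """Single threshold unit trained with the delta rule: w += lr*(t - y)*x."""
    w, bias = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1 if np.dot(w, x) + bias > 0 else 0   # threshold activation
            error = target - y
            w += lr * error * x                       # move weights toward target
            bias += lr * error
    return w, bias

# Toy example: learning logical AND (linearly separable, so this converges)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])
w, bias = train_perceptron(X, t)
```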

  7. NNs are RL • In all NNs, the goal is to minimize the error between the network output and the desired output • This is achieved by adjusting the weights on the network connections • Note: NNs are a form of reinforcement learning • NNs perform supervised RL with immediate error feedback

  8. Classical Conditioning • Classical conditioning comes from psychology (Pavlov 1927) • Assumes that unconditioned stimuli (e.g., food) cause unconditioned responses (e.g., salivation); US => UR • A conditioned stimulus (CS), repeatedly paired with the US, over time comes to elicit the response on its own (CS => CR) • E.g., CS (bell ringing) => CR (salivation)

  9. Classical Conditioning • Instead of hand-coding stimulus-response rules, conditioning can be used to form the associations automatically • Can be encoded in NNs
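
One standard way to model the automatic formation of such associations is the Rescorla-Wagner rule; the lecture does not name a specific model, so this sketch is an illustrative choice. V is the associative strength of the CS, and lam is the maximum strength the US supports:

```python
def condition(trials=20, alpha_beta=0.3, lam=1.0):
    """Rescorla-Wagner sketch: V grows toward lam with each CS-US pairing."""
    V = 0.0
    for t in range(trials):
        V += alpha_beta * (lam - V)   # error-driven update, like a delta rule
        print(f"trial {t + 1}: V = {V:.3f}")
    return V
```

Note the error-driven form of the update, (lam − V), which is what makes conditioning straightforward to encode as a NN weight update.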

  10. Connectionist AHC • Connectionist AHC (Adaptive Heuristic Critic) • Robuter learned a set of gain multipliers for exploration (Gachet et al.)

  11. Associative Learning • Learning new behaviors by associating sensors and actions into rules • E.g., six-legged walking (Edinburgh U.) • Whisker sensors first, IR and light sensors later • 3 actions: left, right, ahead • User provided feedback (shaping) • Learned avoidance, pushing, wall following, light seeking

  12. Two-Layer Perceptron Architecture

  13. More NN Examples • Some domains and tasks lend themselves very well to supervised NN learning • The best example is robot motion planning for articulation/manipulation • The answer to any given situation is well known, and can be trained • E.g., NNs are widely used for learning inverse kinematics
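
A sketch of the inverse-kinematics case for a planar 2-link arm, using scikit-learn's MLPRegressor. The link lengths, network size, and the elbow-up restriction (which makes the inverse mapping unique) are illustrative assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

L1, L2 = 1.0, 1.0                                  # assumed link lengths
rng = np.random.default_rng(0)
n = 5000
theta = np.column_stack([rng.uniform(-np.pi, np.pi, n),       # theta1
                         rng.uniform(0.1, np.pi - 0.1, n)])   # theta2, elbow-up only

# Forward kinematics provides the supervised training pairs "for free"
x = L1 * np.cos(theta[:, 0]) + L2 * np.cos(theta[:, 0] + theta[:, 1])
y = L1 * np.sin(theta[:, 0]) + L2 * np.sin(theta[:, 0] + theta[:, 1])
X = np.column_stack([x, y])

# Train the network to invert the mapping: end-effector position -> joint angles
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(X, theta)
print(net.predict([[1.2, 0.8]]))                   # approximate joint angles
```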

  14. Evolutionary Methods • Genetic/evolutionary approaches are based on the evolutionary search metaphor • The states/situations and actions/behaviors are represented as "genes" • Different combinations are tried by various "individuals" in "populations" • Individuals with the highest "fitness" perform the best and are kept as survivors

  15. Evolutionary Methods • The others are discarded; this is the selection process. • The survivors' "genes" are mutated, crossed-over, and new individuals are so formed, which are then tested and scored. • In effect, the evolutionary process is searching through the space of solutions to find the one with the highest fitness.

  16. Summary of Evolution • Evolutionary methods solve search and optimization problems using • a fitness function • operators • They represent the solution as a genetic encoding (string) • They select the 'best' individuals for reproduction and apply • crossover, mutation • They operate on populations
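
A minimal sketch of the loop just summarized, over bit strings; the one-max fitness function (count the 1s) and all parameter values are toy stand-ins:

```python
import random

def fitness(genes):
    return sum(genes)                       # toy "one-max" fitness

def evolve(pop_size=20, length=16, generations=50, p_mut=0.05):
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]     # selection: keep the fittest half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, length)             # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if random.random() < p_mut else g
                     for g in child]                      # bit-flip mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)
```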

  17. Genetic Algorithms

  18. Levels of Application • Genetic methods can be applied at different levels: • 1) for tuning parameters (such as gains in a control system)‏ • 2) for developing controllers (policies) for individual robots • 3) for developing group strategies for multi-robot systems (by testing groups as populations)

  19. GAs v. GPs • When applied to strings of genes, the approaches are classified as genetic algorithms (GA) • When applied to pieces of executable programs, the approaches are classified as genetic programming (GP) • GP operates at a higher level of abstraction than GA

  20. Classifier Systems • Use GAs to learn rule sets • ALECSYS - Autono-mouse • Learn behaviors and coordination

  21. Evolving Structure & Behavior • Evolving morphology & behavior (Sims 1994) using a physics-based model • Fitness in simulation through competition (e.g., walking and swimming)

  22. Co-evolving Brains and Bodies • Extension to the physical world (Pollack)

  23. Fuzzy Logic • Fuzzy control produces actions using a set of fuzzy rules based on fuzzy logic • This involves: • fuzzifying: mapping sensor readings into a set of fuzzy inputs • fuzzy rule base: a set of IF-THEN rules • fuzzy inference: maps fuzzy sets onto other fuzzy sets using membership functions • defuzzifying: mapping a set of fuzzy outputs onto a set of crisp output commands
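
A minimal sketch of the four steps for one input (obstacle distance, in meters) and one output (speed); the membership functions, the two rules, and the output centroids are illustrative assumptions:

```python
def mu_near(d):                      # fuzzify: distance -> degree of "near"
    return max(0.0, min(1.0, (2.0 - d) / 2.0))

def mu_far(d):                       # fuzzify: distance -> degree of "far"
    return max(0.0, min(1.0, d / 2.0))

def fuzzy_speed(distance):
    # Rule base + inference: each rule fires to the degree its antecedent holds
    w_slow = mu_near(distance)       # IF near THEN slow
    w_fast = mu_far(distance)        # IF far THEN fast
    SLOW, FAST = 0.1, 1.0            # assumed centroids of the output sets
    # Defuzzify by weighted average (a simple centroid method)
    return (w_slow * SLOW + w_fast * FAST) / (w_slow + w_fast)

print(fuzzy_speed(0.5))              # close obstacle -> low speed
```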

  24. Fuzzy Control • Fuzzy logic allows for specifying behaviors as fuzzy rules • Such behaviors can be smoothly blended together (e.g., Flakey, DAMN) • Fuzzy rules can be learned

  25. Memory-Based Learning • Memory-based learning (MBL) involves storing the exact parameters of a situation/action. • This type of learning is important for systems with a large number of parameters, where blind search through parameter space would take prohibitively long. • MBL takes a lot of space for data storage.

  26. Memory-Based Learning • This is a trade-off against computation time; MBL assumes that it is better to use space to remember than time to compute. • Content-addressable memory is used to approximate the output for a new input • This is a form of interpolation • The approach uses regression (the basic intuition is weighted averaging)
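
A sketch of MBL as distance-weighted averaging over stored instances; the Gaussian kernel and its width are illustrative choices:

```python
import numpy as np

class MemoryBasedLearner:
    """Store every (input, output) pair verbatim; interpolate at query time."""

    def __init__(self, width=1.0):
        self.X, self.Y, self.width = [], [], width

    def store(self, x, y):                        # learning = remembering
        self.X.append(np.asarray(x, dtype=float))
        self.Y.append(np.asarray(y, dtype=float))

    def predict(self, q):                         # weighted-average regression
        q = np.asarray(q, dtype=float)
        d2 = np.array([np.sum((x - q) ** 2) for x in self.X])
        w = np.exp(-d2 / (2 * self.width ** 2)) + 1e-12   # nearer memories weigh more
        return np.average(np.stack(self.Y), axis=0, weights=w)

m = MemoryBasedLearner()
m.store([0.0], [0.0])
m.store([1.0], [2.0])
print(m.predict([0.5]))   # interpolates between the two stored outputs
```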

  27. Symbolic Learning • Learning new rules given a set of predefined rules • by inference • by other logic techniques • Example: induction on decision trees
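
A compact ID3-style sketch of induction on decision trees, following the bullet above; the dictionary encoding of examples and the toy data are illustrative assumptions:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def build_tree(examples, attributes):
    """examples: list of (attribute_dict, label); returns a nested dict or leaf label."""
    labels = [lab for _, lab in examples]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]       # leaf: majority label
    def gain(a):                                          # information gain of attribute a
        rem = 0.0
        for v in set(ex[a] for ex, _ in examples):
            subset = [lab for ex, lab in examples if ex[a] == v]
            rem += len(subset) / len(examples) * entropy(subset)
        return entropy(labels) - rem
    best = max(attributes, key=gain)                      # split on the best attribute
    rest = [a for a in attributes if a != best]
    return {best: {v: build_tree([(ex, lab) for ex, lab in examples if ex[best] == v], rest)
                   for v in set(ex[best] for ex, _ in examples)}}

# Toy usage: learns {'obstacle': {'near': 'stop', 'far': 'go'}}
data = [({"obstacle": "near", "light": "on"}, "stop"),
        ({"obstacle": "far", "light": "on"}, "go"),
        ({"obstacle": "far", "light": "off"}, "go")]
print(build_tree(data, ["obstacle", "light"]))
```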

  28. Example: Robo-SOAR • Logical reasoning system • Production system based on implications or rules • Chunking • Conflict resolution mechanism when different rules apply • Continuous conversion of deliberative info to efficient representations

  29. Multistrategy Learning • Learning methods can be combined • Combining NNs and RL is common • use NNs to generalize states • then apply RL in a reduced state space • Combining NNs and MBL • use NNs to interpolate between instances • examples: ping pong, juggling, pole balance

  30. Multistrategy Learning • Combining NNs and genetic algorithms is also useful • example: learning to walk (next slide) • Neuro-fuzzy is a popular approach • example: USC AFV

  31. Example: 6-legged walking • Each leg modeled as an oscillator • Phasing of legs controlled by a NN • Weights of NN learnt using a GA
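
A sketch of the leg-as-oscillator idea: six sinusoidal oscillators whose phase offsets coordinate the gait. Here the phases are hand-set to an alternating tripod, standing in for the values the slide's NN (tuned by a GA) would produce:

```python
import numpy as np

TRIPOD = np.array([0.0, np.pi, 0.0, np.pi, 0.0, np.pi])   # assumed tripod phasing

def leg_angles(t, freq=1.0, amplitude=0.3, phases=TRIPOD):
    """Each leg swings sinusoidally; the phase vector coordinates the legs."""
    return amplitude * np.sin(2 * np.pi * freq * t + phases)

for t in np.linspace(0.0, 1.0, 5):
    print(np.round(leg_angles(t), 2))   # two groups of three legs in antiphase
```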

  32. Example: AVATAR • USC Robotics Labs

  33. State of the Art • There are many learning techniques • Which to use depends on the application • Robotics is a particularly challenging domain for learning • very few successful examples • active area of research

  34. Implemented Systems • Supervised learning has been used to learn controllers for complex, many DOF arms, as well as for navigating specific paths and performing articulated skills such as juggling and pole-balancing. • Memory-based learning has been used to learn controllers for very similar manipulator tasks as above.

  35. Implemented Systems • Reinforcement learning has been used to learn controllers for individual navigating robots, groups of foraging robots, map-learning and maze-learning robots. • Evolutionary learning has been used to learn controllers for individual navigating robots, maze-solving robots, herding robots, and foraging robots.
