Motor Adaptation as a Process of Reoptimization
Jun Izawa, Tushar Rane, Opher Donchin, Reza Shadmehr
Motor Adaptation • Adaptation: a process in which the nervous system learns to predict and cancel the effects of a novel environment, returning movements to near-baseline (unperturbed) conditions. • Task definition: move your hand from A to B while a force field pushes the hand in the direction perpendicular to the movement (a sketch of such a field follows below).
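A minimal sketch of a velocity-dependent "curl" field of the kind used in such reaching experiments; the curl structure and the 13 N·s/m gain are illustrative assumptions, not necessarily the exact field used in this study.

```python
import numpy as np

# Minimal sketch of a velocity-dependent (curl) force field.
# The gain b (N*s/m) and the curl structure are illustrative assumptions.
def curl_field_force(hand_velocity, b=13.0):
    """Force perpendicular to the hand velocity: F = B @ v."""
    B = np.array([[0.0,  b],
                  [-b, 0.0]])      # curl matrix couples x and y velocity
    return B @ hand_velocity

# Example: moving straight ahead at 0.5 m/s produces a purely lateral force.
print(curl_field_force(np.array([0.0, 0.5])))   # -> [6.5, 0.0]
```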
Reaching Task • Without any disturbance, movements stay close to a straight line ('baseline conditions'). • Error: the difference between the observed trajectory on a given trial and the average behavior in the baseline condition. (schematic: points A and B, with the perturbation direction perpendicular to the A-to-B line)
Optimal Movements? • Are the baseline movements somehow the optimal movements in all conditions? • It has been shown that people adapt to novel dynamics by reaching along the newly defined optimal path for that condition. • The purpose of our movements is: • to acquire rewarding states (such as reaching the end point accurately) • at a minimum cost.
More on Optimality • Each environment has its own cost and reward structure. • The trajectory or feedback response that was optimal in one environment is unlikely to remain optimal in the new environment. • With the help of recent advances in optimal control theory, theoretical predictions can be made about what the adapted trajectory should look like in each field.
Framework • In this framework, the problem is to maximize performance. To do so: • 1) Build a model of the novel environment so one can accurately predict the sensory consequences of motor commands. • 2) Use this internal model to find the best movement plan. (schematic: motor commands τ acting on the arm state q)
Stochastic Behaviours • Stochastic behavior of an environment produces uncertainty in the internal model. • A controller that maximizes performance takes this uncertainty into account as it generates movement plans. • Example: when we are uncertain about a cup's mass, lifting it and drinking from it will tend to be slower, particularly as it reaches our mouth.
Aim • In the force-field task, introduce uncertainty by making the environment stochastic. • To maximize performance (reach the target in time), the theory takes this uncertainty into account and reoptimizes the reach plan. • Question: does adaptation proceed by • returning trajectories toward a baseline, or • a process of reoptimization?
Experiment Setup • Volunteers sat on a chair in front of a robotic arm and held its handle. • Simple visual feedback. • A single movement direction (90°, straight away from the body along a line perpendicular to the frontal plane), along the midline of the subject's body.
Experiment #1 • Subjects (n = 28) trained for 3 consecutive days, practicing reaching in a single direction. • Rewarded (via a target explosion) if they completed their movement in 450 ± 50 ms. • Timing feedback: a blue-colored target indicated a slower-than-desired movement. • After completion of the movement, the robot pulled the hand back to the starting position. (schematic: start position and 9 mm target)
Experiment #1 Groups • The perturbation forces had a non-zero mean. • Groups 3 and 4 experienced an additional block of 50 movements, with zero variance, at the end of the third day for testing purposes.
Experiment #2 • A field that had a zero mean but a non-zero variance. • Same movement direction. • Subjects (n = 18) were instructed to complete their reach within 600 ± 50 ms. • A score indicating the number of successful trials was displayed on the screen. • To encourage performance, we paid the subjects based on their score. (schematic: start position and 18 mm target)
Experiment #2 Trial Sets • 8 blocks of 150 trials in a single day. • Zero mean.
Experiment #3 • A reaching task through a via-point. • Targets exploded at the completion of the movement if the timing was successful. • Otherwise, at the completion of the movement, the subject was given timing feedback: arrows for the via-point and color for the final target. • No timing feedback during the movement. (schematic: start, 9 mm via-point target, 9 mm final target; via-point timing 400 ± 50 ms, total movement time 1.0 s)
Experiment #3 Trial Sets • 8 blocks of 150 trials in a single day. • Zero mean.
Modeling and Simulations • Stochastic optimal feedback control (OFC) was used to model reaching. • Three components: an optimal controller that issues motor commands, an internal model that predicts their sensory consequences, and the plant/environment that produces the actual sensory consequences, to which the controller reacts. • Noise is signal-dependent, with an SD that grows with the size of the motor commands (sketched below). (block diagram: optimal controller → motor commands → plant/environment → sensory consequences; internal model → predictions)
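A minimal sketch of signal-dependent motor noise, assuming the noise SD is simply proportional to the command magnitude; the scaling constant c is an illustrative choice, not a fitted value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal sketch of signal-dependent motor noise: the noise SD grows in
# proportion to the size of the motor command. The scaling constant c is
# an illustrative assumption.
def noisy_motor_command(u, c=0.2):
    """Return the command corrupted by noise whose SD is c * |u| per element."""
    u = np.asarray(u, dtype=float)
    return u + c * np.abs(u) * rng.standard_normal(u.shape)

# Larger commands carry proportionally larger variability.
print(noisy_motor_command([1.0, 10.0]))
```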
Novelty in the Sense of Derivation • An extension of Todorov's method. • Novel only in the sense that it considers the problem of optimal control in the context of uncertainty about the internal model. • Model uncertainty is treated as a dual to the problem of control with signal-dependent noise.
Mathematical Model • A deterministic dynamic-system model: x_{t+1} = A x_t + B u_t. • A stochastic model: x_{t+1} = (A + V) x_t + B u_t, where V is a Gaussian random variable with mean zero and variance Q_v. • One can see that the uncertainty in parameter A acts as state-dependent noise.
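To see why uncertainty in A behaves like state-dependent noise, here is a minimal sketch under illustrative assumptions (the matrices A and B and the variance Q_v below are not the paper's values): the perturbation term V x_t scales with the state, so the variability it injects grows with the size of x_t.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal sketch: uncertainty in the parameter A acts as state-dependent noise,
# because the perturbation term V @ x scales with the state itself.
# A, B, and the variance Q_v are illustrative assumptions.
A = np.array([[1.0, 0.01],
              [0.0, 1.0]])
B = np.array([[0.0], [0.01]])
Q_v = 0.05                               # variance of each element of V

def stochastic_step(x, u):
    """One step of x_{t+1} = (A + V) x_t + B u_t, with V ~ N(0, Q_v) elementwise."""
    V = np.sqrt(Q_v) * rng.standard_normal(A.shape)
    return (A + V) @ x + (B @ u).ravel()

# The injected variability grows with the size of the state.
for x0 in (np.array([0.01, 0.0]), np.array([1.0, 0.0])):
    samples = np.array([stochastic_step(x0, np.zeros(1)) for _ in range(2000)])
    print("state norm", np.linalg.norm(x0), "-> next-state SD", samples.std(axis=0))
```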
Mathematical Model • Gaussian random variables with mean 0 and variance 1 represent the additive and multiplicative state variability and the measurement noise. • C_i are the scaling matrices of the multiplicative (signal-dependent) noise.
Mathematical Model • x_t: the actual state of the system, which is not available to the controller. • x̂_t: the state estimate, obtained via a Kalman filter. • The optimal control policy is a feedback law acting on the estimate, of the form u_t = -L_t x̂_t. • Optimal control provides closed-form solutions only for linear dynamical systems, so the arm was modeled as a point mass in Cartesian coordinates.
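A minimal sketch of the point-mass control structure described above, using a plain finite-horizon LQR solved by a backward Riccati recursion; the Kalman filter, signal-dependent noise, and model-uncertainty terms of the full theory are omitted, and all numerical values are illustrative assumptions.

```python
import numpy as np

# Point mass in discrete time, quadratic cost on endpoint error and motor
# commands, finite-horizon LQR via a backward Riccati recursion.
dt, m, N = 0.01, 1.0, 45                 # time step (s), mass (kg), horizon (steps)
A = np.array([[1.0, dt],                 # state = [position, velocity]
              [0.0, 1.0]])
B = np.array([[0.0],
              [dt / m]])                 # force -> velocity change

R = 1e-6 * np.eye(1)                     # motor cost per step (illustrative)
Q_final = np.diag([1.0, 0.1])            # penalize endpoint error and final speed
target = np.array([0.1, 0.0])            # reach 10 cm and stop

# Backward Riccati recursion for the feedback gains L_t (u_t = -L_t (x_t - target)).
S = Q_final.copy()
L = [None] * N
for t in reversed(range(N)):
    L[t] = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)
    S = A.T @ S @ (A - B @ L[t])

# Forward simulation of the closed loop (noise-free for clarity).
x = np.zeros(2)
for t in range(N):
    u = -L[t] @ (x - target)
    x = A @ x + B @ u
print("final position (m), velocity (m/s):", x)
```

In the full model, the same kind of gains act on the Kalman-filter estimate x̂_t rather than on the true state, and the gains themselves change once signal-dependent noise and model uncertainty enter the optimization.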
Simulation Results • Previous work suggests that with training, subjects acquire a model accuracy of about 0.8 (i.e., the internal model predicts about 80% of the field). • If the learner's model predicted less than 100% of the effects of the field, then the combined effect of the two systems is more complicated than in the fully observed case. (simulation figure: fully observable force, D = 1; baseline; actual vs. modeled trajectories)
Simulation Results • The forces produced by the optimal controller were compared to the forces that must be produced if the mass is to move along a straight-line, minimum-jerk trajectory. • The optimal controller produced less total force by overcompensating early in the movement, when speeds were small. • How should planning change when one is uncertain of the strength of the force field? • Overcompensation in the controller was a function of its uncertainty. • As field variance increased, the optimum plan no longer had a bell-shaped speed profile.
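For reference, a minimal sketch of the straight-line, minimum-jerk comparison trajectory; the 10-15-6 polynomial is the standard minimum-jerk form, while the amplitude and duration below are illustrative rather than the experimental values.

```python
import numpy as np

# Minimum-jerk position and speed along the movement direction: the standard
# 10-15-6 polynomial in normalized time yields a bell-shaped speed profile.
# Amplitude and duration are illustrative assumptions.
def minimum_jerk(t, T=0.45, amplitude=0.10):
    """Return (position, speed) at time t for a movement of duration T."""
    s = np.clip(t / T, 0.0, 1.0)                        # normalized time
    pos = amplitude * (10 * s**3 - 15 * s**4 + 6 * s**5)
    vel = amplitude / T * (30 * s**2 - 60 * s**3 + 30 * s**4)
    return pos, vel

for ti in np.linspace(0.0, 0.45, 5):
    print(minimum_jerk(ti))
```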
Simulation Summary • In summary, the theory predicted that: • In a deterministic field: • the optimal trajectory is a slightly curved hand path that appears to overcompensate for the forces. • In a stochastic field: • the optimum trajectory loses its overcompensation tendencies, • peak velocities become larger, and • the timing of the peak shifts earlier, allowing the hand to approach the target more slowly.
Experiment #1, Groups 1 and 2 • Subjects overcompensated at the start and undercompensated at the target (S-shaped paths). • This behavior was maintained for the duration of the 3-day experiment. • By the end of training on day 1, success rates had reached the levels observed in the null condition and increased further with more days of training. • Greater overcompensation was associated with better performance.
Experiment #1, Group 3 • Theory had predicted that overcompensation should disappear. • There was one additional data point for the last block on the last day of training. • Another prediction of the theory was that hand trajectories should show larger errors in the direction of the force perturbations.
Experiment #1, Group 3 • As overcompensation declined, performance improved. • The stochastic and deterministic groups showed different speed profiles. • Theory had predicted that velocity peaks should rise as the field acquires stochastic properties. • By day 3, the group displayed a hand speed profile that was skewed and had a higher peak value.
Experiment #1, Group 4 • The same tendencies were observed when subjects were trained in a stochastic CCW field: • overcompensation disappeared, • the measure A1 - A2 became more negative, • performance rates improved, and • peak speeds increased.
Experiment #1, Group 4 • People responded by eliminating their overcompensation and producing an increased peak speed that skewed the profile and slowed the hand as it approached the target.
Validity of Experiment #1? • The reduced overcompensation in the high-variance field was consistent with the optimal policy, but there is another possible reason: if the field is more variable, people may simply learn it less well. • Although this would not explain the increased speed in the higher-variance group, it could account for the reduced overcompensation. • In Experiment 2, the task was to reach to a target at 18 cm. • The field was zero-mean with zero, small, or large variance, so any differences in control policies should be attributable to the variance of the field and not to a bias in learning of the mean.
Experiment #2 • Hand paths were straight in the null condition. • With increased variance, there was a tendency to curve to the right of the null-condition path. • With greater unpredictability of the environment, subjects gradually increased peak speeds and skewed the speed profile to reduce speed near the target.
Motivation of Experiment 3 • The theory explained that increased uncertainty makes one cautious as the movement approaches the target. • If there is a target in the middle of a movement, then increased uncertainty should make the movement through that target change as uncertainty increases. • Zero-mean field with zero or non-zero variance. (schematic: start, 9 mm via-point target, 9 mm final target; via-point timing 400 ± 50 ms, total movement time 1.0 s)
Experiment #3 • The optimal trajectory in a low-uncertainty field should be a straight line with a bell-shaped velocity profile. • As field uncertainty increased, the movement should slow down as it approaches the via-point, effectively producing a segmented movement.
Discussion Key Points • If the purpose of a movement is to acquire a rewarding state at a minimum cost, the dynamics of the task should be taken into account. • According to OFC theory, when the environment changes, the learner performs two computations simultaneously: • (1) it finds a more accurate model of how motor commands produce changes in sensory states, and • (2) it uses that model to find a better movement plan that reoptimizes performance.
Discussion Key Points • Previous studies show that with adaptation, hand paths curved out beyond the baseline trajectory, suggesting an overcompensation in the forces that subjects produce. • Experiments that measured hand forces in channel trials found that the maximum force was, at most, 80% of the field. • OFC explained that both the curved hand path and the undercompensation of the peak force are signatures of minimizing the total motor costs of the reach.
Discussion Key Points • When dynamics are stochastic: if we do not know the amount of coffee in a cup (or how hot it may be), we lift it and drink from it differently than if we are certain of its contents. • If the force field is stochastic, the theory predicted that overcompensation should disappear and peak speeds should increase, as observed.
Discussion Key Points • With zero mean and non-zero variance: the theory predicted that as field variance increases, the trajectory shifts from a bell-shaped speed profile to one with a larger peak earlier in the movement, allowing more time to control the limb near the target (as observed). • Via-point: as field variance increases, the theory predicted that one should slow down as the hand approaches the via-point. That is, movements should exhibit a single peak in their speed profiles when field variance was zero, but multiple peaks as the field variance increased (as observed).
Discussion Key Points • The idea that movement planning should depend on the dynamics of the task seems intuitive. • For example, Fitts (1954) observed that changing the weight of a pen affected how people planned their reaching movements: to maintain accuracy, people moved the heavier pen more slowly than the lighter pen. • Uncertainty about the magnitude of a velocity-dependent field is equivalent to velocity-dependent noise. • With such noise, the controller minimizes speed at the task-relevant locations: at the via-point and at the end point. • With training, people learn a forward model of the task and then use that model to form a better movement plan.
Limitations • There are significant limitations in applying the theory to biological movements. • The theory faces significant hurdles when there are multiple feedback loops in the biological motor control system, namely the state-dependent response of muscles, spinal reflex pathways, and the long-loop pathways. • In a more realistic setting, it is unclear what is meant by a motor command and a motor cost. • The best claim is that the experimental results are difficult to explain with the notion of an invariant desired trajectory, but are qualitatively in agreement with the theory.