Introductory Control Theory CS 659 Kris Hauser
Control Theory • The use of feedback to regulate a signal • [Block diagram: a desired signal xd enters the Controller, which sends control input u to the Plant x' = f(x,u), producing signal x; the error e = x - xd is fed back (by convention, xd = 0)]
What might we be interested in? • Controls engineering • Produce a policy u(x,t), given a description of the plant, that achieves good performance • Verifying theoretical properties • Convergence, stability, optimality of a given policy u(x,t)
Agenda • PID control • LTI multivariate systems & LQR control • Nonlinear control & Lyapunov functions • Control is a huge topic, and we won't dive into much detail
Model-free vs. model-based • Two general philosophies: • Model-free: do not require a dynamics model to be provided • Model-based: do use a dynamics model during computation • Model-free methods: • Simpler • Tend to require much more manual tuning to perform well • Model-based methods: • Can achieve good performance (optimal w.r.t. some cost function) • Are more complicated to implement • Require reasonably good models (system-specific knowledge) • Calibration: build a model using measurements before behaving • Adaptive control: "learn" parameters of the model online from sensors
PID control • Proportional-Integral-Derivative controller • A workhorse of 1D control systems • Model-free
Proportional term • u(t) = -Kp x(t), with proportional gain Kp • Negative sign assumes control acts in the same direction as x • [Plot: x vs. t under P control]
Integral term • u(t) = -Kp x(t) - Ki I(t), with integral gain Ki • I(t) = ∫0t x(s) ds (accumulation of errors) • Residual steady-state errors driven asymptotically to 0 • [Plot: x vs. t converging to 0]
Instability • For a 2nd-order system (one with momentum), P control alone can oscillate and diverge • [Plot: x vs. t oscillating with growing amplitude]
Derivative term • u(t) = -Kp x(t) - Kd x'(t), with derivative gain Kd • [Plot: damped response of x]
Putting it all together • u(t) = -Kp x(t) - Ki I(t) - Kd x'(t) • I(t) = ∫0t x(s) ds
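A minimal discrete-time sketch of this control law in Python (the class name, gains, and timestep below are illustrative assumptions, not from the slides):

```python
class PID:
    """Discrete-time PID: u(t) = -Kp*x(t) - Ki*I(t) - Kd*x'(t)."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0    # I(t): accumulated error
        self.prev_x = None     # previous signal, for finite-difference x'

    def update(self, x):
        self.integral += x * self.dt
        dx = 0.0 if self.prev_x is None else (x - self.prev_x) / self.dt
        self.prev_x = x
        return -self.kp * x - self.ki * self.integral - self.kd * dx

# example usage; gains and timestep are illustrative and need manual tuning
controller = PID(kp=4.0, ki=0.5, kd=2.0, dt=0.01)
u = controller.update(x=1.0)
```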
Example: Damped Harmonic Oscillator • Second-order time-invariant linear system with a PID controller • x''(t) = A x(t) + B x'(t) + C + D u(x,x',t) • For what initial conditions and gains is this stable and convergent?
Stability and Convergence • System is stable if errors stay bounded • System is convergent if errors -> 0
Example: Damped Harmonic Oscillator • x'' = A x + B x' + C + D u(x,x') • PID controller u = -Kp x - Kd x' - Ki I • x'' = (A - DKp) x + (B - DKd) x' + C - DKi I
Homogeneous solution • Unstable if A - DKp > 0 • Natural frequency ω0 = sqrt(DKp - A) • Damping ratio ζ = (DKd - B)/(2ω0) • If ζ > 1, overdamped • If ζ < 1, underdamped (oscillates)
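A quick numeric check of these formulas (the plant coefficients and gains are made-up values for illustration):

```python
import numpy as np

A, B, C, D = -1.0, -0.5, 0.0, 1.0    # x'' = A x + B x' + C + D u (assumed values)
Kp, Kd = 4.0, 2.0                    # hand-picked gains

assert D * Kp > A, "oscillator analysis needs DKp - A > 0"
w0 = np.sqrt(D * Kp - A)             # natural frequency
zeta = (D * Kd - B) / (2 * w0)       # damping ratio
print(f"w0 = {w0:.3f}, zeta = {zeta:.3f} ->",
      "overdamped" if zeta > 1 else "underdamped (oscillates)")
```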
Example: Trajectory following • Say a trajectory xdes(t) has been designed • E.g., a rocket's ascent, a steering path for a car, a plane's landing • Apply PID control • u(t) = Kp (xdes(t) - x(t)) - Ki I(t) + Kd (x'des(t) - x'(t)) • I(t) = ∫0t (x(s) - xdes(s)) ds • The designer of xdes needs to be knowledgeable about the controller's behavior! • [Plot: x(t) tracking xdes(t)]
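A hedged simulation sketch of trajectory following on a double-integrator plant x'' = u; the sine reference and gains are hypothetical choices, not from the slides:

```python
import numpy as np

Kp, Ki, Kd, dt = 10.0, 0.5, 2.0, 0.01
xdes  = lambda t: np.sin(t)          # hypothetical reference trajectory
dxdes = lambda t: np.cos(t)          # its derivative

x, dx, I, t = 0.0, 0.0, 0.0, 0.0
for _ in range(1000):
    I += (x - xdes(t)) * dt          # accumulate error (convention e = x - xd)
    u = Kp * (xdes(t) - x) - Ki * I + Kd * (dxdes(t) - dx)
    dx += u * dt                     # Euler step of the plant x'' = u
    x  += dx * dt
    t  += dt
print(f"tracking error after {t:.1f} s: {x - xdes(t):+.4f}")
```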
Controller Tuning Workflow • Hypothesize a control policy • Analysis: • Assume a model • Assume disturbances to be handled • Test performance either through mathematical analysis, or through simulation • Go back and redesign control policy • Mathematical techniques give you more insight to improve redesign, but require more work
Multivariate Systems • x' = f(x,u) • x ∈ X ⊆ Rn • u ∈ U ⊆ Rm • Because m ≤ n, and variables are coupled, this is not as easy as setting n PID controllers
Linear Time-Invariant Systems • Linear: x’ = f(x,u,t) = A(t)x + B(t)u • LTI: x’ = f(x,u) = Ax + Bu • Nonlinear systems can sometimes be approximated by linearization
Convergence of LTI systems • x' = A x + B u • Let u = -K x • Then x' = (A-BK) x • The eigenvalues λi of (A-BK) determine convergence • Each λi may be complex • For convergence, each λi must have negative real part (Re λi < 0)
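A small sketch of this eigenvalue test (the 2-state system and gain below are assumptions, not from the slides):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [2.0, 0.0]])       # unstable open loop: eigenvalues +/- sqrt(2)
B = np.array([[0.0],
              [1.0]])
K = np.array([[6.0, 4.0]])       # a hand-picked stabilizing gain

eigs = np.linalg.eigvals(A - B @ K)
print("closed-loop eigenvalues:", eigs)          # both at -2 here
print("convergent:", all(e.real < 0 for e in eigs))
```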
Linear Quadratic Regulator • x' = Ax + Bu • Objective: minimize the quadratic cost ∫0∞ (xTQ x + uTR u) dt over an infinite horizon • xTQ x is the error term; uTR u penalizes control effort
Closed form LQR solution • Closed-form solution u = -K x, with K = R-1BTP • Where P is a symmetric matrix that solves the Riccati equation • ATP + PA - PBR-1BTP + Q = 0 • Derivation: calculus of variations • Packages available for finding solutions
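One such package is SciPy, whose continuous-time algebraic Riccati solver gives P directly; the double-integrator system and unit weights below are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])            # double integrator (assumed system)
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)                         # error term weight
R = np.eye(1)                         # effort penalization weight

P = solve_continuous_are(A, B, Q, R)  # A^T P + P A - P B R^-1 B^T P + Q = 0
K = np.linalg.solve(R, B.T @ P)       # K = R^-1 B^T P
print("LQR gain K =", K)
```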
Nonlinear Control • General case: x’ = f(x,u) • Two questions: • Analysis: How to prove convergence and stability for a given u(x)? • Synthesis: How to find u(t) to optimize some cost function?
Toy Nonlinear Systems Cart-pole Acrobot Mountain car
Proving convergence & stability with Lyapunov functions • Let u = u(x) • Then x' = f(x,u) = g(x) • Conjecture a Lyapunov function V(x): • V(x) = 0 at the origin x = 0 • V(x) > 0 for all x in a neighborhood of the origin
Proving stability with Lyapunov functions • Idea: prove that d/dt V(x) ≤ 0 under the dynamics x' = g(x) around the origin • [Plots: V(x), g(x), and d/dt V(x) vs. t]
Proving convergence with Lyapunov functions • Idea: prove that d/dt V(x) < 0 under the dynamics x' = g(x) around the origin • [Plots: V(x), g(x), and d/dt V(x) vs. t]
Proving convergence with Lyapunov functions • d/dt V(x) = dV/dx(x) dx/dt = ∇V(x)T g(x) < 0
How does one construct a suitable Lyapunov function? • Typically some form of energy (e.g., KE + PE) • Some art involved
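For instance, for a damped pendulum (a sketch with illustrative parameters, unit mass and length), total energy works as a Lyapunov function:

```python
import numpy as np

# Damped pendulum: theta'' = -g*sin(theta) - c*theta'
# V = KE + PE = 0.5*theta'^2 + g*(1 - cos(theta)); along trajectories
# dV/dt = -c*theta'^2 <= 0, so V should decay toward 0 at the origin.
g, c, dt = 9.8, 0.5, 1e-3
theta, dtheta = 2.0, 0.0
V = lambda th, dth: 0.5 * dth**2 + g * (1 - np.cos(th))

for k in range(20001):
    if k % 5000 == 0:
        print(f"t = {k*dt:4.1f} s   V = {V(theta, dtheta):.4f}")
    ddtheta = -g * np.sin(theta) - c * dtheta   # dynamics g(x)
    dtheta += ddtheta * dt                      # semi-implicit Euler step
    theta  += dtheta * dt
```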
Direct policy synthesis: Optimal control • Input: cost function J(x), estimated dynamics f(x,u), finite state/control spaces X, U • Two basic classes: • Trajectory optimization: Hypothesize control sequence u(t), simulate to get x(t), perform optimization to improve u(t), repeat. • Output: optimal trajectory u(t) (in practice, only a locally optimal solution is found) • Dynamic programming: Discretize state and control spaces, form a discrete search problem, and solve it. • Output: Optimal policy u(x) across all of X
Discrete Search example • Split X, U into cells x1,…,xn, u1,…,um • Build the transition function xj = xi + f(xi,uk) dt for all i, k • State machine with cost dt·J(xi) for staying in state i • Find u(xi) that minimizes the sum of total costs • Value iteration: repeated dynamic programming over V(xi) = sum of total future costs • [Figure: value function for a 1-joint acrobot]
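A small value-iteration sketch on a discretized 1D single integrator (the grid sizes, dynamics f(x,u) = u, and cost J(x) = x² are illustrative assumptions):

```python
import numpy as np

N, M, dt = 51, 3, 0.1
xs = np.linspace(-1.0, 1.0, N)     # state cells x1..xn
us = np.linspace(-1.0, 1.0, M)     # control cells u1..um
J = xs ** 2                        # running cost J(xi)

def nearest(x):                    # snap a successor state to the grid
    return int(np.argmin(np.abs(xs - x)))

# transitions xj = xi + f(xi, uk) * dt, with f(x,u) = u
succ = [[nearest(xs[i] + us[k] * dt) for k in range(M)] for i in range(N)]

V = np.zeros(N)                    # value function estimate
for _ in range(200):               # repeated dynamic programming sweeps
    V = np.array([min(dt * J[i] + V[succ[i][k]] for k in range(M))
                  for i in range(N)])
policy = [us[int(np.argmin([dt * J[i] + V[succ[i][k]] for k in range(M)]))]
          for i in range(N)]
print("V at x=0:", V[nearest(0.0)], "  policy at x=1:", policy[-1])
```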
Receding Horizon Control (aka model predictive control) • Repeatedly optimize over a short horizon of h steps, execute the first control, then re-plan from the new state • [Diagram: overlapping horizons h, …, 1]
Controller Hooks in RobotSim • Given a loaded WorldModel • sim = Simulator(world) • c = sim.getController(0) • By default, a trajectory queue, PID controller • c.setMilestone(qdes) – moves smoothly to qdes • c.addMilestone(q1), c.addMilestone(q2), … – appends a list of milestones and smoothly interpolates between them • Can override behavior to get a manual control loop (see the sketch below). At every time step, do: • Read q, dq with c.getSensedConfig(), c.getSensedVelocity() • For torque commands: • Compute u(q,dq,t) • Send torque command via c.setTorque(u) • OR for PID commands: • Compute qdes(q,dq,t), dqdes(q,dq,t) • Send PID command via c.setPIDCommand(qdes,dqdes)
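A hedged sketch of that manual loop, using only the calls named on this slide; the import path, world file, getTime()/simulate(dt) methods, and the pid_command function are assumptions that may differ from the course's RobotSim package:

```python
from klampt import WorldModel, Simulator   # assumed import path

world = WorldModel()
world.readFile("my_world.xml")             # hypothetical world file
sim = Simulator(world)
c = sim.getController(0)
dt = 0.01                                  # control timestep (assumption)

def pid_command(q, dq, t):
    """Hypothetical user policy: hold the current configuration."""
    return q, [0.0] * len(dq)

while sim.getTime() < 10.0:                # getTime() assumed available
    q  = c.getSensedConfig()               # read q
    dq = c.getSensedVelocity()             # read dq
    qdes, dqdes = pid_command(q, dq, sim.getTime())
    c.setPIDCommand(qdes, dqdes)           # or compute u and c.setTorque(u)
    sim.simulate(dt)                       # advance the physics (assumed)
```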
Next class • Motion planning • Principles Ch 2, 5.1, 6.1