Linear Quadratic Regulator in 3 Ways
Slides exploring the Linear Quadratic Regulator (LQR) through the Indirect Shooting Method (Pontryagin's Principle), the Optimization Approach (Quadratic Programming), and the Recursive Solution (Riccati Equation). Inspired by Zachary Manchester's Spring 2024-25 course "Optimal Control and Reinforcement Learning".
Introduction
Problem Formulation
Consider a discrete-time linear system:

$$x_{k+1} = A x_k + B u_k, \qquad x_k \in \mathbb{R}^n, \quad u_k \in \mathbb{R}^m$$

Quadratic cost function:

$$J = \frac{1}{2} x_N^T Q_N x_N + \sum_{k=1}^{N-1} \left( \frac{1}{2} x_k^T Q x_k + \frac{1}{2} u_k^T R u_k \right)$$
Assumptions:
- $Q \succeq 0$, $Q_N \succeq 0$, and $R \succ 0$
- $(A, B)$ is controllable and $(A, Q^{1/2})$ is observable
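The code sketches in the rest of these notes assume a concrete running example. Here is a hypothetical discrete-time double integrator (the matrices, dimensions, and variable names are illustrative choices, not from the lecture):

```python
import numpy as np

# Hypothetical running example: a double integrator discretized with
# timestep h, so the state is [position, velocity].
h = 0.1
A = np.array([[1.0, h],
              [0.0, 1.0]])
B = np.array([[0.5 * h**2],
              [h]])
Q = np.eye(2)          # stage state cost
R = 0.1 * np.eye(1)    # stage control cost
QN = np.eye(2)         # terminal state cost
x1 = np.array([1.0, 0.0])
N = 50                 # horizon length
```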
Indirect Shooting: PMP Perspective
Problem Formulation and Optimality Conditions
Consider the deterministic discrete-time optimal control problem:

$$\min_{x_{1:N},\, u_{1:N-1}} \; \ell_N(x_N) + \sum_{k=1}^{N-1} \ell(x_k, u_k) \quad \text{subject to} \quad x_{k+1} = f(x_k, u_k)$$
The first-order necessary conditions for optimality can be derived using:
- The Lagrangian framework (special case of KKT conditions)
- Pontryagin's Minimum Principle (PMP)
Lagrangian Formulation
Form the Lagrangian:

$$L = \ell_N(x_N) + \sum_{k=1}^{N-1} \left[ \ell(x_k, u_k) + \lambda_{k+1}^T \big( f(x_k, u_k) - x_{k+1} \big) \right]$$

Define the Hamiltonian:

$$H(x, u, \lambda) = \ell(x, u) + \lambda^T f(x, u)$$

Rewrite the Lagrangian using the Hamiltonian:

$$L = H(x_1, u_1, \lambda_2) + \sum_{k=2}^{N-1} \left[ H(x_k, u_k, \lambda_{k+1}) - \lambda_k^T x_k \right] + \ell_N(x_N) - \lambda_N^T x_N$$
Optimality Conditions
Take derivatives with respect to $\lambda$ and $x$:

$$\frac{\partial L}{\partial \lambda_{k+1}} = f(x_k, u_k) - x_{k+1} = 0$$

$$\frac{\partial L}{\partial x_k} = \frac{\partial H}{\partial x_k} - \lambda_k^T = \frac{\partial \ell}{\partial x_k} + \lambda_{k+1}^T \frac{\partial f}{\partial x_k} - \lambda_k^T = 0$$

$$\frac{\partial L}{\partial x_N} = \frac{\partial \ell_N}{\partial x_N} - \lambda_N^T = 0$$

For $u_k$, the control may be constrained (e.g., torque limits), so we state the condition as a minimization:

$$u_k = \arg\min_{u \in \mathcal{U}} H(x_k, u, \lambda_{k+1})$$
Summary of Necessary Conditions
The first-order necessary conditions can be summarized as:

$$x_{k+1} = \nabla_\lambda H(x_k, u_k, \lambda_{k+1}) = f(x_k, u_k)$$

$$\lambda_k = \nabla_x H(x_k, u_k, \lambda_{k+1}), \qquad \lambda_N = \nabla \ell_N(x_N)$$

$$u_k = \arg\min_{u \in \mathcal{U}} H(x_k, u, \lambda_{k+1})$$
In continuous time, these become:

$$\dot{x} = \nabla_\lambda H(x, u, \lambda), \qquad \dot{\lambda} = -\nabla_x H(x, u, \lambda)$$

$$u = \arg\min_{\tilde{u} \in \mathcal{U}} H(x, \tilde{u}, \lambda), \qquad \lambda(T) = \nabla \ell_T(x(T))$$
Application to LQR Problems
For LQR problems with quadratic cost and linear dynamics, the Hamiltonian is:

$$H(x, u, \lambda) = \frac{1}{2} x^T Q x + \frac{1}{2} u^T R u + \lambda^T (A x + B u)$$

The necessary conditions simplify to:

$$x_{k+1} = A x_k + B u_k$$

$$\lambda_k = Q x_k + A^T \lambda_{k+1}, \qquad \lambda_N = Q_N x_N$$

$$u_k = -R^{-1} B^T \lambda_{k+1}$$
This forms a linear two-point boundary value problem.
Indirect Shooting Algorithm for LQR
Procedure (see the sketch after this list):
- Make an initial guess for the control sequence $u_{1:N-1}$
- Forward pass: simulate the dynamics $x_{k+1} = A x_k + B u_k$ to get the state trajectory $x_{1:N}$
- Backward pass:
  - Set the terminal costate: $\lambda_N = Q_N x_N$
  - Compute the costate trajectory: $\lambda_k = Q x_k + A^T \lambda_{k+1}$
  - Compute the control adjustment: $\Delta u_k = -R^{-1} B^T \lambda_{k+1} - u_k$
- Line search: update the controls $u_k \leftarrow u_k + \alpha \Delta u_k$
- Iterate until convergence
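A minimal NumPy sketch of this loop under the running example above (the function name, the fixed step size `alpha` standing in for a proper line search, and the convergence test are all illustrative assumptions):

```python
import numpy as np

def lqr_indirect_shooting(A, B, Q, R, QN, x1, N, alpha=0.5, tol=1e-8, max_iters=500):
    """Indirect shooting for discrete-time LQR (illustrative sketch)."""
    n, m = B.shape
    u = np.zeros((N - 1, m))                  # initial control guess
    x = np.zeros((N, n))
    for _ in range(max_iters):
        # Forward pass: roll the dynamics out from x1
        x[0] = x1
        for k in range(N - 1):
            x[k + 1] = A @ x[k] + B @ u[k]
        # Backward pass: costates and control adjustment
        lam = QN @ x[-1]                      # terminal costate: lambda_N = Q_N x_N
        du = np.zeros_like(u)
        for k in range(N - 2, -1, -1):
            du[k] = -np.linalg.solve(R, B.T @ lam) - u[k]   # -R^{-1} B^T lambda_{k+1} - u_k
            lam = Q @ x[k] + A.T @ lam        # costate recursion: lambda_k
        if np.max(np.abs(du)) < tol:          # converged when the adjustment vanishes
            break
        u += alpha * du                       # fixed step in place of a full line search
    return x, u
```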
Direct Approach: QP Perspective
LQR as Quadratic Programming Problem
Assume the initial state $x_1$ is given, and stack all remaining states and controls into one decision variable:

$$z = \begin{bmatrix} u_1 \\ x_2 \\ u_2 \\ x_3 \\ \vdots \\ u_{N-1} \\ x_N \end{bmatrix}, \qquad H = \begin{bmatrix} R & & & & \\ & Q & & & \\ & & \ddots & & \\ & & & R & \\ & & & & Q_N \end{bmatrix}$$

The dynamics constraints can be expressed as $C z = d$, where:

$$C = \begin{bmatrix} B & -I & & & & \\ & A & B & -I & & \\ & & & \ddots & \ddots & \\ & & & A & B & -I \end{bmatrix}, \qquad d = \begin{bmatrix} -A x_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
QP Formulation and KKT Conditions
The LQR problem becomes the QP:

$$\min_z \; \frac{1}{2} z^T H z \quad \text{subject to} \quad C z = d$$

The Lagrangian of this QP is:

$$L(z, \lambda) = \frac{1}{2} z^T H z + \lambda^T (C z - d)$$

The KKT conditions are:

$$\nabla_z L = H z + C^T \lambda = 0, \qquad \nabla_\lambda L = C z - d = 0$$

This leads to the linear system:

$$\begin{bmatrix} H & C^T \\ C & 0 \end{bmatrix} \begin{bmatrix} z \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ d \end{bmatrix}$$
We get the exact solution by solving one linear system!
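A dense NumPy sketch that assembles and solves this KKT system, assuming the variable stacking $z = (u_1, x_2, \ldots, u_{N-1}, x_N)$ from above (function name and layout choices are my own):

```python
import numpy as np

def lqr_qp_solve(A, B, Q, R, QN, x1, N):
    """Solve LQR as one equality-constrained QP via its KKT system (sketch)."""
    n, m = B.shape
    nz = (N - 1) * (m + n)                    # z = [u1, x2, u2, x3, ..., u_{N-1}, x_N]
    # Cost Hessian: blkdiag(R, Q, R, Q, ..., R, QN)
    H = np.zeros((nz, nz))
    for k in range(N - 1):
        i = k * (m + n)
        H[i:i+m, i:i+m] = R
        H[i+m:i+m+n, i+m:i+m+n] = QN if k == N - 2 else Q
    # Dynamics constraints C z = d
    C = np.zeros(((N - 1) * n, nz))
    d = np.zeros((N - 1) * n)
    for k in range(N - 1):
        r = k * n
        i = k * (m + n)
        if k == 0:
            d[:n] = -A @ x1                   # known x1 moves to the right-hand side
        else:
            C[r:r+n, i-n:i] = A               # A x_k (previous block's state)
        C[r:r+n, i:i+m] = B                   # B u_k
        C[r:r+n, i+m:i+m+n] = -np.eye(n)      # -x_{k+1}
    # Assemble and solve the full KKT system in one shot
    KKT = np.block([[H, C.T], [C, np.zeros(((N-1)*n, (N-1)*n))]])
    rhs = np.concatenate([np.zeros(nz), d])
    z = np.linalg.solve(KKT, rhs)[:nz]
    blocks = z.reshape(N - 1, m + n)
    us = blocks[:, :m]
    xs = np.vstack([x1, blocks[:, m:]])
    return xs, us
```

On the double-integrator example this should match the shooting sketch's trajectory up to numerical tolerance, which is a useful consistency check.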
Riccati Equation Solution
KKT System Structure for LQR
The KKT system for LQR has a highly structured, sparse form. Consider an $N = 4$ example: the stationarity block $H z + C^T \lambda = 0$ unpacks, row by row, into

$$\begin{aligned}
R u_1 + B^T \lambda_2 &= 0 \\
Q x_2 + A^T \lambda_3 - \lambda_2 &= 0 \\
R u_2 + B^T \lambda_3 &= 0 \\
Q x_3 + A^T \lambda_4 - \lambda_3 &= 0 \\
R u_3 + B^T \lambda_4 &= 0 \\
Q_N x_4 - \lambda_4 &= 0
\end{aligned}$$

together with the dynamics constraints $C z = d$.
Deriving the Riccati Recursion
Start from the terminal condition (the last stationarity row):

$$Q_N x_4 - \lambda_4 = 0 \quad \Longrightarrow \quad \lambda_4 = Q_N x_4 =: P_4 x_4$$

Move to the previous equation (the $u_3$ row):

$$R u_3 + B^T \lambda_4 = 0$$

Substitute $\lambda_4 = P_4 x_4 = P_4 (A x_3 + B u_3)$:

$$R u_3 + B^T P_4 (A x_3 + B u_3) = 0$$

Solve for $u_3$:

$$u_3 = -(R + B^T P_4 B)^{-1} B^T P_4 A \, x_3 =: -K_3 x_3$$
Deriving the Riccati Recursion (Cont'd)
Now consider the stationarity row for $x_3$:

$$Q x_3 + A^T \lambda_4 - \lambda_3 = 0$$

Substitute $\lambda_4 = P_4 (A x_3 + B u_3)$:

$$\lambda_3 = Q x_3 + A^T P_4 (A x_3 + B u_3)$$

Substitute $u_3 = -K_3 x_3$:

$$\lambda_3 = \left[ Q + A^T P_4 (A - B K_3) \right] x_3$$

So $\lambda_3 = P_3 x_3$ with:

$$P_3 = Q + A^T P_4 (A - B K_3)$$
Riccati Recursion Formula
We now have a recursive relationship. Generalizing:

$$P_N = Q_N$$

$$K_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A$$

$$P_k = Q + A^T P_{k+1} (A - B K_k)$$
This is the celebrated Riccati equation.
The solution process (sketched below) involves:
- A backward Riccati pass to compute $P_k$ and $K_k$ for $k = N, \ldots, 1$
- A forward rollout to compute $x_k$ and $u_k$ using $u_k = -K_k x_k$
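A NumPy sketch of both passes, again under the running example (illustrative names; `K[k]` is 0-indexed, so it stores the gain applied at stage `k` of the rollout):

```python
import numpy as np

def lqr_riccati(A, B, Q, R, QN, x1, N):
    """Backward Riccati pass + forward rollout for finite-horizon LQR (sketch)."""
    n, m = B.shape
    P = QN                                   # terminal condition: P_N = Q_N
    K = np.zeros((N - 1, m, n))
    for k in range(N - 2, -1, -1):           # backward Riccati pass
        K[k] = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K[k])
    x = np.zeros((N, n))
    u = np.zeros((N - 1, m))
    x[0] = x1
    for k in range(N - 1):                   # forward rollout with u_k = -K_k x_k
        u[k] = -K[k] @ x[k]
        x[k + 1] = A @ x[k] + B @ u[k]
    return x, u, K
```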
Computational Complexity
Naive QP Solution: treats the problem as one monolithic linear system.
- Computational cost: $O\big(N^3 (n + m)^3\big)$ for a dense KKT solve
- Must be re-solved from scratch for any change in the initial state

Riccati Recursion: exploits the temporal structure.
- Computational cost: $O\big(N (n + m)^3\big)$
- A factor of $N^2$ faster for long horizons (large $N$)
The Riccati Solution is More Than Just Fast:
- It provides a ready-to-use feedback policy: $u_k = -K_k x_k$.
- This policy is adaptive: it is optimal for any initial state $x_1$, not just a single one.
- It enables real-time control by naturally rejecting disturbances.
- And it delivers the exact same optimal solution as the QP.
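A small illustrative rollout, assuming the `lqr_riccati` sketch and the double-integrator matrices above, showing the precomputed gains being reused under an unmodeled disturbance:

```python
import numpy as np

# Reuse the gains from the Riccati sketch; the feedback policy absorbs
# disturbances that an open-loop control tape from the QP could not.
_, _, K = lqr_riccati(A, B, Q, R, QN, x1, N)
rng = np.random.default_rng(0)
x = x1.copy()
for k in range(N - 1):
    u = -K[k] @ x                         # same precomputed gain, new state
    w = 0.01 * rng.standard_normal(2)     # unmodeled additive disturbance
    x = A @ x + B @ u + w
```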
Conclusion
Summary
Finite-Horizon Problems
- Use the Riccati recursion backward in time
- Store the gain matrices $K_1, \ldots, K_{N-1}$
- Apply time-varying feedback $u_k = -K_k x_k$
Infinite-Horizon Problems
- Solve the algebraic Riccati equation offline
- Use the constant gain matrix $K_\infty$
- Implement simple state feedback $u_k = -K_\infty x_k$
- Algebraic Riccati Equation (ARE):

$$P = Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A$$
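A sketch of one way to obtain the constant gain: iterate the finite-horizon Riccati recursion to a fixed point (illustrative; in practice a dedicated solver such as `scipy.linalg.solve_discrete_are` is the standard tool):

```python
import numpy as np

def lqr_infinite_horizon(A, B, Q, R, tol=1e-10, max_iters=100_000):
    """Constant gain via fixed-point iteration of the Riccati recursion (sketch)."""
    P = Q                                     # any PSD initialization works here
    for _ in range(max_iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:  # converged to the ARE solution
            break
        P = P_next
    return K, P
```

Under the controllability and observability assumptions from the problem formulation, this iteration converges to the unique stabilizing solution of the ARE.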