Linear Quadratic Regulator in 3 Ways

Slide: Explore the Linear Quadratic Regulator (LQR) through the Indirect Shooting Method (Pontryagin's Principle), the Optimization Approach (Quadratic Programming), and the Recursive Solution (Riccati Equation). Inspired by Zachary Manchester's Spring 2024-25 course "Optimal Control and Reinforcement Learning".

Introduction

Problem Formulation

Consider a discrete-time linear system:

$$x_{n+1} = A_n x_n + B_n u_n$$

Quadratic cost function:

$$\min_{x_{1:N},\, u_{1:N-1}} J = \sum_{n=1}^{N-1} \underbrace{\left[ \frac{1}{2} x_n^\top Q_n x_n + \frac{1}{2} u_n^\top R_n u_n \right]}_{\text{running cost}} + \underbrace{\frac{1}{2} x_N^\top Q_N x_N}_{\text{terminal cost}}$$

Assumptions:

  • $(A_n, B_n)$ is controllable and $(A_n, C_n)$ is observable, where $C_n$ is any factor with $Q_n = C_n^\top C_n$ (see the check below)
  • $Q_n \succeq 0,\; R_n \succ 0,\; Q_N \succeq 0$
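
For time-invariant systems these assumptions are straightforward to verify numerically. A minimal numpy sketch (the helper names are my own, not from the lecture):

```python
import numpy as np

def is_controllable(A, B):
    """(A, B) is controllable iff [B, AB, ..., A^{n-1}B] has full row rank n."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks)) == n

def is_observable(A, C):
    """By duality, (A, C) is observable iff (A^T, C^T) is controllable."""
    return is_controllable(A.T, C.T)
```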

Indirect Shooting: PMP Perspective

Problem Formulation and Optimality Conditions

Consider the deterministic discrete-time optimal control problem:

$$\min_{x_{1:N},\, u_{1:N-1}} \; \sum_{n=1}^{N-1} \ell(x_n, u_n) + \ell_F(x_N) \qquad \text{s.t.} \quad x_{n+1} = f(x_n, u_n), \quad u_n \in \mathcal{U}$$

The first-order necessary conditions for optimality can be derived using:

  • The Lagrangian framework (special case of KKT conditions)
  • Pontryagin's Minimum Principle (PMP)

Lagrangian Formulation

Form the Lagrangian:

$$L = \sum_{n=1}^{N-1} \Big[ \ell(x_n, u_n) + \lambda_{n+1}^\top \big( f(x_n, u_n) - x_{n+1} \big) \Big] + \ell_F(x_N)$$

Define the Hamiltonian:

$$H(x_n, u_n, \lambda_{n+1}) = \ell(x_n, u_n) + \lambda_{n+1}^\top f(x_n, u_n)$$

Rewrite the Lagrangian using the Hamiltonian:

$$L = H(x_1, u_1, \lambda_2) + \left[ \sum_{n=2}^{N-1} H(x_n, u_n, \lambda_{n+1}) - \lambda_n^\top x_n \right] + \ell_F(x_N) - \lambda_N^\top x_N$$

Optimality Conditions

Take derivatives with respect to $x$ and $\lambda$:

$$\frac{\partial L}{\partial \lambda_{n+1}} = \frac{\partial H}{\partial \lambda_{n+1}} - x_{n+1} = f(x_n, u_n) - x_{n+1} = 0$$

$$\frac{\partial L}{\partial x_n} = \frac{\partial H}{\partial x_n} - \lambda_n^\top = \frac{\partial \ell}{\partial x_n} + \lambda_{n+1}^\top \frac{\partial f}{\partial x_n} - \lambda_n^\top = 0$$

$$\frac{\partial L}{\partial x_N} = \frac{\partial \ell_F}{\partial x_N} - \lambda_N^\top = 0$$

For $u$, we write the minimization explicitly to handle constraints:

$$u_n = \arg\min_{u} H(x_n, u, \lambda_{n+1}) \quad \text{s.t. } u \in \mathcal{U}$$

Summary of Necessary Conditions

The first-order necessary conditions can be summarized as:

$$\begin{aligned}
x_{n+1} &= \nabla_\lambda H(x_n, u_n, \lambda_{n+1}) \\
\lambda_n &= \nabla_x H(x_n, u_n, \lambda_{n+1}) \\
u_n &= \arg\min_{u} H(x_n, u, \lambda_{n+1}), \quad \text{s.t. } u \in \mathcal{U} \\
\lambda_N &= \left( \frac{\partial \ell_F}{\partial x_N} \right)^\top
\end{aligned}$$

In continuous time, these become:

$$\begin{aligned}
\dot{x} &= \nabla_\lambda H(x, u, \lambda) \\
\dot{\lambda} &= -\nabla_x H(x, u, \lambda) \\
u &= \arg\min_{\tilde{u}} H(x, \tilde{u}, \lambda), \quad \text{s.t. } \tilde{u} \in \mathcal{U} \\
\lambda(t_F) &= \left( \frac{\partial \ell_F}{\partial x} \right)^\top
\end{aligned}$$

Application to LQR Problems

For LQR problems with quadratic cost and linear dynamics:

$$\ell(x_n, u_n) = \frac{1}{2} \left( x_n^\top Q_n x_n + u_n^\top R_n u_n \right), \qquad \ell_F(x_N) = \frac{1}{2} x_N^\top Q_N x_N, \qquad f(x_n, u_n) = A_n x_n + B_n u_n$$

The necessary conditions simplify to:

$$\begin{aligned}
x_{n+1} &= A_n x_n + B_n u_n \\
\lambda_n &= Q_n x_n + A_n^\top \lambda_{n+1} \\
\lambda_N &= Q_N x_N \\
u_n &= -R_n^{-1} B_n^\top \lambda_{n+1}
\end{aligned}$$

(For unconstrained $u$, the last condition follows from $\nabla_u H = R_n u_n + B_n^\top \lambda_{n+1} = 0$.)

This forms a linear two-point boundary value problem.

Indirect Shooting Algorithm for LQR

Procedure:

  1. Make an initial guess for the control sequence $u_{1:N-1}$
  2. Forward pass: simulate the dynamics to get the state trajectory $x_{1:N}$
  3. Backward pass:
    • Set the terminal costate: $\lambda_N = Q_N x_N$
    • Compute the costate trajectory: $\lambda_n = Q_n x_n + A_n^\top \lambda_{n+1}$
    • Compute the control correction: $\Delta u_n = -R_n^{-1} B_n^\top \lambda_{n+1} - u_n$
  4. Line search: update the controls $u_n \leftarrow u_n + \alpha \Delta u_n$
  5. Iterate until convergence (see the sketch below)
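
Below is a minimal numpy sketch of this procedure (my own illustration, not code from the lecture), assuming time-invariant $A, B, Q, R$ and using a small fixed step $\alpha$ in place of a proper line search:

```python
import numpy as np

def lqr_indirect_shooting(A, B, Q, R, QN, x1, N, alpha=0.2, tol=1e-8, max_iters=2000):
    """Indirect shooting for discrete-time LQR (time-invariant case for brevity)."""
    n, m = B.shape
    u = np.zeros((N - 1, m))                    # step 1: initial guess for u_{1:N-1}
    x = np.zeros((N, n))
    for _ in range(max_iters):
        # Step 2, forward pass: roll out the dynamics from x1.
        x[0] = x1
        for k in range(N - 1):
            x[k + 1] = A @ x[k] + B @ u[k]
        # Step 3, backward pass: costates and control corrections.
        lam = QN @ x[-1]                        # terminal costate lambda_N = Q_N x_N
        du = np.zeros_like(u)
        for k in reversed(range(N - 1)):
            du[k] = -np.linalg.solve(R, B.T @ lam) - u[k]
            lam = Q @ x[k] + A.T @ lam          # costate recursion
        if np.max(np.abs(du)) < tol:            # step 5: converged
            break
        u += alpha * du                         # step 4: fixed-step update
    return x, u
```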

Direct Approach: QP Perspective

LQR as Quadratic Programming Problem

Assuming $x_1$ is given, define the decision variable vector $z$ and the block-diagonal cost matrix $H$:

$$z = \begin{bmatrix} u_1 \\ x_2 \\ u_2 \\ \vdots \\ x_N \end{bmatrix}, \qquad H = \begin{bmatrix} R_1 & & & & \\ & Q_2 & & & \\ & & R_2 & & \\ & & & \ddots & \\ & & & & Q_N \end{bmatrix}$$

The dynamics constraints can be expressed as

$$\underbrace{\begin{bmatrix} B_1 & -I & & & & \\ & A_2 & B_2 & -I & & \\ & & & \ddots & & \\ & & & A_{N-1} & B_{N-1} & -I \end{bmatrix}}_{C} \begin{bmatrix} u_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix} = \underbrace{\begin{bmatrix} -A_1 x_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{d}$$

QP Formulation and KKT Conditions

The LQR problem becomes the QP:

$$\min_z J = \frac{1}{2} z^\top H z \quad \text{subject to} \quad Cz = d$$

The Lagrangian of this QP is:

$$L(z, \lambda) = \frac{1}{2} z^\top H z + \lambda^\top (Cz - d)$$

The KKT conditions are:

$$\nabla_z L = Hz + C^\top \lambda = 0, \qquad \nabla_\lambda L = Cz - d = 0$$

This leads to the linear system:

$$\begin{bmatrix} H & C^\top \\ C & 0 \end{bmatrix} \begin{bmatrix} z \\ \lambda \end{bmatrix} = \begin{bmatrix} 0 \\ d \end{bmatrix}$$

We get the exact solution by solving one linear system!
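
To make the structure concrete, here is a dense numpy sketch (my own illustration) that assembles $H$, $C$, and $d$ for a time-invariant system and solves the whole KKT system at once; a serious implementation would exploit the sparsity instead of forming dense matrices:

```python
import numpy as np

def lqr_qp_solve(A, B, Q, R, QN, x1, N):
    """Solve LQR as one equality-constrained QP via its KKT system."""
    n, m = B.shape
    nz = (N - 1) * (m + n)                     # z = [u_1, x_2, u_2, ..., x_N]
    H = np.zeros((nz, nz))
    C = np.zeros(((N - 1) * n, nz))
    d = np.zeros((N - 1) * n)
    for k in range(N - 1):
        iu = k * (m + n)                       # column offset of u_{k+1}
        ix = iu + m                            # column offset of x_{k+2}
        H[iu:iu + m, iu:iu + m] = R
        H[ix:ix + n, ix:ix + n] = QN if k == N - 2 else Q
        r = k * n                              # row offset of the k-th constraint
        C[r:r + n, iu:iu + m] = B              # B u_{k+1}
        C[r:r + n, ix:ix + n] = -np.eye(n)     # -x_{k+2}
        if k > 0:
            C[r:r + n, iu - n:iu] = A          # A x_{k+1} (x_1 is not a variable)
    d[:n] = -A @ x1                            # first row: B u_1 - x_2 = -A x_1
    KKT = np.block([[H, C.T],
                    [C, np.zeros((C.shape[0], C.shape[0]))]])
    rhs = np.concatenate([np.zeros(nz), d])
    sol = np.linalg.solve(KKT, rhs)            # one linear solve gives everything
    return sol[:nz]                            # optimal [u_1, x_2, ..., x_N]
```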

Riccati Equation Solution

KKT System Structure for LQR

The KKT system for LQR has a highly structured, sparse form. Consider the $N = 4$ case:

$$\begin{bmatrix}
R_1 & & & & & & B_1^\top & & \\
& Q_2 & & & & & -I & A_2^\top & \\
& & R_2 & & & & & B_2^\top & \\
& & & Q_3 & & & & -I & A_3^\top \\
& & & & R_3 & & & & B_3^\top \\
& & & & & Q_4 & & & -I \\
B_1 & -I & & & & & & & \\
& A_2 & B_2 & -I & & & & & \\
& & & A_3 & B_3 & -I & & &
\end{bmatrix}
\begin{bmatrix} u_1 \\ x_2 \\ u_2 \\ x_3 \\ u_3 \\ x_4 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ -A_1 x_1 \\ 0 \\ 0 \end{bmatrix}$$

Deriving the Riccati Recursion

Start from the terminal condition (the $x_4$ row of the KKT system):

$$Q_4 x_4 - \lambda_4 = 0 \quad \Longrightarrow \quad \lambda_4 = Q_4 x_4$$

Move to the previous equation (the $u_3$ row):

$$R_3 u_3 + B_3^\top \lambda_4 = R_3 u_3 + B_3^\top Q_4 x_4 = 0$$

Substitute $x_4 = A_3 x_3 + B_3 u_3$:

$$R_3 u_3 + B_3^\top Q_4 (A_3 x_3 + B_3 u_3) = 0$$

Solve for $u_3$:

$$u_3 = -\underbrace{(R_3 + B_3^\top Q_4 B_3)^{-1} B_3^\top Q_4 A_3}_{K_3}\, x_3$$

Deriving the Riccati Recursion (Cont'd)

Now consider the $x_3$ row:

$$Q_3 x_3 - \lambda_3 + A_3^\top \lambda_4 = 0$$

Substitute $\lambda_4 = Q_4 x_4$ and $x_4 = A_3 x_3 + B_3 u_3$:

$$Q_3 x_3 - \lambda_3 + A_3^\top Q_4 (A_3 x_3 + B_3 u_3) = 0$$

Substitute $u_3 = -K_3 x_3$:

$$Q_3 x_3 - \lambda_3 + A_3^\top Q_4 (A_3 x_3 - B_3 K_3 x_3) = 0$$

Solve for $\lambda_3$:

$$\lambda_3 = \underbrace{\left( Q_3 + A_3^\top Q_4 (A_3 - B_3 K_3) \right)}_{P_3} x_3$$

Riccati Recursion Formula

We now have a recursive relationship. Generalizing:

$$\begin{aligned}
P_N &= Q_N \\
K_k &= (R_k + B_k^\top P_{k+1} B_k)^{-1} B_k^\top P_{k+1} A_k \\
P_k &= Q_k + A_k^\top P_{k+1} (A_k - B_k K_k)
\end{aligned}$$

This is the celebrated Riccati equation.

The solution process involves:

  1. A backward Riccati pass to compute $P_k$ and $K_k$ for $k = N-1, \dots, 1$
  2. A forward rollout to compute $x_{1:N}$ and $u_{1:N-1}$ using the feedback law $u_k = -K_k x_k$ (both passes are sketched below)
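
A minimal numpy sketch of the two passes (my own illustration), again assuming time-invariant matrices:

```python
import numpy as np

def lqr_riccati(A, B, Q, R, QN, x1, N):
    """Backward Riccati pass, then forward rollout with u_k = -K_k x_k."""
    n, m = B.shape
    P = QN                                      # P_N = Q_N
    K = np.zeros((N - 1, m, n))
    for k in reversed(range(N - 1)):            # backward pass
        K[k] = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K[k])
    x = np.zeros((N, n))
    u = np.zeros((N - 1, m))
    x[0] = x1
    for k in range(N - 1):                      # forward rollout
        u[k] = -K[k] @ x[k]
        x[k + 1] = A @ x[k] + B @ u[k]
    return x, u, K
```

Note that the gains $K_k$ can be cached and reused for any initial state, which is exactly the advantage discussed below.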

Computational Complexity

Naive QP Solution: Treats the problem as one monolithic linear system.

  • Computational cost: $O[N^3(n+m)^3]$, where $n$ and $m$ are the state and input dimensions
  • Must be re-solved from scratch for any change.

Riccati Recursion: Exploits the temporal structure.

  • Computational cost: $O[N(n+m)^3]$
  • Dramatically faster for long horizons (large $N$): the speedup grows as $N^2$.

The Riccati Solution is More Than Just Fast:

  • It provides a ready-to-use feedback policy: $u_k = -K_k x_k$
  • This policy is adaptive: optimal for any initial state $x_1$, not just a single one.
  • It enables real-time control by naturally rejecting disturbances.
  • And it delivers the exact same optimal solution as the QP.

Conclusion

Summary

Finite-Horizon Problems

  • Use the Riccati recursion backward in time
  • Store the gain matrices $K_n$
  • Apply the time-varying feedback $u_n = -K_n x_n$

Infinite-Horizon Problems

  • Solve the algebraic Riccati equation offline
  • Use the constant gain matrix $K$
  • Implement the simple state feedback $u = -Kx$
  • Algebraic Riccati Equation (ARE), solved numerically in the sketch below:

$$P = Q + A^\top P A - A^\top P B (R + B^\top P B)^{-1} B^\top P A$$
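
A minimal numpy sketch (my own illustration) that solves the ARE by simply iterating the finite-horizon Riccati recursion until $P$ stops changing, which converges under the controllability and cost assumptions above:

```python
import numpy as np

def solve_dare_by_iteration(A, B, Q, R, tol=1e-10, max_iters=100_000):
    """Solve the discrete-time ARE by running the Riccati recursion to a fixed point."""
    P = Q.copy()
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(max_iters):
        # One backward Riccati step, identical to the finite-horizon recursion.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:   # converged to the stationary P
            return P_next, K
        P = P_next
    return P, K                                # best effort if not converged
```

As a cross-check, `scipy.linalg.solve_discrete_are(A, B, Q, R)` returns the same $P$, from which $K = (R + B^\top P B)^{-1} B^\top P A$.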