## ADMM

This is a high-level description of the Alternating Direction Method of Multipliers (ADMM) as it applies to graph form problems. For more detail we recommend reading the papers in the references section.

### Augmented Lagrangian

Consider the convex optimization problem

\begin{equation}
\begin{aligned}
&\text{minimize}
& & f(x) \\

& \text{subject to}
& & x \in \mathcal{C},
\end{aligned} ~~~~~~~~~~~~~~~~~~~~~~~~~\text{(1)}
\end{equation}

where \( f \) is a convex function and \( \mathcal{C} \) is a convex set. This is equivalent to

\[
\begin{aligned}
&\text{minimize}
& & f(x) + I(z \in \mathcal{C}) \\

& \text{subject to}
& & x = z,
\end{aligned} ~~~~~~~~~~\text{(2)}
\]

where \( I(z \in \mathcal{C}) \) is the indicator function of the convex set \(\mathcal{C}\). We can augment the objective by \( (\rho/2) \|x - z\|^2 \) without changing the problem

\[
\begin{aligned}
&\text{minimize}
& & f(x) + I(z \in \mathcal{C}) + (\rho/2) \|x - z\|^2 \\

& \text{subject to}
& & x = z.
\end{aligned}
\]

Rewriting the problem using Lagrange multipliers yields

\[ \inf_{x,z} \sup_y~~f(x) + I(z \in \mathcal{C}) + y^T(x - z) + (\rho/2) \|x - z\|^2. \]

The function \( L_\rho(x,y,z) = f(x) + I(z \in \mathcal{C}) + y^T(x - z) + (\rho/2) \|x - z\|^2 \) is known as the *augmented Lagrangian* of (2).

### Alternating Direction Method of Multipliers

Consider the problem of finding \( (x^\star, y^\star, z^\star) \) such that

\[ (x^\star, y^\star, z^\star) = \arg\inf_{x,z} \sup_y L_\rho(x, y, z). \]

Solving this saddle-point problem is equivalent to solving the original problem and is therefore just as hard. ADMM consists of alternately minimizing \( L_\rho \) over \( x \), then over \( z \), and lastly taking a gradient ascent step with respect to \( y \). The algorithm can be written as

\[
\begin{aligned}
x^{k+1} &=\arg\min_x~L_\rho(x, y^k, z^k) \\

z^{k+1} &=\arg\min_z~L_\rho(x^{k+1}, y^k, z) \\

y^{k+1} &= y^k + \rho (x^{k+1} - z^{k+1}),
\end{aligned}
\]

which can be simplified to

\[
\begin{aligned}
x^{k+1} &=\mathbf{prox}_f(z^k - (1 / \rho) y^k) \\

z^{k+1} &=\Pi_\mathcal{C}(x^{k+1} + (1/\rho)y^{k}) \\

y^{k+1} &= y^k + \rho (x^{k+1} - z^{k+1}),
\end{aligned}
\]

where \( \Pi_\mathcal{C}(\cdot) \) is the orthogonal projection onto the convex set \( \mathcal{C} \) and \( \mathbf{prox}_f\) is the *proximal operator*, defined as

\[ \mathbf{prox}_{f}(v) = \arg\min_{x}\Big(f(x) + (\rho/2) \|x - v\|_2^2\Big). \]
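For a concrete instance of this definition (not part of POGS itself), take \( f(x) = \lambda \|x\|_1 \). Its proximal operator has a closed form, elementwise soft-thresholding at level \( \lambda/\rho \):

```python
import numpy as np

def prox_l1(v, lam, rho):
    """Proximal operator of f(x) = lam * ||x||_1 with penalty parameter rho.

    Solves argmin_x  lam * ||x||_1 + (rho/2) * ||x - v||^2, whose minimizer
    is elementwise soft-thresholding of v at level lam / rho.
    """
    t = lam / rho
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
```

For example, `prox_l1(np.array([2.0, -0.5, 0.1]), lam=1.0, rho=1.0)` shrinks every entry toward zero by 1 and zeros out the entries smaller than the threshold.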

*Note*, this is a simplified version of ADMM.
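To make the iteration concrete, here is a minimal NumPy sketch (independent of POGS) of the simplified ADMM above, applied to minimizing \( \|x\|_1 \) over the affine set \( \mathcal{C} = \{x \mid Ax = b\} \), for which the projection has the closed form \( \Pi_\mathcal{C}(v) = v - A^T (A A^T)^{-1} (A v - b) \):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 30
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[:3] = [1.0, -2.0, 0.5]       # sparse ground truth
b = A @ x_true

rho = 1.0
# (A A^T)^{-1} is computed once, outside the loop.
AAt_inv = np.linalg.inv(A @ A.T)

def proj_C(v):
    """Orthogonal projection onto {x : A x = b}."""
    return v - A.T @ (AAt_inv @ (A @ v - b))

def prox_l1(v):
    """Prox of ||x||_1 with penalty parameter rho (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - 1.0 / rho, 0.0)

x = z = y = np.zeros(n)
for _ in range(1000):
    x = prox_l1(z - y / rho)        # x-update: prox of f
    z = proj_C(x + y / rho)         # z-update: projection onto C
    y = y + rho * (x - z)           # dual gradient ascent step
```

At convergence `z` is feasible (\( Az = b \) holds exactly by construction of the projection) and the primal residual \( \|x - z\| \) is small.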

### ADMM applied to Graph Form Problems

To apply ADMM to graph form problems, use the equivalence table

\[
\begin{aligned}
&\text{Problem (1)} & && \text{Graph Form} \\

&x & \leftrightarrow && (x, y) \\

&f & \leftrightarrow && f + g \\

&\mathcal{C} & \leftrightarrow && \{ (x, y) ~|~ y = Ax \}
\end{aligned}
\]
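Substituting these correspondences into the simplified iteration above gives (with the dual variable renamed to \( (\lambda, \mu) \), since \( y \) now denotes a primal variable, and \( (\tilde x, \tilde y) \) playing the role of \( z \)):

\[
\begin{aligned}
(x^{k+1}, y^{k+1}) &= \mathbf{prox}_{f+g}\big((\tilde x^k, \tilde y^k) - (1/\rho)(\lambda^k, \mu^k)\big) \\
(\tilde x^{k+1}, \tilde y^{k+1}) &= \Pi_\mathcal{C}\big((x^{k+1}, y^{k+1}) + (1/\rho)(\lambda^k, \mu^k)\big) \\
(\lambda^{k+1}, \mu^{k+1}) &= (\lambda^k, \mu^k) + \rho\big((x^{k+1}, y^{k+1}) - (\tilde x^{k+1}, \tilde y^{k+1})\big),
\end{aligned}
\]

where \( \mathcal{C} = \{ (x, y) ~|~ y = Ax \} \).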

Two further simplifications are possible:

- The constraint set in graph form problems is affine, and as a result, the projection \( \Pi_\mathcal{C}(\cdot) \) can be expressed as a matrix multiplication.
- We assume that \( f \) and \( g \) are separable, which means the proximal operator can be evaluated componentwise:
\[
\mathbf{prox}_f(v) = \big(\mathbf{prox}_{f_1}(v_1), \ldots, \mathbf{prox}_{f_m}(v_m)\big).
\]
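To see why the projection reduces to linear algebra, note that projecting a point \( (c, d) \) onto \( \mathcal{C} = \{ (x, y) ~|~ y = Ax \} \) means minimizing \( \|x - c\|^2 + \|Ax - d\|^2 \) over \( x \), whose normal equations give \( x = (I + A^T A)^{-1}(c + A^T d) \) and \( y = Ax \). A minimal NumPy sketch (not the POGS implementation):

```python
import numpy as np

def proj_graph(A, c, d):
    """Project (c, d) onto the graph {(x, y) : y = A x}.

    Minimizes ||x - c||^2 + ||y - d||^2 subject to y = A x; the normal
    equations yield x = (I + A^T A)^{-1} (c + A^T d) and y = A x.
    """
    n = A.shape[1]
    x = np.linalg.solve(np.eye(n) + A.T @ A, c + A.T @ d)
    return x, A @ x
```

The returned pair satisfies \( y = Ax \) by construction, and optimality can be checked against the normal equations.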
In POGS, we include a library of proximal operators (see `<pogs>/src/prox_lib.h`), thereby hiding the implementation details from the user.

### Pros and Cons of using ADMM

There are several reasons why ADMM can be much faster than conventional methods:

- Alternating minimization makes it possible to perform much of the work in parallel.
- By using factorization caching, a large portion of the work can be re-used between iterations.
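Factorization caching refers to factoring the matrix in the projection step once, before the iterations begin, so that every subsequent projection costs only back-substitution. A hypothetical NumPy sketch of the idea (POGS's actual implementation is in its C++/CUDA backends):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 20
A = rng.standard_normal((m, n))

# Factor K = I + A^T A once, at O(n^3) cost, before iterating.
K = np.eye(n) + A.T @ A
L = np.linalg.cholesky(K)           # K = L L^T

def solve_cached(rhs):
    """Solve K x = rhs by reusing the cached Cholesky factor L.

    A dedicated triangular solver would make each call O(n^2); a generic
    solve is used here for simplicity.
    """
    w = np.linalg.solve(L, rhs)     # forward substitution: L w = rhs
    return np.linalg.solve(L.T, w)  # back substitution: L^T x = w

# Each ADMM iteration calls solve_cached with a new right-hand side,
# never re-factoring K.
```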

Some considerations when using ADMM:

- The convergence rate is highly dependent on the choice of \( \rho \).
- Convergence can be slow if more than 3-4 digits of accuracy are required.