This is a high level description of the Alternating Direction Method of Multipliers (ADMM) specific to graph form problems. For more detail we recommend reading the papers in the references section.

### Augmented Lagrangian

Consider the convex optimization problem

\begin{equation} \begin{aligned} &\text{minimize} & & f(x) \\
& \text{subject to} & & x \in \mathcal{C}, \end{aligned} ~~~~~~~~~~~~~~~~~~~~~~~~~\text{(1)} \end{equation}

where $$f$$ is a convex function and $$\mathcal{C}$$ is a convex set. This is equivalent to

\begin{aligned} &\text{minimize} & & f(x) + I(z \in \mathcal{C}) \\ & \text{subject to} & & x = z, \end{aligned} ~~~~~~~~~~\text{(2)}

where $$I(z \in \mathcal{C})$$ is the indicator function of the convex set $$\mathcal{C}$$. We can augment the objective by $$(\rho/2) \|x - z\|_2^2$$ without changing the problem:

\begin{aligned} &\text{minimize} & & f(x) + I(z \in \mathcal{C}) + (\rho/2) \|x - z\|_2^2 \\ & \text{subject to} & & x = z. \end{aligned}

Introducing a Lagrange multiplier $$y$$ for the constraint $$x = z$$ yields

$\inf_{x,z} \sup_y~~f(x) + I(z \in \mathcal{C}) + y^T(x - z) + (\rho/2) \|x - z\|_2^2.$

The function $$L_\rho(x,y,z) = f(x) + I(z \in \mathcal{C}) + y^T(x - z) + (\rho/2) \|x - z\|_2^2$$ is known as the augmented Lagrangian of (2). Note that the supremum over $$y$$ is $$+\infty$$ unless $$x = z$$, so the $$\inf$$-$$\sup$$ problem above is equivalent to (2).

### Alternating Direction Method of Multipliers

Consider the problem of finding $$(x^\star, y^\star, z^\star)$$ such that

$(x^\star, y^\star, z^\star) = \arg\inf_{x,z} \sup_y L_\rho(x, y, z).$

Solving this problem is equivalent to solving the original problem and is therefore just as hard. ADMM alternates between minimizing $$L_\rho$$ over $$x$$, then over $$z$$, and lastly taking a gradient ascent step in the dual variable $$y$$. The algorithm can be written as

\begin{aligned} x^{k+1} &=\arg\min_x~L_\rho(x, y^k, z^k) \\ z^{k+1} &=\arg\min_z~L_\rho(x^{k+1}, y^k, z) \\ y^{k+1} &= y^k + \rho (x^{k+1} - z^{k+1}), \end{aligned}

which can be simplified to

\begin{aligned} x^{k+1} &=\mathbf{prox}_f(z^k - (1 / \rho) y^k) \\ z^{k+1} &=\Pi_\mathcal{C}(x^{k+1} + (1/\rho)y^{k}) \\ y^{k+1} &= y^k + \rho (x^{k+1} - z^{k+1}), \end{aligned}

where $$\Pi_\mathcal{C}(\cdot)$$ is the orthogonal projection onto the convex set $$\mathcal{C}$$ and $$\mathbf{prox}_f$$ is the proximal operator, defined as

$\mathbf{prox}_{f}(v) = \arg\min_{x}\Big(f(x) + (\rho/2) \|x - v\|_2^2\Big).$
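As a concrete check of this definition, consider $$f(x) = \lambda |x|$$, whose proximal operator is the well-known soft-thresholding function. The following Python sketch (the names `prox_l1` and `lam` are ours for illustration, not from POGS) compares the closed form against a brute-force minimization of the defining objective:

```python
import numpy as np

def prox_l1(v, lam, rho):
    # Soft-thresholding: closed-form proximal operator of f(x) = lam * |x|
    # under the rho-scaled definition above.
    return np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)

# Brute-force check against prox_f(v) = argmin_x ( f(x) + (rho/2)(x - v)^2 ).
v, lam, rho = 0.7, 1.0, 2.0
xs = np.linspace(-3, 3, 600001)
objective = lam * np.abs(xs) + (rho / 2) * (xs - v) ** 2
x_brute = xs[np.argmin(objective)]
print(prox_l1(v, lam, rho), x_brute)  # both close to 0.2
```

The closed form and the brute-force minimizer agree, which is a useful sanity check when implementing new proximal operators.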

Note that this is a simplified version of ADMM.
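The simplified iteration above can be sketched end to end on a toy instance of (1): minimize $$(1/2)\|x - b\|_2^2$$ subject to $$x \in [0,1]^n$$, whose solution is $$b$$ clipped to the box. All names below are illustrative, not POGS internals:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 10, 1.0
b = rng.normal(size=n) * 2.0

prox_f = lambda v: (b + rho * v) / (1.0 + rho)   # prox of f(x) = (1/2)||x - b||^2
proj_C = lambda v: np.clip(v, 0.0, 1.0)          # projection onto the box [0, 1]^n

x = np.zeros(n)
z = np.zeros(n)
y = np.zeros(n)
for _ in range(500):
    x = prox_f(z - y / rho)      # x-update: proximal step
    z = proj_C(x + y / rho)      # z-update: projection
    y = y + rho * (x - z)        # dual (gradient ascent) step

print(np.allclose(z, np.clip(b, 0.0, 1.0), atol=1e-6))  # → True
```

At convergence $$x = z$$ holds and $$z$$ equals the solution of the constrained problem.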

### ADMM applied to Graph Form Problems

To apply ADMM to graph form problems, use the equivalence table:

\begin{aligned} &\text{Problem (1)} & && \text{Graph Form} \\ &x & \leftrightarrow && (x, y) \\ &f & \leftrightarrow && f + g \\ &\mathcal{C} & \leftrightarrow && \{ (x, y) ~|~ y = Ax \} \end{aligned}

Two further simplifications are possible:

• The constraint set in graph form problems is affine, and as a result, the projection $$\Pi_\mathcal{C}(\cdot)$$ can be expressed as a matrix multiplication.
• We make the assumption that $$f$$ and $$g$$ are separable, which means the proximal operator splits coordinate-wise: $\mathbf{prox}_f(v) = \big(\mathbf{prox}_{f_1}(v_1), \ldots, \mathbf{prox}_{f_m}(v_m)\big).$ In POGS, we introduce a library of proximal operators (see `<pogs>/src/prox_lib.h`), thereby hiding the implementation details from the user.
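Putting these together, one standard way to carry out the projection onto the graph $$\{(x, y) \mid y = Ax\}$$ is a linear solve with the fixed matrix $$I + A^TA$$: given a point $$(c, d)$$, take $$x = (I + A^TA)^{-1}(c + A^Td)$$ and $$y = Ax$$. A minimal numpy sketch (variable names are ours, not POGS internals):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
A = rng.normal(size=(m, n))
c, d = rng.normal(size=n), rng.normal(size=m)

K = np.eye(n) + A.T @ A          # fixed across iterations, so it can be factored once
x = np.linalg.solve(K, c + A.T @ d)
y = A @ x

# Projection check: the residual (c - x, d - y) must be orthogonal to the
# subspace, which for the graph of A means (c - x) + A^T (d - y) = 0.
print(np.allclose((c - x) + A.T @ (d - y), 0.0))  # → True
```

Since $$K$$ never changes, applying its cached inverse or factorization each iteration amounts to the matrix multiplication mentioned above.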

### Pros and Cons of using ADMM

Reasons why ADMM can be much faster than conventional methods:

• Alternating minimization makes it possible to perform much of the work in parallel.
• By using factorization caching, a large portion of the work can be re-used between iterations.
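Factorization caching can be sketched as follows: the projection solves with the same matrix $$K = I + A^TA$$ at every iteration, so its Cholesky factor is computed once up front and only cheap solves remain in the loop. This is a hedged numpy sketch, not the POGS implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 50, 20
A = rng.normal(size=(m, n))

K = np.eye(n) + A.T @ A
L = np.linalg.cholesky(K)        # factored ONCE, before the ADMM loop

def solve_cached(rhs):
    # Two triangular solves reuse the cached factor L (L @ L.T == K).
    # np.linalg.solve is generic; a dedicated triangular solver would be
    # used in a real implementation to exploit the structure of L.
    return np.linalg.solve(L.T, np.linalg.solve(L, rhs))

rhs = rng.normal(size=n)
print(np.allclose(K @ solve_cached(rhs), rhs))  # → True
```

Each iteration then pays only for back-substitution rather than a fresh factorization.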

Reasons to be cautious:

• The convergence rate is highly dependent on the choice of $$\rho$$.