This page contains general usage information about POGS. For specific instructions about installing and running the C++ code or MATLAB and R wrappers, see the Setup section on the left-hand side.

Input Variables

The solver takes 3 input variables.

  • 1
    A
    
    is the data matrix in the constraint \(y = Ax\). Currently only dense matrices are supported, but a sparse matrix implementation is on its way.
  • 1
    f
    
    and
    1
    g
    
    are structures representing the terms \(f(y)\) and \(g(x)\) in the objective function. It is assumed that \(f(y)\) and \(g(x)\) are separable and that each \(f_i(y_i)\) (resp. \(g_j(x_j)\)) can be represented by 5 parameters.

Parameters

The solver takes 6 parameters. They are all optional.

  • 1
    rel_tol
    
    (default
    1
    1e-3
    
    ). The relative stopping criteria (see section below on stopping conditions).
  • 1
    abs_tol
    
    (default
    1
    1e-4
    
    ). The absolute stopping criteria (see section below on stopping conditions).
  • 1
    rho
    
    (default
    1
    1.0
    
    ). See section section below on choosing \( \rho \).
  • 1
    adaptive_rho
    
    (default
    1
    true
    
    ). Adaptively choose
    1
    rho
    
    . If you have found a good value of rho that works for your problem, then it may make more sense to turn off adaptive rho updating. Otherwise it is highly recommended to leave this turned on.
  • 1
    max_iter
    
    (default
    1
    1000
    
    ). The maximum number of iterations before the solver terminates. Most problems should terminate within ~500 iterations, but there is no guarantee.
  • 1
    quiet
    
    (default
    1
    false
    
    ). Displays progress information every 10th iteration if set to
    1
    false
    
    .

Output Variables

The solver returns 3 solution variables.

  • 1
    x
    
    and
    1
    y
    
    are the primal solution variable \(x^\star\\text{pogs}\) and \(y^\star\\text{pogs}\).
  • 1
    l
    
    is the dual solution variable \(\lambda^\star\_\text{pogs}\) corresponding to the constraint \(Ax = y\).

Warm Starting / Factors

POGS pre-computes various factors relating to the matrix \(A\) before beginning to minimize \(f + g\) (this is known as factorization caching). As a result the first iteration will take a long time compared to subsequent iterations. Therefore, if you plan to solve multiple optimization problems with the same matrix \(A\) but different functions \(f\) and \(g\), then POGS can perform the factorization once and re-use it in subsequent iterations. To use this feature simply feed the factors back in the next time you call POGS. The exact process depends on whether you’re using C++ or one of the wrappers.

Stopping Criteria

The stopping criteria used by POGS is similar to the one used by DIMACS and the SeDuMi solver. They are

\[ \begin{aligned} \|r^{k}\| & \leq \epsilon_{\text{abs}} \sqrt{m} +\epsilon_{\text{rel}} \| (x^k, y^k) \| \\
\|s^{k}\| & \leq \epsilon_{\text{abs}} \sqrt{n} +\epsilon_{\text{rel}} \| (\lambda^k, \mu^k) \| \\
\ | p^k - d^k| & \leq \epsilon_{\text{abs}} + \epsilon_{\text{rel}} | p^k | \\
\end{aligned} \]

where \( r^k = Ax^k - y^k \) and \( s^k = A^T\lambda^k + \mu^k \) are the primal resp. dual residuals at the \( k \)’th iteration. The scalars \( p^k \) and \( d^k \) are the primal and dual optimal values, making \( p^k - d^k \) a pseduo-gap.

Choosing \( \rho \)

The convergence rate of POGS is tightly coupled with the choice of penalty parameter \( \rho \). Optimal choices of \( \rho \) can range from \( 10^{-3} \) to \( 10^{3} \) and in general objective functions with lower curvature are more sensitive to the choice of \( \rho \). We recommend letting POGS automatically choose \( \rho \). Advanced users may prefer to choose \( \rho \) themselves for best performance.

\(h\)-Functions

A total of 16 different functions that can be specified for \(h\). They are

kAbs – Absolute Value
Function: \(h(x) = |x|\).
Domain: \( \mathbf{R} \).
Uses: \(L_1\)-norm problems (eg. to promote sparsity).
kNegEntr – Negative Entropy
Function: \(h(x) = -x \log(x)\).
Domain: \( \mathbf{R}_+ \).
Uses: Useful in problems concerning probability distributions.
kExp – Exponential
Function: \(h(x) = e^x\).
Domain: \( \mathbf{R} \).
Uses: Useful as soft version of \(I(x \leq 0) \).
kHuber – Huber Loss
Function: \( h(x) = \left\{\begin{aligned} &(1/2)x^2 & |x| \leq 1 \\\ &|x| - (1/2) & |x| > 1 \end{aligned} \right. \).
Domain: \( \mathbf{R} \).
Uses: Robust estimation (use as robust alternative to \((1/2)x^2\)).
kIdentity – Identity
Function: \(h(x) = x\).
Domain: \( \mathbf{R} \).
Uses: Typical in linear programs and quadratic programs.
kIndBox01 – Indicator of [0, 1]-Box
Function: \(h(x) = I(0 \leq x \leq 1)\).
Domain: \( [0, 1] \).
Uses: Box constraint.
kIndEq0 – Indicator of the Origin
Function: \(h(x) = I(x = 0)\).
Domain: \( \{ 0 \} \).
Uses: Equality constraint.
kIndGe0 – Indicator of the Non-Negative Orthant
Function: \(h(x) = I(x \geq 0)\).
Domain: \( \mathbf{R}_+ \).
Uses: Inequality constraint.
kIndLe0 – Indicator of the Non-Positive Orthant
Function: \(h(x) = I(x \leq 0)\).
Domain: \( \mathbf{R}_- \).
Uses: Inequality constraint.
kLogistic – Integral of the Logistic Function
Function: \(h(x) = \log(1 + e^x)\).
Domain: \( \mathbf{R} \).
Uses: Logistic Regression.
kMaxNeg0 – Negative Part
Function: \( h(x) = \max(0, -x) \).
Domain: \( \mathbf{R} \).
Uses: Hingle loss (eg. SVM).
kMaxPos0 – Positive Part
Function: \( h(x) = \max(0, x) \).
Domain: \( \mathbf{R} \).
Uses: Hingle loss (eg. SVM).
kNegLog – Negative Log
Function: \( h(x) = -\log(x) \).
Domain: \( \mathbf{R}_+ \).
Uses: Analytic centering or barrier function (eg. for solving convex subproblem).
kSquare – Square
Function: \( h(x) = (1/2)x^2 \).
Domain: \( \mathbf{R} \).
Uses: \(L_2\)-norm problems (eg. least squares, ridge regression, etc.).
kRecipr – Reciprocal
Function: \( h(x) = 1/x \).
Domain: \( \mathbf{R}_+ \).
Uses: Analytic centering or barrier function (eg. for solving convex subproblem).
kZero – Zero
Function: \( h(x) = 0 \).
Domain: \( \mathbf{R} \).
Uses: Objective does not depend on specific term.