POGS Architecture¶
Technical overview of POGS architecture and design.
Overview¶
POGS is built around the Alternating Direction Method of Multipliers (ADMM) algorithm for solving convex optimization problems.
ADMM Algorithm¶
POGS solves problems in graph form:
where \(f\) and \(g\) are convex, separable functions.
Algorithm Steps¶
The ADMM algorithm iterates:
-
x-update (proximal operator for g): $$ x^{k+1} = \text{prox}_{g,\rho}(x^k - AT\lambdak) $$
-
y-update (proximal operator for f): $$ y^{k+1} = \text{prox}_{f,\rho}(Ax^{k+1} + \lambda^k/\rho) $$
-
Dual update: $$ \lambda^{k+1} = \lambda^k + \rho(Ax^{k+1} - y^{k+1}) $$
Convergence¶
Stops when both primal and dual residuals are small:
- Primal residual: \(r = Ax - y\)
- Dual residual: \(s = \rho A^T(y^{k+1} - y^k)\)
Code Structure¶
High-Level Organization¶
src/
├── common/ # Shared code
│ ├── admm_state.hpp # RAII memory management
│ └── admm_core.hpp # Common ADMM logic (future)
├── cpu/ # CPU implementation
│ ├── pogs.cpp # Main ADMM solver
│ ├── projector.cu # Projection operations
│ └── include/
│ ├── gsl/ # GSL wrappers (BLAS/LAPACK)
│ └── anderson.h # Anderson acceleration (experimental)
├── gpu/ # GPU implementation
│ ├── pogs.cu # GPU ADMM solver
│ └── kernels.cuh # CUDA kernels
└── interface_c/ # C interface
├── pogs_c.h
└── pogs_c.cpp
Key Classes¶
Modern C++20 API (include/pogs/):
Legacy API (src/cpu/pogs.cpp):
Matrix Abstraction¶
Matrix Interface¶
template<typename M>
concept Matrix = requires(M m) {
{ m.rows() } -> std::convertible_to<size_t>;
{ m.cols() } -> std::convertible_to<size_t>;
};
Implementations¶
Dense Matrix (src/include/matrix/matrix_dense.h): - Column-major or row-major storage - BLAS/LAPACK operations - O(mn) memory
Sparse Matrix (src/include/matrix/matrix_sparse.h): - CSR/CSC format - Sparse matrix-vector products - O(nnz) memory
Proximal Operators¶
Function Library¶
Location: src/include/prox_lib.h
Each function type has a proximal operator:
Supported Functions¶
- Indicators: IndEq0, IndGe0, IndLe0, IndBox01
- Norms: Abs, Square
- Nonlinear: Logistic, Huber, Exp, NegLog
- Other: Identity, Zero, MaxPos0, MaxNeg0, Recipr, NegEntr
Implementation Pattern¶
template<typename T>
inline void ProxAbs(T rho, T *x) {
// Soft thresholding
T threshold = 1.0 / rho;
if (*x > threshold) {
*x -= threshold;
} else if (*x < -threshold) {
*x += threshold;
} else {
*x = 0;
}
}
Cone Projections¶
Cone Library¶
Location: src/include/prox_lib_cone.h
Projections onto convex cones:
Supported Cones¶
- Zero: \(\{x : x = 0\}\) → projection: \(x \gets 0\)
- Non-negative: \(\{x : x \geq 0\}\) → projection: \(x \gets \max(x, 0)\)
- SOC: \(\{(t,x) : \|x\|_2 \leq t\}\) → analytical formula
- SDP: \(\{X : X \succeq 0\}\) → eigenvalue decomposition
SDP Projection¶
Algorithm (lines 144-230 in prox_lib_cone.h):
1. Compute eigenvalue decomposition: X = V*Λ*V^T
2. Project eigenvalues: Λ_+ = max(Λ, 0)
3. Reconstruct: X_+ = V*Λ_+*V^T
Uses LAPACK: dsyevd (double) or ssyevd (float).
Linear Algebra Backend¶
GSL Wrapper¶
Location: src/cpu/include/gsl/gsl_linalg.h
Provides unified interface to BLAS/LAPACK:
- Matrix-vector:
gemv - Matrix-matrix:
gemm - Eigenvalues:
syevd - QR decomposition:
geqrf,orgqr
Platform-Specific¶
macOS: Uses Accelerate framework Linux: Uses OpenBLAS or ATLAS Windows: Uses Intel MKL or OpenBLAS
Memory Management¶
Modern C++20 (include/pogs/)¶
RAII Everywhere:
Smart Pointers:
auto solver = pogs::make_solver<double>(std::move(matrix));
// Automatic cleanup when solver goes out of scope
Legacy Code (src/cpu/)¶
Manual Management (being modernized):
Adaptive Parameters¶
Adaptive ρ¶
Adjusts penalty parameter based on residual balance:
if (primal_residual > 10 * dual_residual) {
rho *= 2.0; // Increase ρ
} else if (dual_residual > 10 * primal_residual) {
rho /= 2.0; // Decrease ρ
}
Prevents oscillation and improves convergence.
Over-Relaxation¶
Uses relaxation parameter α = 1.7:
Can speed up convergence for some problems.
GPU Implementation¶
CUDA Kernels¶
Location: src/gpu/kernels.cuh
Parallel proximal operators:
__global__ void prox_abs_kernel(T *x, T rho, size_t n) {
size_t idx = blockIdx.x * blockDim.x + threadIdx.x;
if (idx < n) {
// Soft thresholding
ProxAbs(rho, &x[idx]);
}
}
cuBLAS/cuSOLVER¶
Uses NVIDIA libraries for linear algebra: - cuBLAS: Matrix operations - cuSOLVER: Eigenvalue decomposition (future)
Build System¶
CMake Structure¶
pogs/
├── CMakeLists.txt # Root
├── src/CMakeLists.txt # Library
├── tests/CMakeLists.txt # Tests
└── examples/CMakeLists.txt # Examples
Targets¶
- pogs::cpu: CPU solver library
- pogs::gpu: GPU solver library (optional)
- test_cone: CPU cone solver tests
- test_sdp: SDP cone projection tests
Testing¶
Test Files¶
tests/test_cone.cpp: Cone projection teststests/test_sdp.cpp: SDP-specific testsexamples/: Integration tests
Running Tests¶
Future Improvements¶
Phase 3+ (Planned)¶
- Code Deduplication: Extract common ADMM core from CPU/GPU
- Policy-Based Design: Backend traits for CPU/GPU
- Concepts: Template constraints for type safety
- std::span: Safe array views throughout
- Coroutines: Async solver interface (C++20)
Research Features¶
- Anderson Acceleration: Improve integration with ADMM
- Warm Starting: Better initialization strategies
- Problem Detection: Automatic parameter selection
- Preconditioning: Improve conditioning
See Also¶
- Building from Source - Build instructions
- Contributing - How to contribute
- Modernization - Modernization progress