mrpro.algorithms.optimizers.pgd
- mrpro.algorithms.optimizers.pgd(f: Operator[Tensor, tuple[Tensor]] | Operator[Unpack[tuple[Tensor, ...]], tuple[Tensor]], g: ProximableFunctional | ProximableFunctionalSeparableSum, initial_value: Tensor | tuple[Tensor, ...], stepsize: float = 1.0, max_iterations: int = 128, backtrack_factor: float = 1.0, convergent_iterates_variant: bool = False, callback: Callable[[PGDStatus], None | bool] | None = None) → tuple[Tensor, ...]
Proximal gradient descent algorithm for solving the problem \(\min_x f(x) + g(x)\).
Here, \(f\) is convex and differentiable with an \(L\)-Lipschitz gradient, and \(g\) is convex, possibly non-smooth, with a computable proximal map.
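Schematically, each iteration takes a gradient step on \(f\) followed by the proximal map of \(g\), with momentum. Writing the stepsize as \(s\) and the extrapolation variable as \(t_k\), this is the FISTA scheme of [BE2009] (a sketch; the implementation's exact bookkeeping may differ slightly):

\[
x_k = \operatorname{prox}_{s g}\bigl(y_k - s \nabla f(y_k)\bigr), \qquad
t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}, \qquad
y_{k+1} = x_k + \frac{t_k - 1}{t_{k+1}} (x_k - x_{k-1}).
\]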
For a fixed stepsize \(t\), PGD converges globally when \(t \in (0, 1/L)\), where \(L\) is the Lipschitz constant of the gradient of \(f\). In applications, \(f\) is often of the form \(f(x) = \frac{1}{2} \|Ax - y\|^2\), where \(A\) is a linear operator; in this case, convergence requires \(t \in (0, 1/\|A\|_2^2)\). If no backtracking is used, the fixed stepsize should be chosen according to this convergence condition.
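When \(\|A\|_2\) is not known in closed form, it can be estimated numerically, for instance by power iteration on \(A^H A\). The helper below is a minimal, self-contained sketch in plain PyTorch (estimate_operator_norm is illustrative and not part of mrpro; A and AH are callables applying the operator and its adjoint):

import torch

def estimate_operator_norm(A, AH, x0, num_iterations=30):
    # Power iteration on A^H A to estimate its largest eigenvalue,
    # which equals ||A||_2^2.
    x = x0 / torch.linalg.vector_norm(x0)
    for _ in range(num_iterations):
        x = AH(A(x))
        x = x / torch.linalg.vector_norm(x)
    largest_eigenvalue = torch.vdot(x.flatten(), AH(A(x)).flatten()).real
    return largest_eigenvalue.sqrt()

# A stepsize satisfying the condition above, with a small safety margin:
# stepsize = 0.85 / estimate_operator_norm(A, AH, x0) ** 2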
Example
L1-regularized image reconstruction. Problem formulation: \(\min_x \frac{1}{2} \|Fx - y\|^2 + \lambda \|x\|_1\), with \(F\) being the Fourier transform, target denoting the acquired k-space data \(y\), and \(x\) the image, so that \(f(x) = \frac{1}{2} \|Fx - y\|^2\) and \(g(x) = \lambda \|x\|_1\). In this case, \(\|F^H F\| = 1\).
import torch
from mrpro.algorithms.optimizers import pgd
from mrpro.operators import FastFourierOp
from mrpro.operators.functionals import L1Norm, L2NormSquared

kspace_data = torch.randn(3, 10, 10, dtype=torch.complex64)
fft = FastFourierOp()
l2 = L2NormSquared(target=kspace_data)
f = l2 @ fft
reg_parameter = 0.01
g = reg_parameter * L1Norm()
operator_norm = 1.0
stepsize = 0.85 * 1 / operator_norm**2
initial_value = torch.ones(3, 10, 10)
pgd_image_solution = pgd(
    f=f,
    g=g,
    initial_value=initial_value,
    stepsize=stepsize,
    max_iterations=200,
    backtrack_factor=1.0,
)
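If the operator norm is unknown, the backtracking rule of [BE2009] can be used instead of a hand-tuned stepsize by passing backtrack_factor < 1; the stepsize argument then acts only as the initial stepsize. A sketch reusing the objects above (the values 1.0 and 0.75 are illustrative choices, not recommended defaults):

pgd_backtracking_solution = pgd(
    f=f,
    g=g,
    initial_value=initial_value,
    stepsize=1.0,  # initial stepsize, reduced automatically if too large
    max_iterations=200,
    backtrack_factor=0.75,  # < 1 enables the backtracking rule
)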
References
[BE2009] Beck A, Teboulle M (2009) A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences. https://www.ceremade.dauphine.fr/~carlier/FISTA
[CHAM2015] Chambolle A, Dossal C (2015) On the convergence of the iterates of "FISTA". Journal of Optimization Theory and Applications. https://inria.hal.science/hal-01060130v3
- Parameters:
  - f (Operator[Tensor, tuple[Tensor]] | Operator[Unpack[tuple[Tensor, ...]], tuple[Tensor]]) – convex, differentiable functional
  - g (ProximableFunctional | ProximableFunctionalSeparableSum) – convex, possibly non-smooth functional with computable proximal map
  - initial_value (Tensor | tuple[Tensor, ...]) – initial value for the solution of the algorithm
  - stepsize (float, default: 1.0) – stepsize used in the gradient step; constant if no backtracking is used, otherwise the initial stepsize
  - max_iterations (int, default: 128) – number of iterations
  - backtrack_factor (float, default: 1.0) – must be \(\leq 1\). If \(< 1\), the backtracking rule for the stepsize introduced by [BE2009] is used
  - convergent_iterates_variant (bool, default: False) – by default, the algorithm updates the variable t as originally described in [BE2009]. If set to True, t is updated as suggested by [CHAM2015], i.e. at iteration \(n\), \(t_n = \frac{n+a-1}{a}\) with \(a = 3\). This choice ensures the theoretical convergence of the iterates.
  - callback (Callable[[PGDStatus], None | bool] | None, default: None) – function called at each iteration; can be used to monitor the progress of the algorithm (see the sketch after the Returns section). If it returns False, the algorithm stops at that iteration.
- Returns:
an approximate solution of the minimization problem
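As an example of the callback parameter, the following sketch logs progress and stops early, reusing the objects from the example above. It assumes the status dictionary passed to the callback exposes an 'iteration_number' key (an assumption here; check PGDStatus for the exact fields available in your version):

def my_callback(status):
    # 'iteration_number' is assumed to be present, as in mrpro's generic optimizer status.
    iteration = status['iteration_number']
    if iteration % 10 == 0:
        print(f'PGD iteration {iteration}')
    # Returning False stops the algorithm at this iteration; returning None continues.
    return False if iteration >= 50 else None

pgd_monitored_solution = pgd(
    f=f,
    g=g,
    initial_value=initial_value,
    stepsize=stepsize,
    max_iterations=200,
    callback=my_callback,
)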