mrpro.algorithms.optimizers.pgd
- mrpro.algorithms.optimizers.pgd(f: Operator[Tensor, tuple[Tensor]] | Operator[Unpack[tuple[Tensor, ...]], tuple[Tensor]], g: ProximableFunctional | ProximableFunctionalSeparableSum, initial_value: Tensor | tuple[Tensor, ...], stepsize: float = 1.0, max_iterations: int = 128, backtrack_factor: float = 1.0, convergent_iterates_variant: bool = False, callback: Callable[[PGDStatus], None | bool] | None = None) → tuple[Tensor, ...]
Proximal gradient descent algorithm for solving the problem \(\min_x f(x) + g(x)\).
Here, \(f\) is convex and differentiable with an \(L\)-Lipschitz gradient, and \(g\) is convex, possibly non-smooth, with a computable proximal map.
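Schematically, each iteration takes a gradient step on \(f\) followed by the proximal map of \(g\), with momentum. Writing the stepsize as \(s\) and the extrapolation variable as \(t_k\), this is the FISTA scheme of [BE2009] (a sketch; the implementation's exact bookkeeping may differ slightly):

\[
x_k = \operatorname{prox}_{s g}\bigl(y_k - s \nabla f(y_k)\bigr), \qquad
t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}, \qquad
y_{k+1} = x_k + \frac{t_k - 1}{t_{k+1}} (x_k - x_{k-1}).
\]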
For a fixed stepsize \(t\), PGD converges globally when \(t \in (0, 1/L)\), where \(L\) is the Lipschitz constant of the gradient of \(f\). In applications, \(f\) is often of the form \(f(x) = \frac{1}{2} \|Ax - y\|^2\), where \(A\) is a linear operator; in this case, convergence requires \(t \in (0, 1/\|A\|_2^2)\). If no backtracking is used, the fixed stepsize should be chosen according to this convergence condition.
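When \(\|A\|_2\) is not known in closed form, it can be estimated numerically, for instance by power iteration on \(A^H A\). The helper below is a minimal, self-contained sketch in plain PyTorch (estimate_operator_norm is illustrative and not part of mrpro; A and AH are callables applying the operator and its adjoint):

import torch

def estimate_operator_norm(A, AH, x0, num_iterations=30):
    # Power iteration on A^H A to estimate its largest eigenvalue,
    # which equals ||A||_2^2.
    x = x0 / torch.linalg.vector_norm(x0)
    for _ in range(num_iterations):
        x = AH(A(x))
        x = x / torch.linalg.vector_norm(x)
    largest_eigenvalue = torch.vdot(x.flatten(), AH(A(x)).flatten()).real
    return largest_eigenvalue.sqrt()

# A stepsize satisfying the condition above, with a small safety margin:
# stepsize = 0.85 / estimate_operator_norm(A, AH, x0) ** 2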
Example
L1-regularized image reconstruction. Problem formulation: \(\min_x \frac{1}{2} \|Fx - y\|^2 + \lambda \|x\|_1\), with \(F\) being the Fourier transform, target denoting the acquired k-space data \(y\), and \(x\) the image, so that \(f(x) = \frac{1}{2} \|Fx - y\|^2\) and \(g(x) = \lambda \|x\|_1\). In this case, \(\|F^H F\| = 1\).
import torch
from mrpro.algorithms.optimizers import pgd
from mrpro.operators import FastFourierOp
from mrpro.operators.functionals import L1Norm, L2NormSquared

kspace_data = torch.randn(3, 10, 10, dtype=torch.complex64)
fft = FastFourierOp()
l2 = L2NormSquared(target=kspace_data)
f = l2 @ fft
reg_parameter = 0.01
g = reg_parameter * L1Norm()
operator_norm = 1.0
stepsize = 0.85 * 1 / operator_norm**2
initial_value = torch.ones(3, 10, 10)
pgd_image_solution = pgd(
    f=f,
    g=g,
    initial_value=initial_value,
    stepsize=stepsize,
    max_iterations=200,
    backtrack_factor=1.0,
)
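If the operator norm is unknown, the backtracking rule of [BE2009] can be used instead of a hand-tuned stepsize by passing backtrack_factor < 1; the stepsize argument then acts only as the initial stepsize. A sketch reusing the objects above (the values 1.0 and 0.75 are illustrative choices, not recommended defaults):

pgd_backtracking_solution = pgd(
    f=f,
    g=g,
    initial_value=initial_value,
    stepsize=1.0,  # initial stepsize, reduced automatically if too large
    max_iterations=200,
    backtrack_factor=0.75,  # < 1 enables the backtracking rule
)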
References
[BE2009] Beck A, Teboulle M (2009) A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences. https://www.ceremade.dauphine.fr/~carlier/FISTA
[CHAM2015] Chambolle A, Dossal C (2015) On the convergence of the iterates of "FISTA". Journal of Optimization Theory and Applications. https://inria.hal.science/hal-01060130v3
- Parameters:
  - f (Operator[Tensor, tuple[Tensor]] | Operator[Unpack[tuple[Tensor, ...]], tuple[Tensor]]) – convex, differentiable functional
  - g (ProximableFunctional | ProximableFunctionalSeparableSum) – convex, possibly non-smooth functional with computable proximal map
  - initial_value (Tensor | tuple[Tensor, ...]) – initial value for the solution of the algorithm
  - stepsize (float, default: 1.0) – stepsize used in the gradient step; constant if no backtracking is used, otherwise the initial stepsize
  - max_iterations (int, default: 128) – number of iterations
  - backtrack_factor (float, default: 1.0) – must be \(\leq 1\). If \(< 1\), the backtracking rule for the stepsize introduced by [BE2009] is used
  - convergent_iterates_variant (bool, default: False) – by default, the algorithm updates the variable t as originally described in [BE2009]. If set to True, t is updated as suggested by [CHAM2015], i.e. at iteration \(n\), \(t_n = \frac{n+a-1}{a}\) with \(a = 3\). This choice ensures the theoretical convergence of the iterates.
  - callback (Callable[[PGDStatus], None | bool] | None, default: None) – function called at each iteration; can be used to monitor the progress of the algorithm (see the sketch after the Returns section). If it returns False, the algorithm stops at that iteration.
- Returns:
an approximate solution of the minimization problem
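As an example of the callback parameter, the following sketch logs progress and stops early, reusing the objects from the example above. It assumes the status dictionary passed to the callback exposes an 'iteration_number' key (an assumption here; check PGDStatus for the exact fields available in your version):

def my_callback(status):
    # 'iteration_number' is assumed to be present, as in mrpro's generic optimizer status.
    iteration = status['iteration_number']
    if iteration % 10 == 0:
        print(f'PGD iteration {iteration}')
    # Returning False stops the algorithm at this iteration; returning None continues.
    return False if iteration >= 50 else None

pgd_monitored_solution = pgd(
    f=f,
    g=g,
    initial_value=initial_value,
    stepsize=stepsize,
    max_iterations=200,
    callback=my_callback,
)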