warp.optim.SGD
- class warp.optim.SGD(params=None, lr=0.001, momentum=0.0, dampening=0.0, weight_decay=0.0, nesterov=False)
Stochastic Gradient Descent (SGD) optimizer with optional momentum.
This optimizer implements gradient descent with support for momentum, Nesterov accelerated gradient, and weight decay (L2 regularization).
The interface is similar to PyTorch’s torch.optim.SGD.
- Parameters:
  - params – List of warp.array objects to optimize. Can be None and set later via set_params().
  - lr – Learning rate (step size).
  - momentum – Momentum factor for accelerating SGD in relevant directions.
  - dampening – Dampening factor applied to the momentum.
  - weight_decay – Weight decay coefficient (L2 regularization).
  - nesterov – Whether to use Nesterov momentum. Requires momentum > 0 and dampening = 0.
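These parameters correspond to the conventional SGD-with-momentum update. As a sketch, assuming the same formulation as torch.optim.SGD (whose interface this class mirrors), with learning rate \gamma, momentum \mu, dampening \tau, and weight decay \lambda, the per-step update for a parameter \theta is:

    g_t = \nabla_\theta f(\theta_{t-1}) + \lambda \theta_{t-1}
    b_t = \mu b_{t-1} + (1 - \tau) g_t
    g_t \leftarrow g_t + \mu b_t \ \text{(if nesterov)}, \qquad g_t \leftarrow b_t \ \text{(otherwise)}
    \theta_t = \theta_{t-1} - \gamma g_t

The exact internal implementation in Warp may differ in detail; this is only the standard rule these hyperparameters conventionally denote.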
- __init__(params=None, lr=0.001, momentum=0.0, dampening=0.0, weight_decay=0.0, nesterov=False)
Methods
__init__([params, lr, momentum, dampening, ...])
set_params(params)
step(grad)
step_detail(g, b, lr, momentum, dampening, ...)
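A minimal usage sketch follows. It is not taken from the official examples: the kernel square_loss, the array sizes, and the hyperparameters are illustrative, and it assumes step() accepts a list of gradient arrays matching the params list passed at construction.

    import warp as wp
    from warp.optim import SGD

    wp.init()

    # Toy objective: minimize the squared norm of x.
    @wp.kernel
    def square_loss(x: wp.array(dtype=float), loss: wp.array(dtype=float)):
        i = wp.tid()
        wp.atomic_add(loss, 0, x[i] * x[i])

    x = wp.array([1.0, -2.0, 3.0], dtype=float, requires_grad=True)
    loss = wp.zeros(1, dtype=float, requires_grad=True)

    opt = SGD([x], lr=0.1, momentum=0.9, nesterov=True)

    for _ in range(100):
        loss.zero_()
        tape = wp.Tape()
        with tape:
            wp.launch(square_loss, dim=x.shape[0], inputs=[x, loss])
        tape.backward(loss=loss)

        # Assumption: step() takes the gradient arrays corresponding to
        # the params list, in the same order.
        opt.step([x.grad])

        tape.zero()

    print(x.numpy())  # values should approach zero

The gradients are produced here with wp.Tape in the usual Warp differentiation workflow; any other mechanism that fills warp.array gradients of the same shapes as the parameters should work equally well with step().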