warp.optim.SGD
- class warp.optim.SGD(params=None, lr=0.001, momentum=0.0, dampening=0.0, weight_decay=0.0, nesterov=False)
Stochastic Gradient Descent (SGD) optimizer with optional momentum.
This optimizer implements gradient descent with support for momentum, Nesterov accelerated gradient, and weight decay (L2 regularization).
The interface is similar to PyTorch’s torch.optim.SGD.
- Parameters:
params – List of warp.array objects to optimize. Can be None and set later via set_params().
lr – Learning rate (step size).
momentum – Momentum factor for accelerating SGD in relevant directions.
dampening – Dampening factor applied to the momentum.
weight_decay – Weight decay coefficient (L2 regularization).
nesterov – Whether to use Nesterov momentum. Requires momentum > 0 and dampening = 0.
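A minimal usage sketch, assuming Warp's standard autodiff workflow (wp.Tape and arrays created with requires_grad=True); the quadratic loss kernel and training loop are illustrative placeholders, not part of this class:

```python
import warp as wp
import warp.optim

wp.init()

# Illustrative objective: minimize the sum of squares of the parameter vector.
@wp.kernel
def loss_kernel(x: wp.array(dtype=float), loss: wp.array(dtype=float)):
    i = wp.tid()
    wp.atomic_add(loss, 0, x[i] * x[i])

x = wp.array([1.0, -2.0, 3.0], dtype=float, requires_grad=True)
loss = wp.zeros(1, dtype=float, requires_grad=True)

opt = warp.optim.SGD([x], lr=0.1, momentum=0.9, nesterov=True)

for _ in range(100):
    loss.zero_()
    tape = wp.Tape()
    with tape:
        wp.launch(loss_kernel, dim=x.shape[0], inputs=[x, loss])
    tape.backward(loss=loss)
    opt.step([x.grad])  # gradients passed in the same order as params
    tape.zero()         # clear accumulated gradients before the next iteration
```

Because nesterov=True, the sketch uses a nonzero momentum and the default dampening of 0.0, as required by the constraint above.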
- __init__(params=None, lr=0.001, momentum=0.0, dampening=0.0, weight_decay=0.0, nesterov=False)
Reset momentum buffers and timestep to zero.
Methods
__init__([params, lr, momentum, dampening, ...]) – Reset momentum buffers and timestep to zero.
set_params(params) – Set parameters to optimize and allocate momentum buffers.
step(grad) – Apply one SGD step using the provided gradients.
step_detail(g, b, lr, momentum, dampening, ...) – Apply an SGD update to a single parameter array.
- set_params(params)
Set parameters to optimize and allocate momentum buffers.
- Parameters:
params – List of warp.array objects to optimize, or None.
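A brief sketch of deferred parameter assignment (the arrays x and y are hypothetical placeholders created elsewhere):

```python
import warp.optim

opt = warp.optim.SGD(lr=1e-2, momentum=0.9)  # construct without params
# ... later, once the parameter arrays exist:
opt.set_params([x, y])  # allocates momentum buffers matching x and y
```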
- step(grad)
Apply one SGD step using the provided gradients.
- Parameters:
grad – List of gradient arrays matching params.
- static step_detail(g, b, lr, momentum, dampening, weight_decay, nesterov, t, params)
Apply an SGD update to a single parameter array.
- Parameters:
g – Gradient array.
b – Momentum buffer.
lr – Learning rate.
momentum – Momentum factor.
dampening – Momentum dampening factor.
weight_decay – Weight decay coefficient.
nesterov – Whether to use Nesterov momentum.
t – Current step index.
params – Parameter array to update in-place.
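Presumably, step() delegates to this static method for each parameter array. A hypothetical direct call is sketched below; x is a placeholder wp.array with requires_grad=True, and buf stands in for the momentum buffer that set_params() would normally allocate (assumed here to be an array of the same shape and dtype as the parameter):

```python
import warp as wp
import warp.optim

buf = wp.zeros_like(x)  # momentum buffer, same shape/dtype as the parameter
warp.optim.SGD.step_detail(
    g=x.grad,          # gradient of x
    b=buf,             # momentum buffer, carried across steps
    lr=1e-2,
    momentum=0.9,
    dampening=0.0,
    weight_decay=0.0,
    nesterov=False,
    t=0,               # step index (0 for the first update)
    params=x,          # parameter array, updated in place
)
```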