warp.optim.Adam#

class warp.optim.Adam(params=None, lr=0.001, betas=(0.9, 0.999), eps=1e-08)[source]#

Adaptive Moment Estimation (Adam) optimizer.

Adam is an adaptive learning rate optimization algorithm that computes individual learning rates for different parameters from estimates of first and second moments of the gradients. This implementation is designed for GPU-accelerated parameter updates using Warp kernels.

The algorithm maintains exponential moving averages of the gradient (first moment) and the squared gradient (second moment), using bias correction to account for their initialization at zero.
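
For reference, these moment estimates feed into the standard Adam update rule. The following plain-Python sketch shows that rule for a single scalar parameter; the names and scalar formulation are illustrative only, since the actual update runs inside a Warp kernel over whole arrays:

    import math

    def adam_update(param, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        # Standard Adam update for one scalar parameter; t is the 1-based step count.
        m = beta1 * m + (1.0 - beta1) * g          # first-moment (mean) estimate
        v = beta2 * v + (1.0 - beta2) * g * g      # second-moment (uncentered variance) estimate
        m_hat = m / (1.0 - beta1**t)               # bias-corrected first moment
        v_hat = v / (1.0 - beta2**t)               # bias-corrected second moment
        param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
        return param, m, v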

The interface is similar to PyTorch’s torch.optim.Adam; a usage sketch follows the parameter list below.

Parameters:
  • params – List of warp.array objects to optimize. Can be None and set later via set_params(). Supported dtypes are warp.float16, warp.float32, and warp.vec3.

  • lr – Learning rate (step size).

  • betas – Coefficients used to compute running averages of the gradient and its square. Tuple of two floats (beta1, beta2), where beta1 is the exponential decay rate for the first-moment estimate and beta2 is the exponential decay rate for the second-moment estimate.

  • eps – Small constant added to the denominator to improve numerical stability.
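
A minimal usage sketch: optimize a toy quadratic objective, with wp.Tape providing the gradients. The kernel, array contents, iteration count, and learning rate below are illustrative, not part of the class:

    import warp as wp
    from warp.optim import Adam

    wp.init()

    @wp.kernel
    def loss_kernel(x: wp.array(dtype=wp.float32), loss: wp.array(dtype=wp.float32)):
        tid = wp.tid()
        wp.atomic_add(loss, 0, x[tid] * x[tid])   # loss = sum of x_i^2

    x = wp.array([1.0, -2.0, 3.0], dtype=wp.float32, requires_grad=True)
    loss = wp.zeros(1, dtype=wp.float32, requires_grad=True)
    opt = Adam([x], lr=0.1)

    for _ in range(100):
        loss.zero_()
        tape = wp.Tape()
        with tape:
            wp.launch(loss_kernel, dim=3, inputs=[x, loss])
        tape.backward(loss)
        opt.step([x.grad])   # gradient arrays in the same order as params
        tape.zero()          # clear accumulated gradients before the next iteration

The list passed to step() must contain one gradient array per optimized parameter, in the same order as params; here that is x.grad, which Warp allocates because x was created with requires_grad=True.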

__init__(params=None, lr=0.001, betas=(0.9, 0.999), eps=1e-08)[source]#

Methods

__init__([params, lr, betas, eps])

reset_internal_state()

set_params(params)

step(grad)

step_detail(g, m, v, lr, beta1, beta2, t, ...)

set_params(params)[source]#
reset_internal_state()[source]#
step(grad)[source]#
static step_detail(g, m, v, lr, beta1, beta2, t, eps, params)[source]#
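
step_detail() is exposed as a static method. The call below is a hedged sketch only: it assumes step_detail() performs the update for a single parameter array, with g, m, v, and params being individual warp.array objects of matching shape and dtype and the update applied to params in place. This is inferred from the signature rather than documented above:

    import warp as wp
    from warp.optim import Adam

    wp.init()

    x = wp.array([0.5, -1.5], dtype=wp.float32)   # parameter array (hypothetical values)
    g = wp.array([1.0, 2.0], dtype=wp.float32)    # gradient of the loss w.r.t. x
    m = wp.zeros_like(x)                          # externally managed first-moment buffer
    v = wp.zeros_like(x)                          # externally managed second-moment buffer

    # Assumed semantics: one Adam update of x in place, at timestep t = 0.
    Adam.step_detail(g, m, v, 1e-2, 0.9, 0.999, 0, 1e-8, x)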