warp.optim.Adam
- class warp.optim.Adam(params=None, lr=0.001, betas=(0.9, 0.999), eps=1e-08)
Adaptive Moment Estimation (Adam) optimizer.
Adam is an adaptive learning rate optimization algorithm that computes individual learning rates for different parameters from estimates of first and second moments of the gradients. This implementation is designed for GPU-accelerated parameter updates using Warp kernels.
The algorithm maintains exponential moving averages of the gradient (first moment) and the squared gradient (second moment), using bias correction to account for their initialization at zero.
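For reference, these moment estimates correspond to the textbook Adam update (Kingma & Ba, 2015); the sketch below shows the standard formulation for a parameter θ with gradient g at step t, and the Warp kernel may arrange the bias correction slightly differently:

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{m}_t &= m_t / (1 - \beta_1^t), \qquad \hat{v}_t = v_t / (1 - \beta_2^t) \\
\theta_t &= \theta_{t-1} - \mathrm{lr} \cdot \hat{m}_t / \left(\sqrt{\hat{v}_t} + \epsilon\right)
\end{aligned}
```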
The interface is similar to PyTorch’s torch.optim.Adam.
- Parameters:
- params – List of warp.array objects to optimize. Can be None and set later via set_params(). Supported dtypes are warp.float16, warp.float32, and warp.vec3.
- lr – Learning rate (step size).
- betas – Coefficients for computing running averages of the gradient and its square. Tuple of two floats (beta1, beta2), where beta1 is the exponential decay rate for the first-moment estimate and beta2 is the decay rate for the second-moment estimate.
- eps – Small constant added to the denominator for numerical stability.
Methods
- __init__([params, lr, betas, eps])
- set_params(params)
- step(grad)
- step_detail(g, m, v, lr, beta1, beta2, t, ...)
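A minimal usage sketch follows. Only the Adam constructor, set_params(), and step() come from the API above; the quadratic loss kernel and the training loop are illustrative assumptions, and wp.Tape is Warp's standard mechanism for obtaining gradients:

```python
import warp as wp
from warp.optim import Adam

wp.init()

@wp.kernel
def loss_kernel(x: wp.array(dtype=wp.float32), loss: wp.array(dtype=wp.float32)):
    # Illustrative quadratic loss: accumulate x_i^2 into a single scalar.
    i = wp.tid()
    wp.atomic_add(loss, 0, x[i] * x[i])

# Parameters to optimize must be differentiable Warp arrays.
x = wp.array([1.0, -2.0, 3.0], dtype=wp.float32, requires_grad=True)
loss = wp.zeros(1, dtype=wp.float32, requires_grad=True)

opt = Adam([x], lr=0.1)  # params could also be set later via opt.set_params([x])

for _ in range(100):
    loss.zero_()
    tape = wp.Tape()
    with tape:
        wp.launch(loss_kernel, dim=x.shape[0], inputs=[x, loss])
    tape.backward(loss)   # fills x.grad with d(loss)/dx
    opt.step([x.grad])    # one Adam update; gradients are passed per parameter
    tape.zero()           # reset gradients before the next iteration

print(x.numpy())  # values are driven toward zero
```

Note that step() takes the list of gradient arrays explicitly, one per optimized parameter, rather than reading them from the parameters themselves.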