deepdraw.engine.adabound#
Implementation of the AdaBound optimizer.
Based on the reference implementation at <https://github.com/Luolc/AdaBound/blob/master/adabound/adabound.py>. Reference:
@inproceedings{Luo2019AdaBound,
author = {Luo, Liangchen and Xiong, Yuanhao and Liu, Yan and Sun, Xu},
title = {Adaptive Gradient Methods with Dynamic Bound of Learning Rate},
booktitle = {Proceedings of the 7th International Conference on Learning Representations},
month = {May},
year = {2019},
address = {New Orleans, Louisiana}
}
Classes

AdaBound – Implements the AdaBound algorithm.

AdaBoundW – Implements the AdaBound algorithm with decoupled weight decay (see https://arxiv.org/abs/1711.05101).
- class deepdraw.engine.adabound.AdaBound(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]#
Bases:
Optimizer
Implements the AdaBound algorithm.
- Parameters:
params (list) – Iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – Adam learning rate
betas (tuple, optional) – Coefficients (as a 2-tuple of floats) used for computing running averages of the gradient and its square
final_lr (float, optional) – Final (SGD) learning rate
gamma (float, optional) – Convergence speed of the bound functions
eps (float, optional) – Term added to the denominator to improve numerical stability
weight_decay (float, optional) – Weight decay (L2 penalty)
amsbound (bool, optional) – Whether to use the AMSBound variant of this algorithm
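The `final_lr` and `gamma` parameters control AdaBound's dynamic bounds: each step, the Adam-style step size is clipped into an interval that starts wide (Adam-like behaviour) and pinches towards `final_lr` (SGD-like behaviour). A minimal pure-Python sketch of that clipping, using the bound formulas from the reference implementation (for clarity it omits the rescaling of `final_lr` by the current/base learning-rate ratio that the real optimizer applies):

```python
def adabound_step_size(base_step, final_lr, gamma, step):
    """Clamp an Adam-style step size into AdaBound's dynamic bounds.

    Both bounds converge to final_lr as step grows, so the optimizer
    smoothly transitions from adaptive (Adam) to SGD-like updates.
    """
    lower = final_lr * (1 - 1 / (gamma * step + 1))
    upper = final_lr * (1 + 1 / (gamma * step))
    return min(max(base_step, lower), upper)

# Early in training the bounds are loose: a small Adam step passes through.
early = adabound_step_size(0.001, final_lr=0.1, gamma=0.001, step=1)

# Late in training the interval has pinched around final_lr = 0.1,
# so the step is pulled up to the lower bound (~0.0999).
late = adabound_step_size(0.001, final_lr=0.1, gamma=0.001, step=10**6)
```

With the defaults above (`gamma=0.001`), the bounds at step 1 are roughly `[1e-4, 100]`, so almost any adaptive step survives; by step 10^6 they have narrowed to roughly `[0.0999, 0.1001]`.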
- class deepdraw.engine.adabound.AdaBoundW(params, lr=0.001, betas=(0.9, 0.999), final_lr=0.1, gamma=0.001, eps=1e-08, weight_decay=0, amsbound=False)[source]#
Bases:
Optimizer
Implements the AdaBound algorithm with decoupled weight decay (see https://arxiv.org/abs/1711.05101).
- Parameters:
params (list) – Iterable of parameters to optimize or dicts defining parameter groups
lr (float, optional) – Adam learning rate
betas (tuple, optional) – Coefficients (as a 2-tuple of floats) used for computing running averages of the gradient and its square
final_lr (float, optional) – Final (SGD) learning rate
gamma (float, optional) – Convergence speed of the bound functions
eps (float, optional) – Term added to the denominator to improve numerical stability
weight_decay (float, optional) – Weight decay (L2 penalty)
amsbound (bool, optional) – Whether to use the AMSBound variant of this algorithm
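The difference between AdaBound and AdaBoundW is where `weight_decay` is applied. AdaBound folds the L2 penalty into the gradient before the adaptive step, so the decay is distorted by per-parameter scaling; AdaBoundW applies the decay directly to the parameter, independent of that scaling (the AdamW approach). A simplified scalar sketch of the two schemes (the function names are illustrative, and `scale` stands in for the full clipped adaptive step of the real optimizer):

```python
def coupled_decay_step(p, grad, scale, lr, wd):
    # AdaBound-style: the L2 penalty is added to the gradient first,
    # so it is then distorted by the adaptive per-parameter scale.
    g = grad + wd * p
    return p - lr * scale * g

def decoupled_decay_step(p, grad, scale, lr, wd):
    # AdaBoundW-style: the decay term bypasses the adaptive scaling
    # and shrinks the parameter by a fixed lr * wd fraction.
    return p - lr * scale * grad - lr * wd * p

# With a non-unit adaptive scale the two updates diverge, even on a
# zero gradient: coupled decay is weakened by scale, decoupled is not.
p_coupled = coupled_decay_step(1.0, grad=0.0, scale=0.5, lr=0.1, wd=0.01)
p_decoupled = decoupled_decay_step(1.0, grad=0.0, scale=0.5, lr=0.1, wd=0.01)
```

When `scale == 1` the two updates coincide; any adaptive rescaling of the gradient pulls them apart, which is exactly why the decoupled variant exists.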