garage.torch.optimizers.differentiable_sgd
Differentiable Stochastic Gradient Descent Optimizer.
Useful for algorithms such as MAML that need the gradient of functions of post-update parameters with respect to pre-update parameters.
class DifferentiableSGD(module, lr=0.001)

Differentiable Stochastic Gradient Descent.
DifferentiableSGD performs the same optimization step as SGD, but instead of updating parameters in place, it saves the updated parameters in new tensors, so that gradients of functions of the new parameters can flow back to the pre-update parameters.
Parameters

module (torch.nn.Module) – A torch module whose parameters need to be optimized.
lr (float) – Learning rate of stochastic gradient descent.
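The snippet below is a minimal sketch of the behavior described above. It assumes DifferentiableSGD can be imported from garage.torch.optimizers and that step() installs the out-of-place updates back into the module as new tensors; the linear module and data are illustrative placeholders only:

    import torch
    from torch import nn
    from garage.torch.optimizers import DifferentiableSGD

    net = nn.Linear(3, 1)
    opt = DifferentiableSGD(net, lr=0.01)

    # Keep a handle on a pre-update parameter (a leaf tensor).
    old_weight = net.weight

    loss = net(torch.randn(5, 3)).sum()
    loss.backward(create_graph=True)  # keep the gradients differentiable
    opt.step()

    # The module now holds new parameter tensors that are functions of the
    # old ones, so they carry a grad_fn instead of being graph leaves.
    assert net.weight is not old_weight
    assert net.weight.grad_fn is not None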
step(self)

Take an optimization step.
zero_grad(self)

Sets gradients of all model parameters to zero.
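Together, step() and zero_grad() support a MAML-style inner loop. The sketch below is illustrative rather than taken from the garage codebase: it assumes the import path used above, and the policy, data, and learning rate are placeholders. The post-update (outer) loss is differentiated with respect to the pre-update parameters, which is the second-order term MAML needs:

    import torch
    from torch import nn
    from garage.torch.optimizers import DifferentiableSGD

    policy = nn.Linear(4, 2)
    inner_opt = DifferentiableSGD(policy, lr=0.1)

    # Pre-update parameters, kept so the outer loss can be differentiated
    # with respect to them after adaptation.
    pre_update_params = list(policy.parameters())

    x, y = torch.randn(8, 4), torch.randn(8, 2)

    # Inner (adaptation) step: create_graph=True keeps the gradients
    # differentiable, so the parameters produced by step() stay connected
    # to pre_update_params in the autograd graph.
    inner_opt.zero_grad()
    inner_loss = ((policy(x) - y) ** 2).mean()
    inner_loss.backward(create_graph=True)
    inner_opt.step()

    # Outer loss evaluated with the post-update parameters.
    outer_loss = ((policy(x) - y) ** 2).mean()

    # Gradient of the post-update loss with respect to the
    # pre-update parameters.
    meta_grads = torch.autograd.grad(outer_loss, pre_update_params)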