`garage.torch.optimizers.differentiable_sgd`

Differentiable Stochastic Gradient Descent Optimizer.

Useful for algorithms such as MAML that need the gradient of a function of the post-update parameters with respect to the pre-update parameters.
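To make that gradient concrete, here is a minimal sketch in plain Python (no PyTorch or garage involved). It assumes a scalar parameter and a toy loss `theta ** 2` for both the inner and outer objective; all function names are hypothetical and chosen only for this illustration. The chain-rule result is compared against a finite-difference estimate.

```python
def inner_loss_grad(theta):
    # Gradient of the assumed toy inner loss L(theta) = theta ** 2.
    return 2.0 * theta


def sgd_update(theta, lr):
    # One SGD step, written as a function of the pre-update parameter.
    return theta - lr * inner_loss_grad(theta)


def outer_loss(theta, lr):
    # Outer objective F evaluated at the post-update parameter.
    theta_new = sgd_update(theta, lr)
    return theta_new ** 2


def meta_gradient(theta, lr):
    # Chain rule: dF/dtheta = F'(theta_new) * d(theta_new)/d(theta),
    # with d(theta_new)/d(theta) = 1 - lr * L''(theta) = 1 - 2 * lr here.
    theta_new = sgd_update(theta, lr)
    return 2.0 * theta_new * (1.0 - 2.0 * lr)


def finite_difference(theta, lr, eps=1e-6):
    # Central-difference estimate of the same derivative, used as a check.
    return (outer_loss(theta + eps, lr) - outer_loss(theta - eps, lr)) / (2 * eps)
```

The factor `1 - 2 * lr` is the part that is lost when the update is applied in place, because the graph linking the new parameter to the old one is then never recorded.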

class DifferentiableSGD(module, lr=0.001)

Differentiable Stochastic Gradient Descent.

DifferentiableSGD performs the same optimization step as SGD. Instead of updating parameters in place, it stores the updated parameters in new tensors, so that the gradient of any function of the new parameters can flow back to the pre-update parameters.
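The out-of-place update described above can be sketched with plain PyTorch autograd. This is an illustration of the idea only, not garage's implementation: it works on a single tensor instead of a module, and the toy quadratic losses and variable names are assumptions made for the example.

```python
import torch

lr = 0.1
# Pre-update parameters (a bare tensor stands in for a module's parameters).
theta = torch.tensor([1.0, -2.0], requires_grad=True)

# Inner loss and its gradient. create_graph=True keeps the gradient itself
# differentiable, so second-order terms are available later.
inner_loss = (theta ** 2).sum()
(grad,) = torch.autograd.grad(inner_loss, theta, create_graph=True)

# Out-of-place SGD step: theta_new is a new tensor that is still connected
# to theta in the autograd graph.
theta_new = theta - lr * grad

# Any function of the new parameters can now be differentiated with respect
# to the pre-update parameters.
outer_loss = (theta_new ** 2).sum()
(meta_grad,) = torch.autograd.grad(outer_loss, theta)
```

An in-place update such as `theta.data -= lr * grad` would leave no graph edge from the new values back to `theta`, so the final `torch.autograd.grad` call would have nothing to differentiate through.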

Parameters

module (torch.nn.Module) – A torch module whose parameters need to be optimized.
lr (float) – Learning rate of stochastic gradient descent.

step(): Take an optimization step.

zero_grad(): Sets gradients of all model parameters to zero.

set_grads_none()

Sets gradients for all model parameters to None.

This is an alternative to zero_grad, which sets gradients to zero.
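The difference between the two resets can be shown with plain PyTorch. The sketch below does not call garage's methods; it manipulates `.grad` directly on a small, hypothetical `torch.nn.Linear` layer to show what each state looks like.

```python
import torch

# A small layer with populated gradients.
layer = torch.nn.Linear(2, 1)
layer(torch.ones(1, 2)).sum().backward()

# Zeroing keeps each gradient tensor allocated and fills it with zeros.
for p in layer.parameters():
    p.grad.zero_()
zeroed = [p.grad for p in layer.parameters()]

# Setting to None drops the gradient tensors entirely; the next backward
# pass allocates fresh ones instead of accumulating into the old ones.
for p in layer.parameters():
    p.grad = None
cleared = [p.grad for p in layer.parameters()]
```

Code that reads `.grad` afterwards must handle the `None` case explicitly, which is the main practical difference between the two approaches.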
