PyTorch optimizers.

class ConjugateGradientOptimizer(params, max_constraint_value, cg_iters=10, max_backtracks=15, backtrack_ratio=0.8, hvp_reg_coeff=1e-05, accept_violation=False)

Bases: torch.optim.Optimizer


Performs constrained optimization via backtracking line search.

The search direction is computed using a conjugate gradient algorithm, which gives x = A^{-1}g, where A is a second-order approximation of the constraint and g is the gradient of the loss function.

  • params (iterable) – Iterable of parameters to optimize.

  • max_constraint_value (float) – Maximum constraint value.

  • cg_iters (int) – Number of CG iterations used to calculate A^{-1}g.

  • max_backtracks (int) – Maximum number of iterations for the backtracking line search.

  • backtrack_ratio (float) – Backtrack ratio for the backtracking line search.

  • hvp_reg_coeff (float) – A small regularization coefficient so that A -> A + reg*I; used in the Hessian-vector product calculation.

  • accept_violation (bool) – Whether to accept the descent step if it violates the line search condition after exhausting all backtracking budgets.
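The x = A^{-1}g computation can be illustrated with a minimal conjugate gradient solver. This is a generic sketch that solves A x = g given only a matrix-vector product (the same access pattern a Hessian-vector product provides); function and variable names are illustrative, not garage's internals:

```python
def conjugate_gradient(matvec, g, cg_iters=10, residual_tol=1e-10):
    """Approximately solve A x = g given only matvec(v) = A @ v."""
    x = [0.0] * len(g)
    r = list(g)                      # residual g - A x, with x = 0 initially
    p = list(r)                      # current search direction
    rdotr = sum(ri * ri for ri in r)
    for _ in range(cg_iters):
        Ap = matvec(p)
        alpha = rdotr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        new_rdotr = sum(ri * ri for ri in r)
        if new_rdotr < residual_tol:
            break
        p = [ri + (new_rdotr / rdotr) * pi for ri, pi in zip(r, p)]
        rdotr = new_rdotr
    return x

# Example: A = [[4, 1], [1, 3]] (symmetric positive definite), g = [1, 2]
x = conjugate_gradient(lambda v: [4 * v[0] + v[1], v[0] + 3 * v[1]], [1.0, 2.0])
# x is approximately A^{-1}g = [1/11, 7/11]
```

In the optimizer itself, `matvec` would be a Hessian-vector product of the constraint (regularized by `hvp_reg_coeff`), so A never needs to be materialized.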

property state

The hyper-parameters of the optimizer.



step(f_loss, f_constraint)

Take an optimization step.

  • f_loss (callable) – Function to compute the loss.

  • f_constraint (callable) – Function to compute the constraint value.
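The backtracking half of the step can be sketched as follows. This is a hedged, generic version of a backtracking line search under a constraint, mirroring the `max_backtracks` and `backtrack_ratio` parameters above; the function names are illustrative, not garage's internals:

```python
def backtracking_line_search(f_loss, f_constraint, params, descent_step,
                             max_constraint_value,
                             max_backtracks=15, backtrack_ratio=0.8):
    """Shrink the step until the loss improves and the constraint holds."""
    loss_before = f_loss(params)
    ratio = 1.0
    for _ in range(max_backtracks):
        candidate = [p - ratio * s for p, s in zip(params, descent_step)]
        if (f_loss(candidate) < loss_before
                and f_constraint(candidate) <= max_constraint_value):
            return candidate
        ratio *= backtrack_ratio   # back off and try a smaller step
    return params  # budget exhausted; a real implementation may raise here

# Toy problem: minimize x^2, subject to the constraint |x - 2| <= 1.9
new_params = backtracking_line_search(
    f_loss=lambda p: p[0] ** 2,
    f_constraint=lambda p: abs(p[0] - 2.0),
    params=[2.0],
    descent_step=[4.0],      # e.g. a CG direction scaled by a step size
    max_constraint_value=1.9,
)
# The full step of 4.0 violates the constraint, so the search backs off
# until the candidate both lowers the loss and stays feasible.
```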

class DifferentiableSGD(module, lr=0.001)

Differentiable Stochastic Gradient Descent.

DifferentiableSGD performs the same optimization step as SGD, but instead of updating parameters in-place, it saves updated parameters in new tensors, so that the gradient of functions of new parameters can flow back to the pre-updated parameters.

  • module (torch.nn.Module) – A torch module whose parameters need to be optimized.

  • lr (float) – Learning rate of stochastic gradient descent.


step()

Take an optimization step.


zero_grad()

Sets gradients of all model parameters to zero.


set_grads_none()

Sets gradients of all model parameters to None.

This is an alternative to zero_grad, which sets gradients to zero.
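The point of the out-of-place update can be shown with a scalar sketch (a pure-Python stand-in for autograd; all names here are illustrative). Because the adapted parameter remains a function of the original one, a meta-objective evaluated after the step still has a well-defined gradient with respect to the pre-update parameter:

```python
def inner_grad(w):
    # gradient of an inner loss L_in(w) = w**2
    return 2.0 * w

def adapted(w, lr):
    # out-of-place SGD step: the result is still a function of w,
    # unlike an in-place update that overwrites w
    return w - lr * inner_grad(w)

def outer_loss(w, lr):
    # meta-objective evaluated at the adapted parameter
    return (adapted(w, lr) - 1.0) ** 2

def outer_grad(w, lr):
    # chain rule through the update:
    # d outer / d w = 2 * (adapted(w) - 1) * d adapted / d w,
    # and here d adapted / d w = 1 - 2 * lr
    return 2.0 * (adapted(w, lr) - 1.0) * (1.0 - 2.0 * lr)

# finite-difference check that the meta-gradient flows to the
# pre-update parameter
w, lr, eps = 0.5, 0.1, 1e-6
fd = (outer_loss(w + eps, lr) - outer_loss(w - eps, lr)) / (2 * eps)
```

In PyTorch this chain rule is what autograd computes automatically, provided the update writes to a new tensor instead of mutating the parameter in place.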

class OptimizerWrapper(optimizer, module, max_optimization_epochs=1, minibatch_size=None)

A wrapper class to handle torch.optim.Optimizer.

  • optimizer (Union[type, tuple[type, dict]]) – Type of optimizer for policy. This can be an optimizer type such as torch.optim.Adam, or a tuple of a type and a dictionary, where the dictionary contains arguments used to initialize the optimizer, e.g. (torch.optim.Adam, {'lr': 1e-3}).

  • module (torch.nn.Module) – Module to be optimized.

  • max_optimization_epochs (int) – Maximum number of epochs for update.

  • minibatch_size (int) – Batch size for optimization.


get_minibatch(*inputs)

Yields a batch of inputs.

Notes: P is the minibatch size (self._minibatch_size).


  • *inputs (list[torch.Tensor]) – A list of inputs. Each input has shape \((N \cdot [T], *)\).



Yields a list of minibatches of the inputs; each minibatch has shape \((P, *)\).
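The epoch/minibatch behavior can be sketched as a generator. This is a hedged stand-in, not garage's implementation: it batches plain Python lists by index rather than torch.Tensors, and the function name is illustrative:

```python
import random

def minibatches(inputs, max_optimization_epochs=1, minibatch_size=None):
    """Yield minibatches of size P over N aligned inputs, per epoch."""
    n = len(inputs[0])                 # N: number of samples
    size = minibatch_size or n         # P: minibatch size (default: full batch)
    for _ in range(max_optimization_epochs):
        order = list(range(n))
        random.shuffle(order)          # fresh random order each epoch
        for start in range(0, n, size):
            idx = order[start:start + size]
            # slice every input with the same indices so they stay aligned
            yield [[inp[i] for i in idx] for inp in inputs]

# Two aligned inputs of length N = 10, batched with P = 4 over 2 epochs
obs = list(range(10))
acts = list(range(10, 20))
batches = list(minibatches([obs, acts],
                           max_optimization_epochs=2, minibatch_size=4))
# 3 minibatches per epoch (4 + 4 + 2 samples), 6 minibatches total
```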


zero_grad()

Clears the gradients of all optimized torch.Tensors.


step(**closure)

Performs a single optimization step.


  • **closure (callable, optional) – A closure that reevaluates the model and returns the loss.