garage.torch

PyTorch-backed modules and algorithms.

as_torch_dict(array_dict)

Convert a dict whose values are numpy arrays to PyTorch tensors.

Modifies array_dict in place.

Parameters

array_dict (dict) – Dictionary of data in numpy arrays

Returns

Dictionary of data in PyTorch tensors

Return type

dict

compute_advantages(discount, gae_lambda, max_episode_length, baselines, rewards)

Calculate advantages.

Advantages are computed from a baseline according to Generalized Advantage Estimation (GAE), as a discounted cumulative sum.

The discounted cumulative sum can be computed using conv2d with the following filter:

[1, (discount * gae_lambda), (discount * gae_lambda) ^ 2, …], where the filter has length max_episode_length.

baselines and rewards have the same shape.

baselines: [ [b_11, b_12, b_13, … b_1n],
             [b_21, b_22, b_23, … b_2n],
             …
             [b_m1, b_m2, b_m3, … b_mn] ]

rewards: [ [r_11, r_12, r_13, … r_1n],
           [r_21, r_22, r_23, … r_2n],
           …
           [r_m1, r_m2, r_m3, … r_mn] ]

Parameters
  • discount (float) – RL discount factor (i.e. gamma).

  • gae_lambda (float) – Lambda, as used for Generalized Advantage Estimation (GAE).

  • max_episode_length (int) – Maximum length of a single episode.

  • baselines (torch.Tensor) – A 2D vector of value function estimates with shape (N, T), where N is the batch dimension (number of episodes) and T is the maximum episode length experienced by the agent. If an episode terminates in fewer than T time steps, the remaining elements in that episode should be set to 0.

  • rewards (torch.Tensor) – A 2D vector of per-step rewards with shape (N, T), where N is the batch dimension (number of episodes) and T is the maximum episode length experienced by the agent. If an episode terminates in fewer than T time steps, the remaining elements in that episode should be set to 0.

Returns

A 2D vector of calculated advantage values with shape (N, T), where N is the batch dimension (number of episodes) and T is the maximum episode length experienced by the agent. If an episode terminates in fewer than T time steps, the remaining values in that episode should be set to 0.

Return type

torch.Tensor
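
The computation above can be illustrated without conv2d. The following is a minimal plain-Python sketch, not the library's implementation: it assumes the standard GAE residual delta_t = r_t + discount * b_(t+1) - b_t, with the baseline after the final step taken as 0. The name gae_advantages is hypothetical, and nested lists stand in for the (N, T) tensors.

```python
def gae_advantages(discount, gae_lambda, baselines, rewards):
    """Plain-Python GAE for a batch of zero-padded episodes (lists of lists)."""
    advantages = []
    for b_row, r_row in zip(baselines, rewards):
        # Shift baselines left by one step; the value after the last step is 0.
        b_next = list(b_row[1:]) + [0.0]
        # One-step residuals: delta_t = r_t + discount * b_{t+1} - b_t.
        deltas = [r + discount * bn - b
                  for r, bn, b in zip(r_row, b_next, b_row)]
        # Discounted cumulative sum with factor (discount * gae_lambda),
        # accumulated from the last time step backwards.
        row = [0.0] * len(deltas)
        running = 0.0
        for t in reversed(range(len(deltas))):
            running = deltas[t] + discount * gae_lambda * running
            row[t] = running
        advantages.append(row)
    return advantages


# One episode of three steps, zero baselines, reward 1 at every step.
adv = gae_advantages(1.0, 1.0,
                     baselines=[[0.0, 0.0, 0.0]],
                     rewards=[[1.0, 1.0, 1.0]])
```

The backward loop applies the same weights as the filter [1, (discount * gae_lambda), (discount * gae_lambda) ^ 2, …] described above, one time step at a time.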

expand_var(name, item, n_expected, reference)

Expand a variable to an expected length.

This is used to handle arguments to primitives that can all be reasonably set to the same value, or multiple different values.

Parameters
  • name (str) – Name of variable being expanded.

  • item (any) – Element being expanded.

  • n_expected (int) – Number of elements expected.

  • reference (str) – Source of n_expected.

Returns

List of references to item or item itself.

Return type

list

Raises

ValueError – If the variable is a sequence but its length is not 1 or n_expected.
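
The behaviour described above can be sketched as follows. This is a hypothetical re-implementation written from the description, not the library's source; the name expand_var_sketch, the error message, and the example argument names ('strides', 'paddings', 'hidden_channels') are assumptions.

```python
def expand_var_sketch(name, item, n_expected, reference):
    """Hypothetical re-implementation of the behaviour documented above."""
    if isinstance(item, (list, tuple)):
        if len(item) == n_expected:
            return item  # already the expected length, returned unchanged
        if len(item) == 1:
            return list(item) * n_expected  # repeat the single element
        raise ValueError(
            f'{name} must be a scalar or have length 1 or {n_expected} '
            f'(the length of {reference}), but has length {len(item)}')
    return [item] * n_expected  # scalar: n_expected references to item


# A single value shared by three layers, and one value per layer.
strides = expand_var_sketch('strides', 2, 3, 'hidden_channels')
paddings = expand_var_sketch('paddings', [0, 1, 1], 3, 'hidden_channels')
```

Here strides becomes a list of three references to the same value, while paddings already has the expected length and is returned unchanged.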

filter_valids(tensor, valids)

Filter a tensor using valids (the last indices of the valid values).

valids contains the last index of each row.

Parameters
  • tensor (torch.Tensor) – The tensor to filter.

  • valids (list[int]) – List of the lengths of the valid values in each row.

Returns

Filtered Tensor

Return type

torch.Tensor
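
As an illustration of the filtering, here is a minimal sketch that uses nested lists in place of a tensor. It assumes that each entry of valids is the number of valid leading values in the corresponding row; the name filter_valids_sketch is hypothetical, and the sketch returns a list of rows, which may differ from the library's return type.

```python
def filter_valids_sketch(rows, valids):
    """Keep only the first valids[i] values of row i (plain-Python stand-in)."""
    return [row[:valid] for row, valid in zip(rows, valids)]


# Two zero-padded episodes with 3 and 2 valid steps respectively.
padded = [[1, 2, 3, 0], [4, 5, 0, 0]]
filtered = filter_valids_sketch(padded, [3, 2])
```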

flatten_batch(tensor)

Flatten a batch of observations.

Reshape a tensor of size (X, Y, Z) into (X*Y, Z)

Parameters

tensor (torch.Tensor) – Tensor to flatten.

Returns

Flattened tensor.

Return type

torch.Tensor

flatten_to_single_vector(tensor)

Collapse the C x H x W values per representation into a single long vector.

Reshape a tensor of size (N, C, H, W) into (N, C * H * W).

Parameters

tensor (torch.Tensor) – Batch of data.

Returns

Reshaped view of that data (analogous to numpy.reshape)

Return type

torch.Tensor

global_device()

Returns the global device that torch.Tensors should be placed on.

Note: The global device is set by calling garage.torch._functions.set_gpu_mode. If this function is never called, garage.torch._functions.device() returns None.

Returns

The global device that newly created torch.Tensors should be placed on.

Return type

torch.device

class NonLinearity(non_linear)

Bases: torch.nn.Module

Wrapper class for a non-linear function or module.

Parameters

non_linear (callable or type) – Non-linear function or type to be wrapped.

forward(input_value)

Forward method.

Parameters

input_value (torch.Tensor) – Input values

Returns

Output value

Return type

torch.Tensor

np_to_torch(array)

Convert a numpy array to a PyTorch tensor.

Parameters

array (np.ndarray) – Data in numpy array.

Returns

float tensor on the global device.

Return type

torch.Tensor

output_height_2d(layer, height)

Compute the output height of a torch.nn.Conv2d, assuming NCHW format.

This requires knowing the input height. Because NCHW format makes height and width very easy to mix up, this is a separate function from output_width_2d.

It also works on torch.nn.MaxPool2d.

This function implements the formula described in the torch.nn.Conv2d documentation: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

Parameters
  • layer (torch.nn.Conv2d) – The layer to compute output size for.

  • height (int) – The height of the input image.

Returns

The height of the output image.

Return type

int
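
The formula referenced above can be written out as a short sketch. This is plain Python over integers rather than a call into the library; the name conv_output_size and its keyword arguments are hypothetical and mirror the kernel_size, stride, padding and dilation attributes of torch.nn.Conv2d for a single spatial axis.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis, per the torch.nn.Conv2d formula."""
    # floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# 32-pixel input, 3x3 kernel: default stride and padding, then stride 2 with padding 1.
out_a = conv_output_size(32, kernel_size=3)
out_b = conv_output_size(32, kernel_size=3, stride=2, padding=1)
```

The same arithmetic applies along the width axis, using the width components of the layer's settings.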

output_width_2d(layer, width)

Compute the output width of a torch.nn.Conv2d, assuming NCHW format.

This requires knowing the input width. Because NCHW format makes height and width very easy to mix up, this is a separate function from output_height_2d.

It also works on torch.nn.MaxPool2d.

This function implements the formula described in the torch.nn.Conv2d documentation: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

Parameters
  • layer (torch.nn.Conv2d) – The layer to compute output size for.

  • width (int) – The width of the input image.

Returns

The width of the output image.

Return type

int

pad_to_last(nums, total_length, axis=-1, val=0)

Pad nums with val at the end of the given axis.

The length of the result along the given axis is total_length.

Raises

IndexError – If the input axis value is out of range of the nums array

Parameters
  • nums (numpy.ndarray) – The array to pad.

  • total_length (int) – The final length of the array along axis.

  • axis (int) – Axis along which to pad.

  • val (int) – The value to pad with.

Returns

Padded array

Return type

torch.Tensor
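
A minimal sketch of the padding for the default case (axis=-1 on a 2D input) follows. It uses nested lists instead of arrays or tensors, covers only the last axis, and the name pad_rows_to_last is hypothetical.

```python
def pad_rows_to_last(rows, total_length, val=0):
    """Right-pad every row of a 2D nested list to total_length with val."""
    return [list(row) + [val] * max(total_length - len(row), 0)
            for row in rows]


# Two rows of length 3 padded to length 5 along the last axis.
padded = pad_rows_to_last([[1, 2, 3], [4, 5, 6]], total_length=5)
```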

prefer_gpu()

Prefer to use GPU(s) if any are detected.

product_of_gaussians(mus, sigmas_squared)

Compute mu, sigma of product of gaussians.

Parameters
  • mus (torch.Tensor) – Means, with shape \((N, M)\). M is the number of mean values.

  • sigmas_squared (torch.Tensor) – Variances, with shape \((N, V)\). V is the number of variance values.

Returns

A tuple of two tensors: the mu of the product of gaussians, with shape \((N, 1)\), and the sigma of the product of gaussians, with shape \((N, 1)\).

Return type

tuple[torch.Tensor, torch.Tensor]
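
For reference, the product of independent Gaussian factors has variance \(\sigma^2 = 1 / \sum_i \sigma_i^{-2}\) and mean \(\mu = \sigma^2 \sum_i \mu_i / \sigma_i^2\). The sketch below applies this to plain Python lists for a single set of factors. It is an illustration under stated assumptions, not the library's code: the name product_of_gaussians_sketch is hypothetical, it returns the variance rather than the standard deviation, and which tensor axis the library reduces over is not specified here.

```python
def product_of_gaussians_sketch(mus, sigmas_squared):
    """Combine independent 1-D Gaussian factors given as plain Python lists."""
    # Precisions (inverse variances) add when Gaussian densities are multiplied.
    precision = sum(1.0 / s2 for s2 in sigmas_squared)
    sigma_squared = 1.0 / precision
    # The mean is the precision-weighted average of the factor means.
    mu = sigma_squared * sum(m / s2 for m, s2 in zip(mus, sigmas_squared))
    return mu, sigma_squared


# Two unit-variance factors centred at 0 and 2.
mu, sigma_squared = product_of_gaussians_sketch([0.0, 2.0], [1.0, 1.0])
```

With equal variances the combined mean lies midway between the factor means, and the combined variance is half of each factor's variance.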

set_gpu_mode(mode, gpu_id=0)

Set GPU mode and device ID.

Parameters
  • mode (bool) – Whether or not to use GPU

  • gpu_id (int) – GPU ID

soft_update_model(target_model, source_model, tau)

Update the parameters of the target model using those of the source model.

Parameters
  • target_model (garage.torch.Policy/garage.torch.QFunction) – Target model to update.

  • source_model (garage.torch.Policy/QFunction) – Source network to update from.

  • tau (float) – Interpolation parameter for doing the soft target update.
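
A soft target update is commonly written as target ← (1 - tau) * target + tau * source. The sketch below assumes that rule, which this page does not spell out, and applies it to plain lists of floats standing in for model parameters. The name soft_update_sketch is hypothetical.

```python
def soft_update_sketch(target_params, source_params, tau):
    """Blend source values into target values in place (lists of floats)."""
    for i, (target, source) in enumerate(zip(target_params, source_params)):
        target_params[i] = (1.0 - tau) * target + tau * source


target = [0.0, 2.0]
source = [1.0, 4.0]
# tau=0.5 moves every target value halfway towards its source value.
soft_update_sketch(target, source, tau=0.5)
```

Under this rule, tau = 1 copies the source parameters outright and tau = 0 leaves the target unchanged; small values of tau make the target track the source slowly.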

state_dict_to(state_dict, device)

Move an optimizer state dictionary to a specified device.

Parameters
  • state_dict (dict) – state dictionary to be moved

  • device (str) – ID of GPU or CPU.

Returns

state dictionary moved to device

Return type

dict

torch_to_np(tensors)

Convert PyTorch tensors to numpy arrays.

Parameters

tensors (tuple) – Tuple of data in PyTorch tensors.

Returns

Tuple of data in numpy arrays.

Return type

tuple[numpy.ndarray]

Note: This method is deprecated and has been replaced by garage.torch._functions.to_numpy.

update_module_params(module, new_params)

Load parameters to a module.

This function acts like torch.nn.Module._load_from_state_dict(), but it replaces the tensors in module with those in new_params, while _load_from_state_dict() loads only the values. Use this function so that the grad and grad_fn of new_params can be restored.

Parameters
  • module (torch.nn.Module) – A torch module.

  • new_params (dict) – A dict of torch tensors used as the new parameters of this module. This dict should be generated by torch.nn.Module.named_parameters().