# garage.torch

PyTorch-backed modules and algorithms.

as_torch_dict(array_dict)

Convert a dict whose values are numpy arrays to PyTorch tensors.

Modifies array_dict in place.

Parameters

array_dict (dict) – Dictionary of data in numpy arrays

Returns

Dictionary of data in PyTorch tensors

Return type

dict

compute_advantages(discount, gae_lambda, max_episode_length, baselines, rewards)

Advantages are a discounted cumulative sum.

The discounted cumulative sum can be computed using conv2d with the following filter:

[1, (discount * gae_lambda), (discount * gae_lambda) ^ 2, …], where the length is the same as max_episode_length.

baselines and rewards have the same shape:

baselines: [ [b_11, b_12, b_13, … b_1n],
             [b_21, b_22, b_23, … b_2n],
             …
             [b_m1, b_m2, b_m3, … b_mn] ]

rewards: [ [r_11, r_12, r_13, … r_1n],
           [r_21, r_22, r_23, … r_2n],
           …
           [r_m1, r_m2, r_m3, … r_mn] ]

Parameters
• discount (float) – RL discount factor (i.e. gamma).

• gae_lambda (float) – Lambda, as used for Generalized Advantage Estimation (GAE).

• max_episode_length (int) – Maximum length of a single episode.

• baselines (torch.Tensor) – A 2D vector of value function estimates with shape (N, T), where N is the batch dimension (number of episodes) and T is the maximum episode length experienced by the agent. If an episode terminates in fewer than T time steps, the remaining elements in that episode should be set to 0.

• rewards (torch.Tensor) – A 2D vector of per-step rewards with shape (N, T), where N is the batch dimension (number of episodes) and T is the maximum episode length experienced by the agent. If an episode terminates in fewer than T time steps, the remaining elements in that episode should be set to 0.

Returns

A 2D vector of calculated advantage values with shape (N, T), where N is the batch dimension (number of episodes) and T is the maximum episode length experienced by the agent. If an episode terminates in fewer than T time steps, the remaining values in that episode should be set to 0.

Return type

torch.Tensor
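
As a rough illustration of the advantage computation described above, the following pure-Python sketch produces the same kind of discounted cumulative sum with a backward loop rather than conv2d. The name gae_advantages is hypothetical, and the residual form r_t + discount * b_(t+1) - b_t (with the baseline after the final step taken as 0) is an assumption based on the usual GAE definition, not a copy of the library code.

```python
def gae_advantages(discount, gae_lambda, baselines, rewards):
    """Hypothetical pure-Python sketch of GAE over (N, T) nested lists."""
    advantages = []
    for b_row, r_row in zip(baselines, rewards):
        # Baseline of the next step; assumed to be 0 after the last step.
        next_b = list(b_row[1:]) + [0.0]
        # One-step residuals: r_t + discount * b_(t+1) - b_t.
        deltas = [r + discount * nb - b
                  for r, nb, b in zip(r_row, next_b, b_row)]
        # Backward accumulation applies the weights
        # 1, (discount * gae_lambda), (discount * gae_lambda)^2, ...
        row = [0.0] * len(deltas)
        running = 0.0
        for t in reversed(range(len(deltas))):
            running = deltas[t] + discount * gae_lambda * running
            row[t] = running
        advantages.append(row)
    return advantages


example = gae_advantages(0.5, 1.0, [[0.0, 0.0, 0.0]], [[1.0, 1.0, 1.0]])
```

With zero baselines the result reduces to the discounted return from each time step onward.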

expand_var(name, item, n_expected, reference)

Expand a variable to an expected length.

This is used to handle arguments to primitives that can all be reasonably set to the same value, or multiple different values.

Parameters
• name (str) – Name of variable being expanded.

• item (any) – Element being expanded.

• n_expected (int) – Number of elements expected.

• reference (str) – Source of n_expected.

Returns

List of references to item or item itself.

Return type

list

Raises

ValueError – If the variable is a sequence but its length is neither 1 nor n_expected.
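
The expansion rule described above can be sketched in a few lines. This is an illustrative stand-in, not the library source: the name expand_var_sketch is hypothetical, and treating only list and tuple as sequences is an assumption.

```python
def expand_var_sketch(name, item, n_expected, reference):
    """Hypothetical sketch: expand item to a list of length n_expected."""
    # Assumption: only lists and tuples count as sequences here.
    if isinstance(item, (list, tuple)):
        if len(item) == 1:
            # A single-element sequence is repeated n_expected times.
            return list(item) * n_expected
        if len(item) != n_expected:
            raise ValueError(
                f'{name} has length {len(item)}, but must have length 1 '
                f'or {n_expected} to match {reference}')
        return list(item)
    # A scalar is repeated; every entry refers to the same object.
    return [item] * n_expected


kernel_sizes = expand_var_sketch('kernel_sizes', 3, 2, 'hidden_channels')
```

A scalar argument therefore applies to every layer, while a full-length sequence sets each layer individually.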

filter_valids(tensor, valids)

Filter a tensor using valids (the last index of the valid values in each row).

valids contains the last index of each row.

Parameters
• tensor (torch.Tensor) – The tensor to filter

• valids (list[int]) – Array of lengths of the valid values

Returns

Filtered Tensor

Return type

torch.Tensor
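
To make the filtering concrete, here is a small sketch on nested lists. It assumes each entry of valids is the number of valid leading values in the corresponding row, and the name filter_valids_sketch is hypothetical; the actual return container of the library function is not confirmed here.

```python
def filter_valids_sketch(rows, valids):
    """Hypothetical sketch: keep the first valids[i] values of row i."""
    return [row[:valid] for row, valid in zip(rows, valids)]


# Two zero-padded episodes with 2 and 1 valid steps respectively.
padded = [[1, 2, 0], [3, 0, 0]]
filtered = filter_valids_sketch(padded, [2, 1])
```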

flatten_batch(tensor)

Flatten a batch of observations.

Reshape a tensor of size (X, Y, Z) into (X*Y, Z)

Parameters

tensor (torch.Tensor) – Tensor to flatten.

Returns

Flattened tensor.

Return type

torch.Tensor

flatten_to_single_vector(tensor)

Collapse the C x H x W values per representation into a single long vector.

Reshape a tensor of size (N, C, H, W) into (N, C * H * W).

Parameters

tensor (torch.Tensor) – Batch of data.

Returns

Reshaped view of that data (analogous to numpy.reshape)

Return type

torch.Tensor

global_device()

Returns the global device that torch.Tensors should be placed on.

Note: The global device is set using the function garage.torch._functions.set_gpu_mode. If this function is never called, garage.torch._functions.device() returns None.

Returns

The global device that newly created torch.Tensors should be placed on.

Return type

torch.device

class NonLinearity(non_linear)

Bases: torch.nn.Module

Wrapper class for a non-linear function or module.

Parameters

non_linear (callable or type) – Non-linear function or type to be wrapped.

forward(self, input_value)

Forward method.

Parameters

input_value (torch.Tensor) – Input values

Returns

Output value

Return type

torch.Tensor

np_to_torch(array)

Convert NumPy arrays to PyTorch tensors.

Parameters

array (np.ndarray) – Data in numpy array.

Returns

Float tensor on the global device.

Return type

torch.Tensor

output_height_2d(layer, height)

Compute the output height of a torch.nn.Conv2d, assuming NCHW format.

This requires knowing the input height. Because NCHW format makes this very easy to mix up, this is a separate function from conv2d_output_height.

It also works on torch.nn.MaxPool2d.

This function implements the formula described in the torch.nn.Conv2d documentation: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

Parameters
• layer (torch.nn.Conv2d) – The layer to compute output size for.

• height (int) – The height of the input image.

Returns

The height of the output image.

Return type

int

output_width_2d(layer, width)

Compute the output width of a torch.nn.Conv2d, assuming NCHW format.

This requires knowing the input width. Because NCHW format makes this very easy to mix up, this is a separate function from conv2d_output_height.

It also works on torch.nn.MaxPool2d.

This function implements the formula described in the torch.nn.Conv2d documentation: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

Parameters
• layer (torch.nn.Conv2d) – The layer to compute output size for.

• width (int) – The width of the input image.

Returns

The width of the output image.

Return type

int
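
The output-size formula from the torch.nn.Conv2d documentation, which both functions above refer to, can be written for a single spatial dimension as follows. This is a standalone sketch: the name conv_output_size is hypothetical, and it takes the layer settings as plain integers rather than reading them from a layer object.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2D convolution.

    Implements
    floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1).
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# The same formula is applied separately to height and width.
out_height = conv_output_size(224, kernel_size=7, stride=2, padding=3)
out_width = conv_output_size(28, kernel_size=5)
```

Because height and width are handled independently, a non-square kernel, stride, or padding only requires passing the matching component for each dimension.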

pad_to_last(nums, total_length, axis=-1, val=0)

Pad val to last in nums in given axis.

The length of the result along the given axis is total_length.

Raises

IndexError – If the input axis value is out of range of the nums array

Parameters
• nums (numpy.ndarray) – The array to pad.

• total_length (int) – The final length of the array along the given axis.

• axis (int) – Axis along which to pad.

• val (int) – The value used for padding.

Returns

The padded tensor.

Return type

torch.Tensor
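
The padding behaviour described above can be illustrated on nested lists for the common case of padding along the last axis. The name pad_rows_to_length is hypothetical, and the sketch does not cover other axes or the IndexError case.

```python
def pad_rows_to_length(rows, total_length, val=0):
    """Hypothetical sketch: append val to each row up to total_length."""
    return [list(row) + [val] * max(total_length - len(row), 0)
            for row in rows]


# Pad a (2, 2) array to width 4 along the last axis.
padded = pad_rows_to_length([[1, 2], [3, 4]], total_length=4)
```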

prefer_gpu()

Prefer to use GPU(s) if any are detected.

product_of_gaussians(mus, sigmas_squared)

Compute mu, sigma of product of gaussians.

Parameters
• mus (torch.Tensor) – Means, with shape (N, M). M is the number of mean values.

• sigmas_squared (torch.Tensor) – Variances, with shape (N, V). V is the number of variance values.

Returns

• Mu of product of gaussians, with shape (N, 1).

• Sigma of product of gaussians, with shape (N, 1).

Return type

torch.Tensor
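
For reference, the standard result for a product of Gaussian densities is that precisions (inverse variances) add and the mean is the precision-weighted average of the factor means. The sketch below applies this to one set of factors given as plain lists. The name product_of_gaussians_sketch and the eps clamp are assumptions for illustration and are not taken from the library.

```python
def product_of_gaussians_sketch(mus, sigmas_squared, eps=1e-7):
    """Hypothetical sketch: mean and variance of a product of Gaussians."""
    # Assumed guard against division by zero for tiny variances.
    clamped = [max(s, eps) for s in sigmas_squared]
    # Precisions add, so the combined variance is 1 / sum(1 / var_i).
    sigma_squared = 1.0 / sum(1.0 / s for s in clamped)
    # The combined mean is the precision-weighted average of the means.
    mu = sigma_squared * sum(m / s for m, s in zip(mus, clamped))
    return mu, sigma_squared


mu, sigma_squared = product_of_gaussians_sketch([0.0, 2.0], [1.0, 1.0])
```

Two unit-variance factors give a mean halfway between the factor means and half the variance, which matches the intuition that each factor adds information.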

set_gpu_mode(mode, gpu_id=0)

Set GPU mode and device ID.

Parameters
• mode (bool) – Whether or not to use GPU

• gpu_id (int) – GPU ID

soft_update_model(target_model, source_model, tau)

Update the parameters of the target model using the source model.

Parameters
• target_model (garage.torch.Policy/garage.torch.QFunction) – Target model to update.

• source_model (garage.torch.Policy/QFunction) – Source network to update.

• tau (float) – Interpolation parameter for doing the soft target update.
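
A soft target update is commonly written as an interpolation between the two parameter sets. The sketch below shows that rule on plain lists of floats. It assumes the usual convention that tau weights the source parameters, and the name soft_update_sketch is hypothetical.

```python
def soft_update_sketch(target_params, source_params, tau):
    """Hypothetical sketch: interpolate target parameters toward the source.

    Assumes new_target = (1 - tau) * target + tau * source.
    """
    return [(1.0 - tau) * t + tau * s
            for t, s in zip(target_params, source_params)]


target = [0.0, 2.0]
source = [2.0, 4.0]
updated = soft_update_sketch(target, source, tau=0.5)
```

Under this convention tau = 0 leaves the target unchanged and tau = 1 copies the source, so small values of tau make the target track the source slowly.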

state_dict_to(state_dict, device)

Move optimizer to a specified device.

Parameters
• state_dict (dict) – state dictionary to be moved

• device (str) – ID of GPU or CPU.

Returns

state dictionary moved to device

Return type

dict

torch_to_np(tensors)

Convert PyTorch tensors to numpy arrays.

Parameters

tensors (tuple) – Tuple of data in PyTorch tensors.

Returns

Tuple of data in numpy arrays.

Return type

tuple[numpy.ndarray]

Note: This method is deprecated and now replaced by garage.torch._functions.to_numpy.

update_module_params(module, new_params)