garage.np
Reinforcement Learning Algorithms which use NumPy as a numerical backend.
- discount_cumsum(x, discount)
Discounted cumulative sum.
See https://docs.scipy.org/doc/scipy/reference/tutorial/signal.html#difference-equation-filtering for background on difference-equation filtering. Here, we have y[t] - discount*y[t+1] = x[t], or equivalently rev(y)[t] - discount*rev(y)[t-1] = rev(x)[t].
- Parameters
x (np.ndarray) – Input.
discount (float) – Discount factor.
- Returns
Discounted cumulative sum.
- Return type
np.ndarray
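The recurrence y[t] = x[t] + discount*y[t+1] can be illustrated with a plain backward loop. This is a minimal sketch, not the library's implementation; the name discount_cumsum_sketch is hypothetical, and a 1D input is assumed.

```python
import numpy as np


def discount_cumsum_sketch(x, discount):
    """Hypothetical reference loop for y[t] = x[t] + discount * y[t+1]."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    running = 0.0
    # Walk backwards so that y[t+1] is already known when y[t] is computed.
    for t in reversed(range(len(x))):
        running = x[t] + discount * running
        y[t] = running
    return y


rewards = np.array([1.0, 1.0, 1.0])
returns = discount_cumsum_sketch(rewards, 0.5)
# returns holds 1.75, 1.5, 1.0
```

The last entry equals the last input, and each earlier entry adds the discounted value of its successor.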
- explained_variance_1d(ypred, y, valids=None)
Explained variation for 1D inputs.
It is the proportion of the variance in one variable that is explained or predicted from another variable.
- Parameters
ypred (np.ndarray) – Sample data from the first variable. Shape: \((N, max_episode_length)\).
y (np.ndarray) – Sample data from the second variable. Shape: \((N, max_episode_length)\).
valids (np.ndarray) – Optional argument. Array indicating valid indices. If None, the entire input array is assumed to be valid. Shape: \((N, max_episode_length)\).
- Returns
The explained variance.
- Return type
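As a rough illustration of the quantity described above, the sketch below uses the common definition 1 - Var(y - ypred) / Var(y), restricted to the valid entries. The helper name explained_variance_sketch and the handling of a zero-variance target are assumptions made for this example, not documented behavior.

```python
import numpy as np


def explained_variance_sketch(ypred, y, valids=None):
    """Hypothetical helper: 1 - Var(y - ypred) / Var(y) over valid entries."""
    ypred = np.asarray(ypred, dtype=float)
    y = np.asarray(y, dtype=float)
    if valids is not None:
        # A boolean mask of the same shape selects the valid entries as a 1D array.
        mask = np.asarray(valids).astype(bool)
        ypred, y = ypred[mask], y[mask]
    else:
        ypred, y = ypred.ravel(), y.ravel()
    var_y = np.var(y)
    if np.isclose(var_y, 0.0):
        # Assumption for this sketch: a constant target is fully explained
        # only by a constant prediction.
        return 1.0 if np.isclose(np.var(ypred), 0.0) else 0.0
    return 1.0 - np.var(y - ypred) / var_y


y = np.array([[1.0, 2.0, 3.0], [4.0, 0.0, 0.0]])
valids = np.array([[1, 1, 1], [1, 0, 0]])  # second episode has one valid step
perfect = explained_variance_sketch(y, y, valids)
mean_only = explained_variance_sketch(np.full_like(y, 2.5), y, valids)
# perfect is 1.0; predicting the mean of the valid targets gives 0.0
```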
- flatten_tensors(tensors)
Flatten a list of tensors.
- Parameters
tensors (list[numpy.ndarray]) – List of tensors to be flattened.
- Returns
Flattened tensors.
- Return type
numpy.ndarray
Example:
>>> flatten_tensors([np.ndarray([1]), np.ndarray([1])])
array(...)
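A slightly more concrete illustration, assuming that flattening means reshaping each array to 1D and concatenating the results in order. The function flatten_tensors_sketch is a hypothetical stand-in written for this example.

```python
import numpy as np


def flatten_tensors_sketch(tensors):
    """Hypothetical stand-in: ravel each array, then join them end to end."""
    if len(tensors) == 0:
        return np.asarray([])
    return np.concatenate([np.reshape(t, [-1]) for t in tensors])


weights = np.arange(6).reshape(2, 3)
bias = np.array([10, 20])
flat = flatten_tensors_sketch([weights, bias])
# flat holds 0, 1, 2, 3, 4, 5, 10, 20 and has shape (8,)
```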
- pad_batch_array(array, lengths, max_length=None)
Convert a packed array into a padded array with one more dimension.
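To make the packed-to-padded conversion concrete, here is a minimal sketch. It assumes that the packed array stores episodes back to back along the first axis, that lengths gives the number of rows per episode, and that padding uses zeros; pad_batch_array_sketch is a hypothetical name.

```python
import numpy as np


def pad_batch_array_sketch(array, lengths, max_length=None):
    """Hypothetical helper: split a packed array by lengths and zero-pad."""
    array = np.asarray(array)
    if array.shape[0] != sum(lengths):
        raise ValueError('lengths must sum to the first dimension of array')
    if max_length is None:
        max_length = max(lengths)
    elif max_length < max(lengths):
        raise ValueError('max_length is shorter than the longest episode')
    # The output gains one leading dimension: (episodes, max_length, ...).
    padded = np.zeros((len(lengths), max_length) + array.shape[1:],
                      dtype=array.dtype)
    start = 0
    for i, length in enumerate(lengths):
        padded[i, :length] = array[start:start + length]
        start += length
    return padded


packed = np.array([1, 2, 3, 4, 5])
padded = pad_batch_array_sketch(packed, lengths=[2, 3])
# padded has shape (2, 3), with rows [1, 2, 0] and [3, 4, 5]
```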
- pad_tensor_n(xs, max_len)
Pad array of tensors.
- Parameters
xs (numpy.ndarray) – Tensors to be padded.
max_len (int) – Maximum length.
- Returns
Padded tensor.
- Return type
numpy.ndarray
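A small sketch of the idea, assuming each tensor is padded with trailing zeros along its first axis up to max_len and the results are stacked. The name pad_tensor_n_sketch is hypothetical.

```python
import numpy as np


def pad_tensor_n_sketch(xs, max_len):
    """Hypothetical helper: zero-pad each array to max_len rows and stack."""
    first = np.asarray(xs[0])
    out = np.zeros((len(xs), max_len) + first.shape[1:], dtype=first.dtype)
    for i, x in enumerate(xs):
        # Copy the real data; the remaining rows stay zero.
        out[i, :len(x)] = x
    return out


episodes = [np.array([1.0, 2.0]), np.array([3.0])]
batch = pad_tensor_n_sketch(episodes, max_len=3)
# batch has shape (2, 3), with rows [1, 2, 0] and [3, 0, 0]
```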
- rrse(actual, predicted)
Root Relative Squared Error.
- Parameters
actual (np.ndarray) – The actual value.
predicted (np.ndarray) – The predicted value.
- Returns
The root relative squared error between the actual and the predicted value.
- Return type
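For reference, root relative squared error is commonly defined as the square root of the squared prediction error divided by the squared error of always predicting the mean of the actual values. The sketch below follows that standard definition; rrse_sketch is a hypothetical name.

```python
import numpy as np


def rrse_sketch(actual, predicted):
    """Hypothetical helper following the standard RRSE definition."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = np.sum(np.square(actual - predicted))
    # Baseline: error of a predictor that always outputs mean(actual).
    baseline = np.sum(np.square(actual - np.mean(actual)))
    return np.sqrt(residual / baseline)


actual = np.array([1.0, 2.0, 3.0])
exact = rrse_sketch(actual, actual)
mean_guess = rrse_sketch(actual, np.full(3, 2.0))
# exact is 0.0; predicting the mean everywhere gives 1.0
```

A value below 1 means the prediction beats the mean baseline.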
- slice_nested_dict(dict_or_array, start, stop)
Slice a dictionary containing arrays (or dictionaries).
This function is primarily intended for un-batching env_infos and action_infos.
- Parameters
dict_or_array (dict[str, dict or np.ndarray] or np.ndarray) – A nested dictionary should only contain dictionaries and numpy arrays (recursively).
start (int) – First index to be included in the slice.
stop (int) – First index to be excluded from the slice. In other words, these are typical python slice indices.
- Returns
The input, but sliced.
- Return type
dict or np.ndarray
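The recursive slicing described above can be sketched as follows. The helper slice_nested_dict_sketch and the sample env_infos contents are hypothetical, chosen only to show the un-batching use case.

```python
import numpy as np


def slice_nested_dict_sketch(dict_or_array, start, stop):
    """Hypothetical helper: apply [start:stop] to every array in the nest."""
    if isinstance(dict_or_array, dict):
        return {
            key: slice_nested_dict_sketch(value, start, stop)
            for key, value in dict_or_array.items()
        }
    return dict_or_array[start:stop]


env_infos = {'reward': np.arange(5), 'task': {'id': np.arange(10, 15)}}
episode = slice_nested_dict_sketch(env_infos, 1, 3)
# episode['reward'] holds 1, 2 and episode['task']['id'] holds 11, 12
```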
- sliding_window(t, window, smear=False)
Create a sliding window over a tensor.
- Parameters
- Returns
All windows generated over t, with shape \((M, W, D)\), where W is the window size. If smear is False, M is \(N-W+1\); otherwise M is N.
- Return type
np.ndarray
- Raises
NotImplementedError – If step_size is not 1.
ValueError – If window size is larger than the input tensor.
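The output shape for the smear=False case can be illustrated with a simple stack of slices. This sketch assumes t has shape (N, D), which is inferred from the documented output shape; it does not attempt the smear=True behavior, and sliding_window_sketch is a hypothetical name.

```python
import numpy as np


def sliding_window_sketch(t, window):
    """Hypothetical helper: all length-`window` slices of t with step 1."""
    t = np.asarray(t)
    n = t.shape[0]
    if window > n:
        raise ValueError('window size is larger than the input tensor')
    # One window starts at each index 0 .. N - W, giving N - W + 1 windows.
    return np.stack([t[i:i + window] for i in range(n - window + 1)])


t = np.arange(8).reshape(4, 2)  # N = 4, D = 2
windows = sliding_window_sketch(t, window=3)
# windows.shape is (2, 3, 2), and windows[1] equals t[1:4]
```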
- stack_and_pad_tensor_dict_list(tensor_dict_list, max_len)
Stack and pad array of list of tensors.
Input paths are a list of N dicts, each with values of shape \((D, S^*)\). This function stacks the values under each key and pads them to max_len, so the output has shape \((N, D, S^*)\).
- Parameters
- Returns
A dictionary of {stacked tensors or dictionary of stacked tensors}. Shape: \((N, D, S^*)\), where N is the length of the input paths.
- Return type
- stack_tensor_dict_list(tensor_dict_list)
Stack a list of dictionaries of {tensors or dictionary of tensors}.
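A minimal sketch of the stacking idea, assuming every dictionary in the list has the same keys and that arrays under the same key share a shape. The helper stack_tensor_dict_list_sketch and the sample keys are hypothetical.

```python
import numpy as np


def stack_tensor_dict_list_sketch(tensor_dict_list):
    """Hypothetical helper: stack matching entries along a new first axis."""
    stacked = {}
    for key, example in tensor_dict_list[0].items():
        values = [d[key] for d in tensor_dict_list]
        if isinstance(example, dict):
            # Nested dictionaries are stacked recursively.
            stacked[key] = stack_tensor_dict_list_sketch(values)
        else:
            stacked[key] = np.stack(values)
    return stacked


steps = [
    {'obs': np.array([0.0, 1.0]), 'info': {'done': np.array(0)}},
    {'obs': np.array([2.0, 3.0]), 'info': {'done': np.array(1)}},
]
batch = stack_tensor_dict_list_sketch(steps)
# batch['obs'] has shape (2, 2); batch['info']['done'] holds 0, 1
```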