`garage.tf.models`¶

Network Models.

class CategoricalCNNModel(output_dim, filters, strides, padding, name=None, is_image=True, hidden_sizes=(32, 32), hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=tf.nn.softmax, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), layer_normalization=False)¶

Bases: garage.tf.models.model.Model

Categorical CNN Model.

A model represented by a Categorical distribution which is parameterized by a convolutional neural network (CNN) followed by a multilayer perceptron (MLP).

Parameters

output_dim (int) – Dimension of the network output.
filters (Tuple[Tuple[int, Tuple[int, int]], ...]) – Number and dimension of filters. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The filter for the first layer has 3 channels and its shape is (3 x 5), while the filter for the second layer has 32 channels and its shape is (3 x 3).
strides (tuple[int]) – The stride of the sliding window. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.
padding (str) – The type of padding algorithm to use, either ‘SAME’ or ‘VALID’.
hidden_sizes (list[int]) – Output dimension of dense layer(s). For example, (32, 32) means this MLP consists of two hidden layers, each with 32 hidden units.
name (str) – Model name, also the variable scope.
is_image (bool) – Whether observations are images or not.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
layer_normalization (bool) – Bool for using layer normalization or not.
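
As a rough illustration of how filters, strides and padding interact, the sketch below computes the spatial shape after each convolutional layer using TensorFlow's documented output-size rules for 'SAME' and 'VALID' padding. The helper name conv_output_shapes and the 32 x 32 input size are assumptions made for this example; neither is part of garage.

```python
import math


def conv_output_shapes(input_hw, filters, strides, padding):
    """Return (height, width, channels) after each conv layer.

    Hypothetical helper for illustration only; it is not part of garage.
    """
    if padding not in ('SAME', 'VALID'):
        raise ValueError("padding must be 'SAME' or 'VALID'")
    if len(filters) != len(strides):
        raise ValueError('filters and strides must describe the same layers')
    height, width = input_hw
    shapes = []
    for (num_filters, (k_h, k_w)), stride in zip(filters, strides):
        if padding == 'SAME':
            # 'SAME' pads the input, so only the stride shrinks it.
            height = math.ceil(height / stride)
            width = math.ceil(width / stride)
        else:
            # 'VALID' uses no padding, so the kernel size matters too.
            height = math.ceil((height - k_h + 1) / stride)
            width = math.ceil((width - k_w + 1) / stride)
        shapes.append((height, width, num_filters))
    return shapes


filters = ((3, (3, 5)), (32, (3, 3)))
strides = (1, 2)
valid_shapes = conv_output_shapes((32, 32), filters, strides, 'VALID')
same_shapes = conv_output_shapes((32, 32), filters, strides, 'SAME')
print(valid_shapes)
print(same_shapes)
```

The last shape in each list is what would be flattened and passed to the MLP, so the choice of padding changes the size of the first dense layer's input.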

network_output_spec(self)¶

Network output spec.

Returns: Name of the model outputs, in order.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks of a model use the same fixed variable scope, which ensures parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns: Output tensors of the model with the given inputs.
Return type: list[tf.Tensor]
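
The naming contract of build() can be pictured with a toy class. ToyModel below is hypothetical and uses neither TensorFlow nor garage; it only mimics the documented behaviour that an unnamed build creates the 'default' network, that every network shares one set of parameters, and that reusing a name raises ValueError.

```python
class ToyModel:
    """Hypothetical stand-in that mimics the documented build() contract."""

    def __init__(self, name):
        self.name = name
        self._parameters = None  # created once, then shared by all networks
        self._networks = {}

    def build(self, *inputs, name=None):
        network_name = name or 'default'
        if network_name in self._networks:
            raise ValueError(
                'Network {} already exists'.format(network_name))
        if self._parameters is None:
            # Parameters are initialized on the first build only.
            self._parameters = {'scale': 2.0}
        outputs = [x * self._parameters['scale'] for x in inputs]
        self._networks[network_name] = (list(inputs), outputs)
        return outputs


model = ToyModel('toy')
default_out = model.build(1.0, 2.0)         # creates the 'default' network
other_out = model.build(3.0, name='other')  # reuses the same parameters
try:
    model.build(4.0, name='other')
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

The real class operates on tf.Tensor inputs inside a variable scope; the toy version keeps only the bookkeeping so the rules are easy to see.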

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For vectorized modules, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.
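
A minimal sketch of the do_resets semantics follows, assuming one recurrent state per input and that omitting do_resets resets every state. The helper reset_states is hypothetical, and garage's own reset() may differ in detail.

```python
def reset_states(states, initial_state, do_resets=None):
    """Reset selected entries of a batch of recurrent states.

    Hypothetical helper for illustration; it is not garage's reset().
    """
    if do_resets is None:
        # Assumption: with no mask given, every state is reset.
        do_resets = [True] * len(states)
    if len(do_resets) != len(states):
        raise ValueError('do_resets must have one flag per state')
    return [
        list(initial_state) if flag else state
        for state, flag in zip(states, do_resets)
    ]


states = [[0.3, -0.1], [0.7, 0.2], [-0.5, 0.9]]
new_states = reset_states(states, [0.0, 0.0], do_resets=[True, False, True])
print(new_states)
```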

property state_info_specs(self)¶

State info specification.

Returns: Keys and shapes for the information related to the module’s state when taking an action.
Return type: List[str]

property state_info_keys(self)¶

State info keys.

Returns: Keys for the information related to the module’s state when taking an input.
Return type: List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns: A list of global variables in the current variable scope.
Return type: List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns: A list of network weight variables in the current variable scope.
Return type: List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns: Values of the parameters evaluated in the current session.
Return type: np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns: A list of parameters reshaped to the shapes specified.
Return type: List[np.ndarray]
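
The idea behind unflattening can be sketched without NumPy: walk through the flat vector, take as many values as each shape requires, and reshape that chunk. The function names unflatten and reshape below are hypothetical, and nested lists stand in for np.ndarray; this is an illustration of the technique, not garage's implementation.

```python
from math import prod


def reshape(values, shape):
    """Reshape a flat list into nested lists with the given shape."""
    if not shape:
        return values[0]  # scalar parameter
    if len(shape) == 1:
        return list(values)
    step = prod(shape[1:])
    return [
        reshape(values[i * step:(i + 1) * step], shape[1:])
        for i in range(shape[0])
    ]


def unflatten(flattened_params, shapes):
    """Split a flat list into one nested list per shape, in order."""
    expected = sum(prod(shape) for shape in shapes)
    if expected != len(flattened_params):
        raise ValueError('flattened_params does not match the given shapes')
    params = []
    offset = 0
    for shape in shapes:
        size = prod(shape)
        params.append(reshape(flattened_params[offset:offset + size], shape))
        offset += size
    return params


# A (2, 3) weight matrix followed by a (3,) bias vector: 9 values in total.
params = unflatten(list(range(9)), [(2, 3), (3,)])
print(params)
```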

class CategoricalGRUModel(output_dim, hidden_dim, name=None, hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), recurrent_nonlinearity=tf.nn.sigmoid, recurrent_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_nonlinearity=tf.nn.softmax, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), hidden_state_init=tf.zeros_initializer(), hidden_state_init_trainable=False, layer_normalization=False)¶

Bases: garage.tf.models.gru_model.GRUModel

Categorical GRU Model.

A model represented by a Categorical distribution which is parameterized by a Gated Recurrent Unit (GRU).

Parameters

output_dim (int) – Dimension of the network output.
hidden_dim (int) – Hidden dimension for GRU cell.
name (str) – Policy name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
recurrent_nonlinearity (callable) – Activation function for recurrent layers. It should return a tf.Tensor. Set it to None to maintain a linear activation.
recurrent_w_init (callable) – Initializer function for the weight of recurrent layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
hidden_state_init (callable) – Initializer function for the initial hidden state. The function should return a tf.Tensor.
hidden_state_init_trainable (bool) – Bool for whether the initial hidden state is trainable.
layer_normalization (bool) – Bool for using layer normalization or not.

network_output_spec(self)¶

Network output spec.

Returns: Name of the model outputs, in order.
Return type: list[str]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks of a model use the same fixed variable scope, which ensures parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns: Output tensors of the model with the given inputs.
Return type: list[tf.Tensor]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For vectorized modules, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns: Keys and shapes for the information related to the module’s state when taking an action.
Return type: List[str]

property state_info_keys(self)¶

State info keys.

Returns: Keys for the information related to the module’s state when taking an input.
Return type: List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns: A list of global variables in the current variable scope.
Return type: List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns: A list of network weight variables in the current variable scope.
Return type: List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns: Values of the parameters evaluated in the current session.
Return type: np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns: A list of parameters reshaped to the shapes specified.
Return type: List[np.ndarray]

class CategoricalLSTMModel(output_dim, hidden_dim, name=None, hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), recurrent_nonlinearity=tf.nn.sigmoid, recurrent_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_nonlinearity=tf.nn.softmax, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), hidden_state_init=tf.zeros_initializer(), hidden_state_init_trainable=False, cell_state_init=tf.zeros_initializer(), cell_state_init_trainable=False, forget_bias=True, layer_normalization=False)¶

Bases: garage.tf.models.lstm_model.LSTMModel

Categorical LSTM Model.

A model represented by a Categorical distribution which is parameterized by a Long short-term memory (LSTM).

Parameters

output_dim (int) – Dimension of the network output.
hidden_dim (int) – Hidden dimension for LSTM cell.
name (str) – Policy name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
recurrent_nonlinearity (callable) – Activation function for recurrent layers. It should return a tf.Tensor. Set it to None to maintain a linear activation.
recurrent_w_init (callable) – Initializer function for the weight of recurrent layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
hidden_state_init (callable) – Initializer function for the initial hidden state. The function should return a tf.Tensor.
hidden_state_init_trainable (bool) – Bool for whether the initial hidden state is trainable.
cell_state_init (callable) – Initializer function for the initial cell state. The function should return a tf.Tensor.
cell_state_init_trainable (bool) – Bool for whether the initial cell state is trainable.
forget_bias (bool) – If True, add 1 to the bias of the forget gate at initialization. It’s used to reduce the scale of forgetting at the beginning of the training.
layer_normalization (bool) – Bool for using layer normalization or not.
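
To see why adding 1 to the forget-gate bias reduces forgetting early in training, consider the gate's activation when its weighted input is zero, which is a simplifying assumption made here for illustration. The gate is a sigmoid, and its value multiplies the previous cell state, so a larger value keeps more of that state. The helper initial_forget_gate is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def initial_forget_gate(forget_bias):
    """Forget-gate activation at initialization.

    Assumes the weighted input to the gate is zero and the bias starts
    at zero, so only the optional +1 offset remains.
    """
    bias = 1.0 if forget_bias else 0.0
    return sigmoid(bias)


gate_without = initial_forget_gate(False)  # fraction of cell state kept
gate_with = initial_forget_gate(True)
print(gate_without, gate_with)
```

Under these assumptions the gate keeps about half of the previous cell state without the offset and roughly 73% with it.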

network_output_spec(self)¶

Network output spec.

Returns: Name of the model outputs, in order.
Return type: list[str]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks of a model use the same fixed variable scope, which ensures parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns: Output tensors of the model with the given inputs.
Return type: list[tf.Tensor]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For vectorized modules, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns: Keys and shapes for the information related to the module’s state when taking an action.
Return type: List[str]

property state_info_keys(self)¶

State info keys.

Returns: Keys for the information related to the module’s state when taking an input.
Return type: List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns: A list of global variables in the current variable scope.
Return type: List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns: A list of network weight variables in the current variable scope.
Return type: List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns: Values of the parameters evaluated in the current session.
Return type: np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns: A list of parameters reshaped to the shapes specified.
Return type: List[np.ndarray]

class CategoricalMLPModel(output_dim, name=None, hidden_sizes=(32, 32), hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=tf.nn.softmax, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), layer_normalization=False)¶

Bases: garage.tf.models.mlp_model.MLPModel

Categorical MLP Model.

A model represented by a Categorical distribution which is parameterized by a multilayer perceptron (MLP).

Parameters

output_dim (int) – Dimension of the network output.
hidden_sizes (list[int]) – Output dimension of dense layer(s). For example, (32, 32) means this MLP consists of two hidden layers, each with 32 hidden units.
name (str) – Model name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
layer_normalization (bool) – Bool for using layer normalization or not.
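
With the default output_nonlinearity of softmax, the network output can be read as the probabilities of a Categorical distribution over output_dim choices. The sketch below shows that step in plain Python; the logit values and the fixed random seed are assumptions for the example, and no garage or TensorFlow code is involved.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Pretend these are the final dense-layer activations for output_dim = 3.
probs = softmax([2.0, 1.0, 0.1])

# Draw one category index according to those probabilities.
rng = random.Random(0)
action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
print(probs, action)
```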

network_output_spec(self)¶

Network output spec.

Returns: Name of the model outputs, in order.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks of a model use the same fixed variable scope, which ensures parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns: Output tensors of the model with the given inputs.
Return type: list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For vectorized modules, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns: Keys and shapes for the information related to the module’s state when taking an action.
Return type: List[str]

property state_info_keys(self)¶

State info keys.

Returns: Keys for the information related to the module’s state when taking an input.
Return type: List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns: A list of global variables in the current variable scope.
Return type: List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns: A list of network weight variables in the current variable scope.
Return type: List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns: A list of trainable variables in the current variable scope.
Return type: List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns: Values of the parameters evaluated in the current session.
Return type: np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns: A list of parameters reshaped to the shapes specified.
Return type: List[np.ndarray]

class CNNMLPMergeModel(filters, strides, hidden_sizes=(256,), output_dim=1, action_merge_layer=-2, name=None, padding='SAME', max_pooling=False, pool_strides=(2, 2), pool_shapes=(2, 2), cnn_hidden_nonlinearity=tf.nn.relu, cnn_hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), cnn_hidden_b_init=tf.zeros_initializer(), hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), layer_normalization=False)¶

Bases: garage.tf.models.model.Model

Convolutional neural network followed by a Multilayer Perceptron.

Combination of a CNN Model (optionally with max pooling) and an MLP Merge model. The CNN accepts the state as an input, while the MLP accepts the CNN’s output and the action as inputs.

Parameters

filters (Tuple[Tuple[int, Tuple[int, int]], ...]) – Number and dimension of filters. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The filter for the first layer has 3 channels and its shape is (3 x 5), while the filter for the second layer has 32 channels and its shape is (3 x 3).
strides (tuple[int]) – The stride of the sliding window. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.
hidden_sizes (tuple[int]) – Output dimension of dense layer(s). For example, (32, 32) means the MLP of this q-function consists of two hidden layers, each with 32 hidden units.
output_dim (int) – Dimension of the network output.
action_merge_layer (int) – The index of layers at which to concatenate action inputs with the network. The indexing works like standard python list indexing. Index of 0 refers to the input layer (observation input) while an index of -1 points to the last hidden layer. Default parameter points to second layer from the end.
name (str) – Model name, also the variable scope.
padding (str) – The type of padding algorithm to use, either ‘SAME’ or ‘VALID’.
max_pooling (bool) – Boolean for using max pooling layer or not.
pool_shapes (tuple[int]) – Dimension of the pooling layer(s). For example, (2, 2) means that all the pooling layers have shape (2, 2).
pool_strides (tuple[int]) – The strides of the pooling layer(s). For example, (2, 2) means that all the pooling layers have strides (2, 2).
cnn_hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s) in the CNN. It should return a tf.Tensor. Set it to None to maintain a linear activation.
cnn_hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s) in the CNN. Function should return a tf.Tensor.
cnn_hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s) in the CNN. Function should return a tf.Tensor.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s) in the MLP. It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s) in the MLP. The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s) in the MLP. The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer in the MLP. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s) in the MLP. The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s) in the MLP. The function should return a tf.Tensor.
layer_normalization (bool) – Bool for using layer normalization or not.
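
The indexing rule for action_merge_layer can be sketched with an ordinary Python list, based only on the description above: position 0 is the input layer and the remaining positions are the hidden layers in order. The helper resolve_merge_layer and the layer labels are hypothetical; garage's internal handling may differ.

```python
def resolve_merge_layer(hidden_sizes, action_merge_layer):
    """Return a label for the layer where the action would be concatenated.

    Hypothetical helper: index 0 is the input layer, followed by one
    entry per hidden layer, indexed like a standard Python list.
    """
    layers = ['input'] + [
        'hidden_{}'.format(i) for i in range(len(hidden_sizes))
    ]
    return layers[action_merge_layer]


two_layer_default = resolve_merge_layer((32, 32), -2)
two_layer_last = resolve_merge_layer((32, 32), -1)
two_layer_input = resolve_merge_layer((32, 32), 0)
one_layer_default = resolve_merge_layer((256,), -2)
print(two_layer_default, two_layer_last, two_layer_input, one_layer_default)
```

Under this reading, an index of -2 with a single hidden layer resolves to the input layer, so the action would be merged before the first dense layer.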

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks of a model use the same fixed variable scope, which ensures parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This is effective only for recurrent modules. do_resets is effective only for vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states are to be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]
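
The unflattening described for flat_to_params can be illustrated with a minimal NumPy sketch. This is not garage's implementation; the standalone function and its param_shapes argument are assumptions made for illustration (in the library, the shapes would come from something like get_param_shapes()).

```python
import numpy as np


def flat_to_params(flattened_params, param_shapes):
    """Split a flat vector into arrays with the given shapes.

    Illustrative helper, not the garage method itself.
    """
    sizes = [int(np.prod(shape)) for shape in param_shapes]
    # Cumulative offsets mark where each parameter ends in the flat vector.
    split_points = np.cumsum(sizes)[:-1]
    chunks = np.split(flattened_params, split_points)
    return [chunk.reshape(shape) for chunk, shape in zip(chunks, param_shapes)]


# Hypothetical shapes: a 3x2 weight matrix and a bias of length 2.
shapes = [(3, 2), (2,)]
flat = np.arange(8, dtype=np.float32)
params = flat_to_params(flat, shapes)

# Flattening the result again recovers the original vector.
round_trip = np.concatenate([p.ravel() for p in params])
```

The round trip shows why the flat representation is convenient: an optimizer can treat all parameters as one vector and hand it back in a single call.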

class CNNModel(filters, strides, padding, name=None, hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer())¶

Bases: garage.tf.models.model.Model

CNN Model.

Parameters

filters (Tuple[Tuple[int, Tuple[int, int]], ..]) – Number and dimension of filters. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The filter for the first layer has 3 channels and its shape is (3 x 5), while the filter for the second layer has 32 channels and its shape is (3 x 3).
strides (tuple[int]) – The stride of the sliding window. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.
name (str) – Model name, also the variable scope.
padding (str) – The type of padding algorithm to use, either ‘SAME’ or ‘VALID’.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
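
To make the interaction of filters, strides and padding concrete, the sketch below computes the spatial size of each layer's output using the standard TensorFlow rules for 'SAME' and 'VALID' padding. The helper conv_output_hw and the 28 x 28 input size are hypothetical and only serve the illustration; the function does not build any network.

```python
import math


def conv_output_hw(input_hw, filters, strides, padding):
    """Return the (height, width) produced by each convolutional layer."""
    height, width = input_hw
    shapes = []
    for (_, (filter_h, filter_w)), stride in zip(filters, strides):
        if padding == 'SAME':
            # Output size depends only on the stride.
            height = math.ceil(height / stride)
            width = math.ceil(width / stride)
        elif padding == 'VALID':
            # No padding: the filter must fit entirely inside the input.
            height = math.ceil((height - filter_h + 1) / stride)
            width = math.ceil((width - filter_w + 1) / stride)
        else:
            raise ValueError("padding must be 'SAME' or 'VALID'")
        shapes.append((height, width))
    return shapes


filters = ((3, (3, 5)), (32, (3, 3)))
strides = (1, 2)
valid_shapes = conv_output_hw((28, 28), filters, strides, 'VALID')
same_shapes = conv_output_hw((28, 28), filters, strides, 'SAME')
```

With the example values from the parameter descriptions, 'VALID' shrinks the feature map at every layer, while 'SAME' only shrinks it by the stride.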

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This is effective only for recurrent modules. do_resets is effective only for vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states are to be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class CNNModelWithMaxPooling(filters, strides, name=None, padding='SAME', pool_strides=(2, 2), pool_shapes=(2, 2), hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer())¶

Bases: garage.tf.models.model.Model

CNN Model with max pooling.

Parameters

filters (Tuple[Tuple[int, Tuple[int, int]], ..]) – Number and dimension of filters. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The filter for the first layer has 3 channels and its shape is (3 x 5), while the filter for the second layer has 32 channels and its shape is (3 x 3).
strides (tuple[int]) – The stride of the sliding window. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.
name (str) – Model name, also the variable scope of the cnn.
padding (str) – The type of padding algorithm to use, either ‘SAME’ or ‘VALID’.
pool_strides (tuple[int]) – The strides of the pooling layer(s). For example, (2, 2) means that all the pooling layers have strides (2, 2).
pool_shapes (tuple[int]) – Dimension of the pooling layer(s). For example, (2, 2) means that all the pooling layers have shape (2, 2).
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
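
The effect of pool_shapes and pool_strides can be seen in a small NumPy sketch of 2-D max pooling. It assumes 'VALID' padding and a single-channel input for brevity; the function max_pool_2d is a hypothetical helper and is not how the model performs pooling internally.

```python
import numpy as np


def max_pool_2d(image, pool_shape=(2, 2), pool_strides=(2, 2)):
    """Max-pool a 2-D array with 'VALID' padding (no padding added)."""
    pool_h, pool_w = pool_shape
    stride_h, stride_w = pool_strides
    out_h = (image.shape[0] - pool_h) // stride_h + 1
    out_w = (image.shape[1] - pool_w) // stride_w + 1
    pooled = np.empty((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Window covered by the pooling kernel at output position (i, j).
            window = image[i * stride_h:i * stride_h + pool_h,
                           j * stride_w:j * stride_w + pool_w]
            pooled[i, j] = window.max()
    return pooled


image = np.arange(16).reshape(4, 4)
pooled = max_pool_2d(image)  # default (2, 2) shape and (2, 2) strides
```

With the default (2, 2) shape and (2, 2) strides the windows do not overlap, so each spatial dimension is halved.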

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This is effective only for recurrent modules. do_resets is effective only for vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states are to be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class GaussianCNNModel(output_dim, filters, strides, padding, hidden_sizes, name=None, hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), learn_std=True, adaptive_std=False, std_share_network=False, init_std=1.0, min_std=1e-06, max_std=None, std_filters=(), std_strides=(), std_padding='SAME', std_hidden_sizes=(32, 32), std_hidden_nonlinearity=tf.nn.tanh, std_hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), std_hidden_b_init=tf.zeros_initializer(), std_output_nonlinearity=None, std_output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), std_parameterization='exp', layer_normalization=False)¶

Bases: garage.tf.models.model.Model

GaussianCNNModel.

Parameters

filters (Tuple[Tuple[int, Tuple[int, int]], ..]) – Number and dimension of filters. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The filter for the first layer has 3 channels and its shape is (3 x 5), while the filter for the second layer has 32 channels and its shape is (3 x 3).
strides (tuple[int]) – The stride of the sliding window. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.
padding (str) – The type of padding algorithm to use, either ‘SAME’ or ‘VALID’.
output_dim (int) – Output dimension of the model.
name (str) – Model name, also the variable scope.
hidden_sizes (list[int]) – Output dimension of dense layer(s) for the Convolutional model for mean. For example, (32, 32) means the network consists of two dense layers, each with 32 hidden units.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
learn_std (bool) – Whether std is trainable.
init_std (float) – Initial value for std.
adaptive_std (bool) – Whether std is modeled by a neural network. If False, it will be a parameter.
std_share_network (bool) – Boolean for whether mean and std share the same network.
std_filters (Tuple[Tuple[int, Tuple[int, int]], ..]) – Number and dimension of filters. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The filter for the first layer has 3 channels and its shape is (3 x 5), while the filter for the second layer has 32 channels and its shape is (3 x 3).
std_strides (tuple[int]) – The stride of the sliding window. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.
std_padding (str) – The type of padding algorithm to use in std network, either ‘SAME’ or ‘VALID’.
std_hidden_sizes (list[int]) – Output dimension of dense layer(s) for the Conv for std. For example, (32, 32) means the Conv consists of two hidden layers, each with 32 hidden units.
min_std (float) – If not None, the std is at least the value of min_std, to avoid numerical issues.
max_std (float) – If not None, the std is at most the value of max_std, to avoid numerical issues.
std_hidden_nonlinearity (callable) – Nonlinearity for each hidden layer in the std network.
std_hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s) in the std network.
std_hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s) in the std network.
std_output_nonlinearity (callable) – Activation function for output dense layer in the std network. It should return a tf.Tensor. Set it to None to maintain a linear activation.
std_output_w_init (callable) – Initializer function for the weight of output dense layer(s) in the std network.
std_parameterization (str) – How the std should be parametrized. There are two options:
- exp: the logarithm of the std will be stored, and an exponential transformation is applied to it
- softplus: the std will be computed as log(1+exp(x))
layer_normalization (bool) – Bool for using layer normalization or not.
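
The two std_parameterization options, together with min_std and max_std, can be sketched in NumPy as follows. This is an illustration under assumptions, not the library's code: the function std_from_param is hypothetical, and clipping the softplus result directly on the std is a simplification chosen here.

```python
import numpy as np


def std_from_param(param, std_parameterization='exp',
                   min_std=1e-6, max_std=None):
    """Map an unconstrained parameter to a positive std (illustrative)."""
    if std_parameterization == 'exp':
        # The parameter is log(std); clip in log space, then exponentiate.
        lo = None if min_std is None else np.log(min_std)
        hi = None if max_std is None else np.log(max_std)
        if lo is not None or hi is not None:
            param = np.clip(param, lo, hi)
        return np.exp(param)
    if std_parameterization == 'softplus':
        # softplus(x) = log(1 + exp(x)), computed in a numerically stable way.
        std = np.logaddexp(0.0, param)
        if min_std is not None or max_std is not None:
            std = np.clip(std, min_std, max_std)
        return std
    raise ValueError("std_parameterization must be 'exp' or 'softplus'")


init_std = 1.0
# Initial parameter values that reproduce init_std under each option.
exp_param = np.log(init_std)
softplus_param = np.log(np.exp(init_std) - 1.0)

exp_std = std_from_param(exp_param, 'exp')
softplus_std = std_from_param(softplus_param, 'softplus')
floored_std = std_from_param(-100.0, 'exp')  # clipped up to min_std
```

Both options keep the std strictly positive; they differ in how quickly the std grows with the stored parameter (exponentially for exp, roughly linearly for large softplus inputs).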

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This is effective only for recurrent modules. do_resets is effective only for vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states are to be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class GaussianGRUModel(output_dim, hidden_dim=32, name='GaussianGRUModel', hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), recurrent_nonlinearity=tf.nn.sigmoid, recurrent_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), hidden_state_init=tf.zeros_initializer(), hidden_state_init_trainable=False, learn_std=True, init_std=1.0, std_share_network=False, layer_normalization=False)¶

Bases: garage.tf.models.model.Model

Gaussian GRU Model.

A model represented by a Gaussian distribution which is parameterized by a Gated Recurrent Unit (GRU).

Parameters

output_dim (int) – Output dimension of the model.
hidden_dim (int) – Hidden dimension for GRU cell for mean.
name (str) – Model name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
recurrent_nonlinearity (callable) – Activation function for recurrent layers. It should return a tf.Tensor. Set it to None to maintain a linear activation.
recurrent_w_init (callable) – Initializer function for the weight of recurrent layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
hidden_state_init (callable) – Initializer function for the initial hidden state. The function should return a tf.Tensor.
hidden_state_init_trainable (bool) – Bool for whether the initial hidden state is trainable.
learn_std (bool) – Whether std is trainable.
init_std (float) – Initial value for std.
std_share_network (bool) – Boolean for whether mean and std share the same network.
layer_normalization (bool) – Bool for using layer normalization or not.
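
As a rough picture of how a GRU can parameterize a Gaussian, the NumPy sketch below runs one recurrent step and maps the hidden state to a mean, with a state-independent std. It is a sketch under assumptions, not the model's implementation: the weight names, the gate convention and the input size are hypothetical, tanh plays the role of hidden_nonlinearity and sigmoid the role of recurrent_nonlinearity.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x, h, w):
    """One GRU step; `w` is a dict of hypothetical weight arrays."""
    xh = np.concatenate([x, h], axis=-1)
    z = sigmoid(xh @ w['W_z'] + w['b_z'])  # update gate
    r = sigmoid(xh @ w['W_r'] + w['b_r'])  # reset gate
    xrh = np.concatenate([x, r * h], axis=-1)
    h_tilde = np.tanh(xrh @ w['W_h'] + w['b_h'])  # candidate state
    # Gate convention is an assumption; some libraries swap z and (1 - z).
    return (1.0 - z) * h + z * h_tilde


rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, batch = 4, 32, 2, 5
w = {name: rng.normal(scale=0.1, size=(input_dim + hidden_dim, hidden_dim))
     for name in ('W_z', 'W_r', 'W_h')}
w.update({name: np.zeros(hidden_dim) for name in ('b_z', 'b_r', 'b_h')})
W_out = rng.normal(scale=0.1, size=(hidden_dim, output_dim))
b_out = np.zeros(output_dim)

h = np.zeros((batch, hidden_dim))  # zero initial hidden state
x = rng.normal(size=(batch, input_dim))
h = gru_step(x, h, w)

mean = h @ W_out + b_out  # linear output, as with output_nonlinearity=None
log_std = np.full(output_dim, np.log(1.0))  # corresponds to init_std=1.0
std = np.exp(log_std)
sample = mean + std * rng.normal(size=mean.shape)
```

Here the std is a free parameter independent of the input, which roughly corresponds to std_share_network=False; a shared network would instead emit both mean and log std from the same output layer.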

network_input_spec(self)¶

Network input spec.

Returns: Name of the model inputs, in order.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: Name of the model outputs, in order.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This is effective only for recurrent modules. do_resets is effective only for vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states are to be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.
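
The role of do_resets in a vectorized setting can be sketched with plain NumPy. The function reset_states and the array layout (one hidden-state row per parallel input) are assumptions for illustration and do not mirror the library's internals.

```python
import numpy as np


def reset_states(hidden_states, init_state, do_resets=None):
    """Return a copy of hidden_states with the selected rows reset."""
    if do_resets is None:
        # Assumed default for this sketch: reset every state.
        do_resets = np.ones(len(hidden_states), dtype=bool)
    do_resets = np.asarray(do_resets, dtype=bool)
    new_states = hidden_states.copy()
    # Boolean mask indexing overwrites only the rows marked True.
    new_states[do_resets] = init_state
    return new_states


hidden = np.ones((3, 2))   # three parallel inputs, hidden size 2
initial = np.zeros(2)      # initial hidden state
after = reset_states(hidden, initial, do_resets=[True, False, True])
```

Only the first and third rows return to the initial state; the second keeps its current hidden state, which is what allows episodes in a batch to end at different times.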

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]
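
As a rough illustration of the unflattening step described for flat_to_params, the sketch below splits a flat sequence into consecutive chunks sized by each shape and reshapes every chunk. It uses plain Python lists in place of NumPy arrays, and the helper names (`unflatten`, `_reshape`) and the example shapes are hypothetical, not part of garage.

```python
from math import prod


def _reshape(values, shape):
    """Reshape a flat list into nested lists with the given shape."""
    if len(shape) == 0:
        return values[0]
    if len(shape) == 1:
        return list(values)
    step = prod(shape[1:])
    return [_reshape(values[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def unflatten(flat, shapes):
    """Split `flat` into consecutive chunks, one per shape, and reshape each."""
    if len(flat) != sum(prod(s) for s in shapes):
        raise ValueError('flat length does not match the total size of shapes')
    params, offset = [], 0
    for shape in shapes:
        size = prod(shape)
        params.append(_reshape(flat[offset:offset + size], shape))
        offset += size
    return params


# Hypothetical shapes: a 2x3 weight matrix followed by a bias of length 3.
shapes = [(2, 3), (3,)]
params = unflatten(list(range(9)), shapes)
```

The order of the shapes determines which slice of the flat vector each parameter receives, so the shapes must be listed in the same order used when the vector was flattened.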

class GaussianLSTMModel(output_dim, hidden_dim=32, name=None, hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), recurrent_nonlinearity=tf.nn.sigmoid, recurrent_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), hidden_state_init=tf.zeros_initializer(), hidden_state_init_trainable=False, cell_state_init=tf.zeros_initializer(), cell_state_init_trainable=False, forget_bias=True, learn_std=True, init_std=1.0, std_share_network=False, layer_normalization=False)¶

Bases: garage.tf.models.model.Model

Gaussian LSTM Model.

A model represented by a Gaussian distribution which is parameterized by a long short-term memory (LSTM) network.

Parameters

output_dim (int) – Output dimension of the model.
hidden_dim (int) – Hidden dimension for LSTM cell for mean.
name (str) – Model name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
recurrent_nonlinearity (callable) – Activation function for recurrent layers. It should return a tf.Tensor. Set it to None to maintain a linear activation.
recurrent_w_init (callable) – Initializer function for the weight of recurrent layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
hidden_state_init (callable) – Initializer function for the initial hidden state. The function should return a tf.Tensor.
hidden_state_init_trainable (bool) – Bool for whether the initial hidden state is trainable.
cell_state_init (callable) – Initializer function for the initial cell state. The function should return a tf.Tensor.
cell_state_init_trainable (bool) – Bool for whether the initial cell state is trainable.
forget_bias (bool) – If True, add 1 to the bias of the forget gate at initialization. It’s used to reduce the scale of forgetting at the beginning of the training.
learn_std (bool) – Whether std is trainable.
init_std (float) – Initial value for std.
std_share_network (bool) – Boolean for whether mean and std share the same network.
layer_normalization (bool) – Bool for using layer normalization or not.
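
To see why adding 1 to the forget-gate bias reduces forgetting early in training, consider the gate activation when its pre-activation is close to zero, which is roughly the case with zero-initialized biases and small weights. The sketch below is a plain-Python illustration that assumes the default sigmoid recurrent nonlinearity; the variable names and the example cell state are hypothetical, and it is not garage code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# The forget gate scales how much of the previous cell state is kept.
gate_without_bias = sigmoid(0.0)        # pre-activation near zero
gate_with_bias = sigmoid(0.0 + 1.0)     # same pre-activation plus a bias of 1

prev_cell_state = 2.0
kept_without_bias = gate_without_bias * prev_cell_state
kept_with_bias = gate_with_bias * prev_cell_state
```

With the extra bias the gate starts closer to 1, so more of the previous cell state survives each step at the start of training.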

network_input_spec(self)¶

Network input spec.

Returns: Name of the model inputs, in order.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: Name of the model outputs, in order.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks share the same fixed variable scope to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]
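
The naming rule for build() can be pictured with a small registry: each call registers a network under a name, and a repeated name is rejected with a ValueError. The class below is a hypothetical stand-in written for illustration. It does not use TensorFlow, the placeholder strings stand in for tensors, and it is not garage's Model implementation.

```python
class ToyModel:
    """Hypothetical stand-in that tracks built networks by name."""

    def __init__(self, name):
        self.name = name
        self._networks = {}

    def build(self, *inputs, name=None):
        network_name = name or 'default'
        if network_name in self._networks:
            raise ValueError(
                'Network {} already exists!'.format(network_name))
        # A real model would construct graph ops here; we only record inputs.
        self._networks[network_name] = inputs
        return inputs


model = ToyModel('policy')
model.build('obs_ph')                    # creates the 'default' network
model.build('next_obs_ph', name='next')  # a second network with a new name

try:
    model.build('other_ph', name='next')
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```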

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only applies to recurrent modules. do_resets only applies to vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.
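
A sketch of how a boolean do_resets mask could be applied to per-environment recurrent states. It uses plain Python lists rather than numpy.ndarray, `reset_states` is a hypothetical helper, and treating do_resets=None as "reset everything" is an assumption made for this example rather than documented behavior.

```python
def reset_states(states, init_state, do_resets=None):
    """Return states with the flagged entries replaced by a copy of init_state."""
    if do_resets is None:
        do_resets = [True] * len(states)   # assumption: reset all by default
    if len(do_resets) != len(states):
        raise ValueError('do_resets must have one entry per state')
    return [list(init_state) if flag else state
            for state, flag in zip(states, do_resets)]


# Three vectorized environments, each with a 2-dimensional hidden state.
hidden = [[0.5, -0.2], [0.1, 0.9], [-0.7, 0.3]]
hidden = reset_states(hidden, init_state=[0.0, 0.0],
                      do_resets=[True, False, True])
```

Only the first and third states are reset here; the second environment keeps its hidden state because its flag is False.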

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray
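
One way to picture how get_param_values() relates to get_param_shapes() is as a concatenation of every variable's values into one vector. That the values come back as a single flat vector is an assumption here, suggested by the existence of flat_to_params() but not stated above. The sketch uses nested Python lists instead of NumPy arrays, and `flatten` and the example values are hypothetical.

```python
def flatten(nested):
    """Flatten arbitrarily nested lists of numbers into one flat list."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


# Hypothetical parameter values: a 2x2 weight matrix and a bias of length 2.
weight = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -0.5]

flat_values = flatten([weight, bias])
```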

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class GaussianMLPModel(output_dim, name=None, hidden_sizes=(32, 32), hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), learn_std=True, adaptive_std=False, std_share_network=False, init_std=1.0, min_std=1e-06, max_std=None, std_hidden_sizes=(32, 32), std_hidden_nonlinearity=tf.nn.tanh, std_hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), std_hidden_b_init=tf.zeros_initializer(), std_output_nonlinearity=None, std_output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), std_parameterization='exp', layer_normalization=False)¶

Bases: garage.tf.models.model.Model

Gaussian MLP Model.

A model represented by a Gaussian distribution which is parameterized by a multilayer perceptron (MLP).

Parameters

output_dim (int) – Output dimension of the model.
name (str) – Model name, also the variable scope.
hidden_sizes (list[int]) – Output dimension of dense layer(s) for the MLP for mean. For example, (32, 32) means the MLP consists of two hidden layers, each with 32 hidden units.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
learn_std (bool) – Whether std is trainable.
init_std (float) – Initial value for std.
adaptive_std (bool) – Whether std is computed by a neural network. If False, it is a parameter.
std_share_network (bool) – Boolean for whether mean and std share the same network.
std_hidden_sizes (list[int]) – Output dimension of dense layer(s) for the MLP for std. For example, (32, 32) means the MLP consists of two hidden layers, each with 32 hidden units.
min_std (float) – If not None, the std is at least the value of min_std, to avoid numerical issues.
max_std (float) – If not None, the std is at most the value of max_std, to avoid numerical issues.
std_hidden_nonlinearity (callable) – Nonlinearity for each hidden layer in the std network.
std_hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s) in the std network. The function should return a tf.Tensor.
std_hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s) in the std network. The function should return a tf.Tensor.
std_output_nonlinearity (callable) – Activation function for output dense layer in the std network. It should return a tf.Tensor. Set it to None to maintain a linear activation.
std_output_w_init (callable) – Initializer function for the weight of output dense layer(s) in the std network.
std_parameterization (str) – How the std should be parametrized. There are two options:
- exp: the logarithm of the std is stored, and an exponential transformation is applied.
- softplus: the std is computed as log(1+exp(x)).
layer_normalization (bool) – Bool for using layer normalization or not.
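
The two std parameterizations, together with the min_std and max_std bounds, can be sketched as follows. This is a plain-Python illustration under the assumption that the bounds are applied to the resulting std. `compute_std` is a hypothetical helper, and garage's implementation may apply the bounds differently (for example, in parameter space).

```python
import math


def compute_std(param, parameterization='exp', min_std=1e-6, max_std=None):
    """Map an unconstrained parameter to a positive std, then clip it."""
    if parameterization == 'exp':
        std = math.exp(param)                  # param stores log(std)
    elif parameterization == 'softplus':
        std = math.log1p(math.exp(param))      # log(1 + exp(param))
    else:
        raise ValueError('unknown parameterization: ' + parameterization)
    if min_std is not None:
        std = max(std, min_std)
    if max_std is not None:
        std = min(std, max_std)
    return std


# For an initial std of 1.0, the stored parameter differs per parameterization.
exp_param = math.log(1.0)
softplus_param = math.log(math.exp(1.0) - 1.0)

std_exp = compute_std(exp_param, 'exp')
std_softplus = compute_std(softplus_param, 'softplus')
std_clipped = compute_std(5.0, 'exp', max_std=2.0)
```

Both parameterizations keep the std positive for any real-valued parameter, which is why the network can output or store an unconstrained value.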

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks share the same fixed variable scope to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only applies to recurrent modules. do_resets only applies to vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class GRUModel(output_dim, hidden_dim, name=None, hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), recurrent_nonlinearity=tf.nn.sigmoid, recurrent_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), hidden_state_init=tf.zeros_initializer(), hidden_state_init_trainable=False, layer_normalization=False)¶

Bases: garage.tf.models.model.Model

GRU Model.

Parameters

output_dim (int) – Dimension of the network output.
hidden_dim (int) – Hidden dimension for GRU cell.
name (str) – Model name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
recurrent_nonlinearity (callable) – Activation function for recurrent layers. It should return a tf.Tensor. Set it to None to maintain a linear activation.
recurrent_w_init (callable) – Initializer function for the weight of recurrent layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
hidden_state_init (callable) – Initializer function for the initial hidden state. The function should return a tf.Tensor.
hidden_state_init_trainable (bool) – Bool for whether the initial hidden state is trainable.
layer_normalization (bool) – Bool for using layer normalization or not.
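
To show where hidden_nonlinearity and recurrent_nonlinearity enter, here is a single GRU step on scalars in plain Python. It is a sketch under the assumption that the recurrent nonlinearity drives the gates and the hidden nonlinearity drives the candidate state, as in the usual GRU formulation. The weight names are hypothetical and this is not garage's implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, w, hidden_nonlinearity=math.tanh,
             recurrent_nonlinearity=sigmoid):
    """One scalar GRU step; `w` maps hypothetical weight names to floats."""
    # Update and reset gates use the recurrent nonlinearity.
    z = recurrent_nonlinearity(w['wz'] * x + w['uz'] * h_prev + w['bz'])
    r = recurrent_nonlinearity(w['wr'] * x + w['ur'] * h_prev + w['br'])
    # The candidate state uses the hidden nonlinearity.
    h_cand = hidden_nonlinearity(
        w['wh'] * x + w['uh'] * (r * h_prev) + w['bh'])
    # Blend the previous state and the candidate (one common convention).
    return z * h_prev + (1.0 - z) * h_cand


weights = {name: 0.0 for name in
           ('wz', 'uz', 'bz', 'wr', 'ur', 'br', 'wh', 'uh', 'bh')}
weights['wh'] = 1.0

# With zero gate weights the update gate is 0.5, so the new state is the
# average of h_prev and tanh(x).
h_new = gru_step(x=1.0, h_prev=0.0, w=weights)
```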

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks share the same fixed variable scope to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only applies to recurrent modules. do_resets only applies to vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class LSTMModel(output_dim, hidden_dim, name=None, hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), recurrent_nonlinearity=tf.nn.sigmoid, recurrent_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), hidden_state_init=tf.zeros_initializer(), hidden_state_init_trainable=False, cell_state_init=tf.zeros_initializer(), cell_state_init_trainable=False, forget_bias=True, layer_normalization=False)¶

Bases: garage.tf.models.model.Model

LSTM Model.

Parameters

output_dim (int) – Dimension of the network output.
hidden_dim (int) – Hidden dimension for LSTM cell.
name (str) – Model name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
recurrent_nonlinearity (callable) – Activation function for recurrent layers. It should return a tf.Tensor. Set it to None to maintain a linear activation.
recurrent_w_init (callable) – Initializer function for the weight of recurrent layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
hidden_state_init (callable) – Initializer function for the initial hidden state. The function should return a tf.Tensor.
hidden_state_init_trainable (bool) – Bool for whether the initial hidden state is trainable.
cell_state_init (callable) – Initializer function for the initial cell state. The function should return a tf.Tensor.
cell_state_init_trainable (bool) – Bool for whether the initial cell state is trainable.
forget_bias (bool) – If True, add 1 to the bias of the forget gate at initialization. It’s used to reduce the scale of forgetting at the beginning of the training.
layer_normalization (bool) – Bool for using layer normalization or not.

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks share the same fixed variable scope to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only applies to recurrent modules. do_resets only applies to vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶

Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class MLPDuelingModel(output_dim, name=None, hidden_sizes=(32, 32), hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), layer_normalization=False)¶

Bases: garage.tf.models.model.Model

MLP Model with dueling network structure.

Parameters

output_dim (int) – Dimension of the network output.
hidden_sizes (list[int]) – Output dimension of dense layer(s). For example, (32, 32) means this MLP consists of two hidden layers, each with 32 hidden units.
name (str) – Model name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
layer_normalization (bool) – Bool for using layer normalization or not.
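
A dueling network typically splits into a state-value stream and an advantage stream, then recombines them into Q-values. The sketch below shows the common mean-subtracted aggregation in plain Python. The function name and inputs are hypothetical, and the exact aggregation used by MLPDuelingModel is not stated here, so treat this as an illustration of the general technique only.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q = V + (A - mean(A))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


# Hypothetical stream outputs for one state with three actions.
q_values = dueling_q_values(state_value=2.0, advantages=[1.0, 2.0, 3.0])
```

Subtracting the mean advantage makes the decomposition identifiable: shifting every advantage by a constant leaves the Q-values unchanged.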

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

All Networks share the same fixed variable scope to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class MLPMergeModel(output_dim, name='MLPMergeModel', hidden_sizes=(32, 32), concat_layer=-2, hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), layer_normalization=False)¶

Bases: garage.tf.models.model.Model

MLP Merge Model.

Parameters

output_dim (int) – Dimension of the network output.
name (str) – Model name, also the variable scope.
hidden_sizes (list[int]) – Output dimension of dense layer(s). For example, (32, 32) means this MLP consists of two hidden layers, each with 32 hidden units.
concat_layer (int) – The index of the layer at which to concatenate input_var2 with the network. Indexing works like standard Python list indexing: an index of 0 refers to the input layer (input_var1), while an index of -1 refers to the last hidden layer. The default, -2, refers to the second layer from the end.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
layer_normalization (bool) – Bool for using layer normalization or not.
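The concat_layer indexing rule can be illustrated with an ordinary Python list. This is a minimal sketch, assuming the indexable layers are the input layer followed by the hidden layers in order; resolve_concat_layer and the layer labels are hypothetical names used only for illustration, not garage functions.

```python
def resolve_concat_layer(concat_layer, hidden_sizes):
    """Return the label of the layer that concat_layer points at.

    Assumption: the indexable sequence is
    [input layer, hidden_1, ..., hidden_n], indexed like a Python list.
    """
    layers = ["input_var1"]
    layers += ["hidden_{}".format(i + 1) for i in range(len(hidden_sizes))]
    # Negative indices count from the end, exactly as for any list.
    return layers[concat_layer]


print(resolve_concat_layer(0, (32, 32)))   # the input layer
print(resolve_concat_layer(-1, (32, 32)))  # the last hidden layer
print(resolve_concat_layer(-2, (32, 32)))  # the default: second from the end
```

With hidden_sizes=(32, 32) the default of -2 therefore selects the first hidden layer under this assumption.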

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class MLPModel(output_dim, name='MLPModel', hidden_sizes=(32, 32), hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), layer_normalization=False)¶

Bases: garage.tf.models.model.Model

MLP Model.

Parameters

output_dim (int) – Dimension of the network output.
hidden_sizes (list[int]) – Output dimension of dense layer(s). For example, (32, 32) means this MLP consists of two hidden layers, each with 32 hidden units.
name (str) – Model name, also the variable scope.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
layer_normalization (bool) – Bool for using layer normalization or not.
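To make the role of hidden_sizes and output_dim concrete, the sketch below lists the weight and bias shapes an MLP of this kind would need. It assumes plain dense layers with a kernel of shape (inputs, outputs) plus one bias per output unit, and it ignores any extra variables that layer normalization might add. The helper names mlp_layer_shapes and count_parameters and the input dimension of 4 are hypothetical.

```python
def mlp_layer_shapes(input_dim, hidden_sizes, output_dim):
    """Return a (kernel_shape, bias_shape) pair for each dense layer."""
    dims = [input_dim, *hidden_sizes, output_dim]
    return [((d_in, d_out), (d_out,)) for d_in, d_out in zip(dims[:-1], dims[1:])]


def count_parameters(layer_shapes):
    """Total number of scalar weights and biases across all layers."""
    return sum(kernel[0] * kernel[1] + bias[0] for kernel, bias in layer_shapes)


# Two hidden layers of 32 units each, followed by the output layer.
shapes = mlp_layer_shapes(input_dim=4, hidden_sizes=(32, 32), output_dim=2)
for kernel_shape, bias_shape in shapes:
    print(kernel_shape, bias_shape)
print(count_parameters(shapes))
```

Under these assumptions, hidden_sizes=(32, 32) yields three dense layers in total: two hidden layers and one output layer of width output_dim.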

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class BaseModel¶

Bases: abc.ABC

Inheritance diagram of garage.tf.models.BaseModel

Interface-only abstract class for models.

A Model contains the structure/configuration of a set of computation graphs, and can be understood as a set of networks. Using a model requires calling build() with a given input placeholder, which can be either a tf.compat.v1.placeholder or the output of another model. This makes it much easier to compose complex models from simple ones.

Examples

    model = SimpleModel(output_dim=2)
    # To use a model, first create a placeholder.
    # In the case of TensorFlow, we create a tf.compat.v1.placeholder.
    input_ph = tf.compat.v1.placeholder(tf.float32, shape=(None, 2))

    # Building the model
    output = model.build(input_ph)

    # We can also pass the output of a model to another model.
    # Here we pass the output from the above SimpleModel object.
    model_2 = ComplexModel(output_dim=2)
    output_2 = model_2.build(output)

build(self, *inputs, name=None)¶

Output of model with the given input placeholder(s).

This function is implemented by subclasses to create their computation graphs, which will be managed by Model. Generally, subclasses should implement build() directly.

Parameters

inputs (object) – Input(s) for the model.
name (str) – Name of the model.

Returns

Output(s) of the model.

Return type

list[tf.Tensor]

property name(self)¶: Name for this Model.

property parameters(self)¶

Parameters of the Model.

The output of a model is determined by its parameters. These could be the weights of a neural network model or the parameters of a loss function model.

Returns: Parameters.
Return type: list[tf.Tensor]

class Model(name)¶

Bases: garage.tf.models.model.BaseModel, garage.tf.models.module.Module

Inheritance diagram of garage.tf.models.Model

Model class for TensorFlow.

A TfModel only contains the structure/configuration of the underlying computation graphs. All connectivity information is held in the Network class. A TfModel contains zero or more Networks.

When a Network is created, it reuses the parameters from the model. If a Network is built without being given a name, the name “default” is used.

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

Pickling is handled automatically. The target weights should be assigned to self._default_parameters before pickling, so that the newly created model can check whether target weights exist. When unpickled, the deserialized model loads the weights from self._default_parameters.
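The snapshot-and-restore idea behind this pickling behaviour can be sketched with Python's standard __getstate__ and __setstate__ hooks, which pickle.dumps() and pickle.loads() invoke. This is a toy illustration and not garage's implementation: SnapshotToyModel and its _weights list are hypothetical stand-ins for a model and its live TensorFlow variables, and the toy restores the weights immediately, whereas the real class may defer loading.

```python
class SnapshotToyModel:
    """Toy model whose weights survive serialization via a snapshot."""

    def __init__(self, name, weights):
        self.name = name
        self._weights = list(weights)     # stands in for live TF variables
        self._default_parameters = None   # filled in when serializing

    def __getstate__(self):
        # Hook invoked by pickle.dumps(): store a snapshot, drop live state.
        state = self.__dict__.copy()
        state["_default_parameters"] = list(self._weights)
        del state["_weights"]
        return state

    def __setstate__(self, state):
        # Hook invoked by pickle.loads(): restore weights from the snapshot.
        self.__dict__.update(state)
        self._weights = list(self._default_parameters)


original = SnapshotToyModel("toy", [0.1, 0.2, 0.3])
# Drive the hooks by hand, in the order pickle would call them.
state = original.__getstate__()
restored = SnapshotToyModel.__new__(SnapshotToyModel)
restored.__setstate__(state)
print(restored.name, restored._weights)
```

The serialized state carries only the snapshot, so the restored object can rebuild its weights without access to the original session.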

The design is illustrated as the following:

     input_1                      input_2
        |                            |
============== Model (TfModel)===================
|       |                            |          |
|       |            Parameters      |          |
|    =============  /           \  ============ |
|    |  default  | /             \ | Network2 | |
|    | (Network) |/               \|(Network) | |
|    =============                 ============ |
|       |                            |          |
=================================================
        |                            |
        |                            |
(outputs from ‘default’ networks)    |
                    outputs from [‘Network2’] network

Examples are also available in tests/garage/tf/models/test_model.

Parameters: name (str) – Name of the model. It will also become the variable scope of the model. Every model should have a unique name.

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]
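The naming rules for build() can be mimicked without TensorFlow. The sketch below is a toy stand-in, not garage's Model: ToyModel, its single weight parameter, and the error message wording are all hypothetical. It only illustrates three points from the text: an unnamed network is registered as "default", every network reads the same shared parameters, and reusing a name raises ValueError.

```python
class ToyModel:
    """Toy stand-in for a model that owns parameters and named networks."""

    def __init__(self, name, weight):
        self.name = name
        self.parameters = {"weight": weight}  # shared by every network
        self.networks = {}

    def build(self, *inputs, name=None):
        network_name = name or "default"
        if network_name in self.networks:
            raise ValueError("Network {} already exists in model {}".format(
                network_name, self.name))
        # Each network keeps its own inputs and outputs but reads the
        # parameters owned by the model.
        outputs = [x * self.parameters["weight"] for x in inputs]
        self.networks[network_name] = {"inputs": list(inputs),
                                       "outputs": outputs}
        return outputs


model = ToyModel("toy", weight=2.0)
out_default = model.build(1.0, 2.0)           # registered as "default"
out_second = model.build(3.0, name="Network2")
try:
    model.build(5.0)                          # "default" is already taken
except ValueError as err:
    duplicate_error = str(err)
print(out_default, out_second, duplicate_error)
```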

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class Module(name)¶

Bases: abc.ABC

Inheritance diagram of garage.tf.models.Module

A module that builds on top of model.

Parameters: name (str) – Module name, also the variable scope.

property name(self)¶: str: Name of this module.

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.
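The effect of do_resets on a vectorized recurrent module can be sketched as a per-entry mask over the stored states. This is a minimal illustration in plain Python: the function reset_states, the zero initial state, and the choice to reset everything when do_resets is None are assumptions, not behaviour confirmed by this reference.

```python
def reset_states(states, do_resets=None, initial_state=0.0):
    """Return new states, resetting the entries flagged in do_resets.

    Assumption: do_resets=None means every state is reset.
    """
    if do_resets is None:
        do_resets = [True] * len(states)
    if len(do_resets) != len(states):
        raise ValueError("do_resets must have one flag per state")
    return [initial_state if flag else state
            for state, flag in zip(states, do_resets)]


# Three parallel environments; only the first and third episodes ended.
hidden_states = [0.3, -1.2, 0.8]
print(reset_states(hidden_states, [True, False, True]))
```

Entries whose flag is False keep their recurrent state, which lets unfinished episodes continue while finished ones start fresh.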

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]
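Unflattening can be pictured as slicing the flat vector into consecutive chunks and reshaping each chunk. The sketch below uses nested Python lists instead of np.ndarray so that it runs without third-party packages. It assumes row-major ordering and that the shapes come in the same order as get_param_shapes() would report them; reshape and flat_to_params_sketch are hypothetical helper names.

```python
from math import prod


def reshape(flat, shape):
    """Reshape a flat list into nested lists of the given shape (row-major)."""
    if len(shape) == 0:
        return flat[0]  # scalar parameter
    if len(shape) == 1:
        return list(flat)
    step = prod(shape[1:])  # number of elements in each sub-block
    return [reshape(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def flat_to_params_sketch(flattened_params, shapes):
    """Split a flat vector into chunks and reshape each chunk to its shape."""
    total = sum(prod(shape) for shape in shapes)
    if total != len(flattened_params):
        raise ValueError("expected {} values, got {}".format(
            total, len(flattened_params)))
    params, offset = [], 0
    for shape in shapes:
        size = prod(shape)
        params.append(reshape(flattened_params[offset:offset + size], shape))
        offset += size
    return params


# A (2, 3) kernel followed by a (2,) bias.
shapes = [(2, 3), (2,)]
flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
params = flat_to_params_sketch(flat, shapes)
print(params)
```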

class StochasticModule(name)¶

Bases: garage.tf.models.module.Module

Stochastic Module.

property distribution(self)¶: Distribution.

property name(self)¶: str: Name of this module.

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states should be reset. The length of do_resets should equal the length of inputs.

Parameters: do_resets (numpy.ndarray) – Bool array indicating which states to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]

class NormalizedInputMLPModel(input_shape, output_dim, name='NormalizedInputMLPModel', hidden_sizes=(32, 32), hidden_nonlinearity=tf.nn.relu, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), layer_normalization=False)¶

Bases: garage.tf.models.mlp_model.MLPModel

Inheritance diagram of garage.tf.models.NormalizedInputMLPModel

NormalizedInputMLPModel based on garage.tf.models.Model class.

This class normalizes the inputs and passes the normalized input to an MLP model, which can be used to perform linear regression on the outputs.

Parameters

input_shape (tuple[int]) – Input shape of the training data.
output_dim (int) – Output dimension of the model.
name (str) – Model name, also the variable scope.
hidden_sizes (list[int]) – Output dimension of dense layer(s). For example, (32, 32) means the MLP consists of two hidden layers, each with 32 hidden units.
hidden_nonlinearity (callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.
hidden_w_init (callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.
hidden_b_init (callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.
output_nonlinearity (callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.
output_w_init (callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.
output_b_init (callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.
layer_normalization (bool) – Bool for using layer normalization or not.
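The input normalization step can be sketched as a per-feature shift and scale applied before the MLP. This is a minimal illustration, assuming the usual (x - mean) / std form with statistics supplied by the caller. The function normalize_inputs and the argument names x_mean and x_std are hypothetical, and the reference text does not say how the statistics are obtained.

```python
def normalize_inputs(batch, x_mean, x_std):
    """Normalize each feature of each row as (x - mean) / std."""
    return [[(x - mean) / std for x, mean, std in zip(row, x_mean, x_std)]
            for row in batch]


# Two samples with two features each; statistics are given per feature.
batch = [[2.0, 10.0], [4.0, 30.0]]
normalized = normalize_inputs(batch, x_mean=[3.0, 20.0], x_std=[1.0, 10.0])
print(normalized)
```

The normalized rows would then be fed to the MLP in place of the raw inputs.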

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]
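
The naming rule described for build() can be pictured with a small, framework-free sketch. ToyModel below is a hypothetical stand-in, not garage's Model: it only mimics the documented behavior that every network built from one model shares the same parameters and that reusing a network name raises ValueError. Treating an omitted name as 'default' is an assumption of this sketch.

```python
class ToyModel:
    """Toy stand-in for a model that builds named networks.

    Hypothetical class for illustration only; not garage's Model.
    """

    def __init__(self, name):
        self.name = name
        self._weights = None  # created once, then shared by every network
        self._networks = {}

    def build(self, *inputs, name=None):
        # Assumption: an omitted name maps to the 'default' network.
        network_name = name or 'default'
        if network_name in self._networks:
            raise ValueError(
                'Network {} already exists!'.format(network_name))
        if self._weights is None:
            # Parameters are initialized on the first build() call only.
            self._weights = [2.0]
        outputs = [self._weights[0] * x for x in inputs]
        self._networks[network_name] = outputs
        return outputs


model = ToyModel('toy')
first = model.build(1.0, 2.0)             # builds the 'default' network
second = model.build(3.0, name='second')  # reuses the same weights

try:
    model.build(4.0, name='second')       # same name again
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```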

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

property input(self)¶

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns: Default input of the model.
Return type: tf.Tensor

property output(self)¶

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns: Default output of the model.
Return type: tf.Tensor

property inputs(self)¶

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns: Default inputs of the model.
Return type: list[tf.Tensor]

property outputs(self)¶

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns: Default outputs of the model.
Return type: list[tf.Tensor]

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states are to be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Boolean array indicating which states are to be reset.
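
The masking behavior of do_resets can be sketched without any framework. The helper reset_states below is hypothetical and not part of garage; plain lists stand in for the array of internal states, and resetting every state when do_resets is None is an assumption made for this sketch.

```python
def reset_states(states, initial_state, do_resets=None):
    """Return a copy of `states` with the selected entries reset.

    Hypothetical helper for illustration only; not part of garage.
    """
    if do_resets is None:
        # Assumption for this sketch: no mask means reset every state.
        do_resets = [True] * len(states)
    if len(do_resets) != len(states):
        raise ValueError('do_resets must have one entry per state.')
    # Keep a state where its flag is False, replace it where it is True.
    return [initial_state if reset else state
            for state, reset in zip(states, do_resets)]


states = [0.7, -0.2, 1.5]
new_states = reset_states(states, 0.0, do_resets=[True, False, True])
```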

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]
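
The unflattening step can be sketched in plain Python. The helpers reshape and unflatten_params below are hypothetical and not garage's implementation; nested lists stand in for NumPy arrays so that the example has no dependencies, and the shapes are passed in explicitly rather than read from a model.

```python
from math import prod


def reshape(values, shape):
    """Reshape a flat list into nested lists with the given shape."""
    if len(shape) <= 1:
        return list(values)
    step = prod(shape[1:])  # number of elements in one row of this axis
    return [reshape(values[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def unflatten_params(flat, shapes):
    """Split `flat` into consecutive chunks and reshape each one.

    Hypothetical helper for illustration only; not part of garage.
    """
    params, start = [], 0
    for shape in shapes:
        size = prod(shape)
        params.append(reshape(flat[start:start + size], shape))
        start += size
    if start != len(flat):
        raise ValueError('Shapes do not account for every flat value.')
    return params


flat = [1, 2, 3, 4, 5, 6, 7, 8]
# A 2 x 3 weight matrix followed by a bias vector of length 2.
params = unflatten_params(flat, [(2, 3), (2,)])
```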

class Sequential(*models, name=None)¶

Bases: garage.tf.models.model.Model

Sequential Model.

Parameters

name (str) – Model name, also the variable scope.
models (list[garage.tf.models.Model]) – The models to be connected in sequential order.
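
Connecting models in sequential order means that each model consumes the output of the one before it. The sketch below shows that idea with plain callables. The sequential helper and the two toy functions are hypothetical and are not garage's Sequential, which operates on tf.Tensor inputs and garage Model instances.

```python
def sequential(*models):
    """Chain callables so each one consumes the previous one's output.

    Hypothetical helper for illustration only; not garage's Sequential.
    """
    def run(value):
        for model in models:
            value = model(value)
        return value
    return run


def double(x):
    return 2 * x


def add_one(x):
    return x + 1


pipeline = sequential(double, add_one)
result = pipeline(3)  # computes add_one(double(3))
```

The order of the arguments matters: swapping double and add_one gives a different result for the same input.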

property input(self)¶: tf.Tensor: input of the model by default.

property output(self)¶: tf.Tensor: output of the model by default.

property inputs(self)¶: tf.Tensor: inputs of the model by default.

property outputs(self)¶: tf.Tensor: outputs of the model by default.

build(self, *inputs, name=None)¶

Build a Network with the given input(s).

Do not call tf.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Different Networks must have a unique name.

Parameters

inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).
name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]

network_input_spec(self)¶

Network input spec.

Returns: List of key(str) for the network inputs.
Return type: list[str]

network_output_spec(self)¶

Network output spec.

Returns: List of key(str) for the network outputs.
Return type: list[str]

property parameters(self)¶

Parameters of the model.

Returns: Parameters
Return type: np.ndarray

property name(self)¶

Name (str) of the model.

This is also the variable scope of the model.

Returns: Name of the model.
Return type: str

reset(self, do_resets=None)¶

Reset the module.

This only affects recurrent modules. do_resets only affects vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states are to be reset. The length of do_resets should be equal to the length of inputs.

Parameters: do_resets (numpy.ndarray) – Boolean array indicating which states are to be reset.

property state_info_specs(self)¶

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)¶

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)¶: Clean up operation.

get_trainable_vars(self)¶

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)¶

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)¶

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)¶

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)¶

Get parameter shapes.

Returns: A list of variable shapes.
Return type: List[tuple]

get_param_values(self)¶

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)¶

Set param values.

Parameters: param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)¶

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]
