garage.tf.baselines.gaussian_cnn_baseline

A baseline based on a GaussianCNN model.

class GaussianCNNBaseline(env_spec, filters, strides, padding, hidden_sizes, hidden_nonlinearity=tf.nn.tanh, hidden_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), hidden_b_init=tf.zeros_initializer(), output_nonlinearity=None, output_w_init=tf.initializers.glorot_uniform(seed=deterministic.get_tf_seed_stream()), output_b_init=tf.zeros_initializer(), name='GaussianCNNBaseline', learn_std=True, init_std=1.0, adaptive_std=False, std_share_network=False, std_filters=(), std_strides=(), std_padding='SAME', std_hidden_sizes=(), std_hidden_nonlinearity=None, std_output_nonlinearity=None, layer_normalization=False, normalize_inputs=True, normalize_outputs=True, subsample_factor=1.0, optimizer=None, optimizer_args=None, use_trust_region=True, max_kl_step=0.01)

Bases: garage.tf.baselines.gaussian_cnn_baseline_model.GaussianCNNBaselineModel, garage.np.baselines.baseline.Baseline

Fits a Gaussian distribution to the outputs of a CNN.
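
As a rough illustration of what fitting a Gaussian to network outputs means, the sketch below evaluates the log-density of a scalar target under a Gaussian parameterized by a mean and a log standard deviation, and averages the negative log-likelihood over a batch. The function names are hypothetical and not part of garage. The real objective is built inside the TensorFlow graph and may differ in detail.

```python
import math


def gaussian_log_prob(y, mean, log_std):
    """Log-density of a scalar y under N(mean, exp(log_std) ** 2)."""
    std = math.exp(log_std)
    z = (y - mean) / std
    return -0.5 * z * z - log_std - 0.5 * math.log(2.0 * math.pi)


def negative_log_likelihood(ys, means, log_stds):
    """Average negative log-likelihood over a batch of scalar targets."""
    total = sum(
        gaussian_log_prob(y, m, s) for y, m, s in zip(ys, means, log_stds))
    return -total / len(ys)
```

A prediction whose mean is closer to the target yields a lower negative log-likelihood, which is the quantity a fit of this kind would try to reduce.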

Parameters
  • env_spec (garage.envs.env_spec.EnvSpec) – Environment specification.

  • filters (Tuple[Tuple[int, Tuple[int, int]], ...]) – Number and dimension of filters. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The first layer has 3 filters, each of shape (3 x 5), while the second layer has 32 filters, each of shape (3 x 3).

  • strides (tuple[int]) – The stride of the sliding window. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.

  • padding (str) – The type of padding algorithm to use, either ‘SAME’ or ‘VALID’.

  • hidden_sizes (list[int]) – Output dimension of the dense layer(s) in the CNN for the mean. For example, (32, 32) means the network consists of two dense layers, each with 32 hidden units.

  • hidden_nonlinearity (Callable) – Activation function for intermediate dense layer(s). It should return a tf.Tensor. Set it to None to maintain a linear activation.

  • hidden_w_init (Callable) – Initializer function for the weight of intermediate dense layer(s). The function should return a tf.Tensor.

  • hidden_b_init (Callable) – Initializer function for the bias of intermediate dense layer(s). The function should return a tf.Tensor.

  • output_nonlinearity (Callable) – Activation function for output dense layer. It should return a tf.Tensor. Set it to None to maintain a linear activation.

  • output_w_init (Callable) – Initializer function for the weight of output dense layer(s). The function should return a tf.Tensor.

  • output_b_init (Callable) – Initializer function for the bias of output dense layer(s). The function should return a tf.Tensor.

  • name (str) – Name of this model, also used as its variable scope.

  • learn_std (bool) – Whether to train the standard deviation parameter of the Gaussian distribution.

  • init_std (float) – Initial standard deviation for the Gaussian distribution.

  • adaptive_std (bool) – Whether to use a neural network to learn the standard deviation of the Gaussian distribution. If False, the standard deviation is learned as a parameter that is not conditioned on the inputs.

  • std_share_network (bool) – Whether the mean and standard deviation models share a CNN. If True, each is a head on a single body network. Otherwise, the parameters are estimated from the outputs of two independent networks.

  • std_filters (Tuple[Tuple[int, Tuple[int, int]], ...]) – Number and dimension of filters in the std network. For example, ((3, (3, 5)), (32, (3, 3))) means there are two convolutional layers. The first layer has 3 filters, each of shape (3 x 5), while the second layer has 32 filters, each of shape (3 x 3).

  • std_strides (tuple[int]) – The stride of the sliding window in the std network. For example, (1, 2) means there are two convolutional layers. The stride of the filter for the first layer is 1 and that of the second layer is 2.

  • std_padding (str) – The type of padding algorithm to use in std network, either ‘SAME’ or ‘VALID’.

  • std_hidden_sizes (list[int]) – Output dimension of the dense layer(s) in the CNN for the std. For example, (32, 32) means the network consists of two dense layers, each with 32 hidden units.

  • std_hidden_nonlinearity (Callable) – Nonlinearity for each hidden layer in the std network.

  • std_output_nonlinearity (Callable) – Activation function for output dense layer in the std network. It should return a tf.Tensor. Set it to None to maintain a linear activation.

  • layer_normalization (bool) – Whether to use layer normalization.

  • normalize_inputs (bool) – Whether to normalize inputs.

  • normalize_outputs (bool) – Whether to normalize outputs.

  • subsample_factor (float) – The factor to subsample the data. By default it is 1.0, which means using all the data.

  • optimizer (garage.tf.Optimizer) – Optimizer used for fitting the model.

  • optimizer_args (dict) – Arguments for the optimizer. Default is None, which means no arguments.

  • use_trust_region (bool) – Whether to use a KL-divergence constraint.

  • max_kl_step (float) – KL divergence constraint for each iteration, if use_trust_region is active.
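
To see how filters, strides, and padding interact, the following sketch traces the spatial size of an image through a stack of convolutional layers. It applies the standard TensorFlow output-size rules for 'SAME' and 'VALID' padding. The helper names and the 28 x 28 input size are hypothetical, chosen only for illustration, and are not part of garage.

```python
import math


def conv_output_size(input_size, filter_size, stride, padding):
    """Spatial output size of one convolution along one dimension."""
    if padding == 'SAME':
        return math.ceil(input_size / stride)
    if padding == 'VALID':
        return math.ceil((input_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def cnn_output_shape(input_hw, filters, strides, padding):
    """Trace (height, width, channels) through a stack of conv layers."""
    height, width = input_hw
    channels = None
    for (num_filters, (filter_h, filter_w)), stride in zip(filters, strides):
        height = conv_output_size(height, filter_h, stride, padding)
        width = conv_output_size(width, filter_w, stride, padding)
        channels = num_filters  # Each filter produces one output channel.
    return height, width, channels


# The example configuration from the parameter descriptions above.
example_filters = ((3, (3, 5)), (32, (3, 3)))
example_strides = (1, 2)
print(cnn_output_shape((28, 28), example_filters, example_strides, 'VALID'))
print(cnn_output_shape((28, 28), example_filters, example_strides, 'SAME'))
```

With 'VALID' padding the feature map shrinks at every layer, so the dense layers given by hidden_sizes receive a smaller flattened input than with 'SAME'.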

fit(self, paths)

Fit regressor based on paths.

Parameters

paths (dict[numpy.ndarray]) – Sample paths.
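
The sketch below shows one plausible way sample paths could be turned into regression inputs and targets, including the effect of subsample_factor. It assumes that paths is a sequence of dicts and that each dict holds 'observations' and 'returns' entries; both the key names and the helper function are assumptions for illustration, not a statement of garage's implementation. Plain lists stand in for numpy arrays.

```python
import random


def collect_training_data(paths, subsample_factor=1.0, seed=0):
    """Concatenate per-path data and optionally subsample it.

    Assumes each path is a dict with 'observations' and 'returns'.
    """
    xs = [obs for path in paths for obs in path['observations']]
    ys = [ret for path in paths for ret in path['returns']]
    if subsample_factor < 1.0:
        rng = random.Random(seed)
        num_kept = int(len(xs) * subsample_factor)
        # Sample indices with replacement, keeping xs and ys aligned.
        indices = [rng.randrange(len(xs)) for _ in range(num_kept)]
        xs = [xs[i] for i in indices]
        ys = [ys[i] for i in indices]
    return xs, ys
```

With the default subsample_factor of 1.0 every sample is used, matching the parameter description above.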

predict(self, paths)

Predict ys based on input xs.

Parameters

paths (dict[numpy.ndarray]) – Sample paths.

Returns

The predicted ys.

Return type

numpy.ndarray

clone_model(self, name)

Return a clone of the GaussianCNNBaselineModel.

It copies the configuration of the primitive and also the parameters.

Parameters

name (str) – Name of the newly created model. It must differ from the name of the source model if cloned under the same computational graph.

Returns

Newly cloned model.

Return type

garage.tf.baselines.GaussianCNNBaselineModel

property recurrent(self)

bool: If this module has a hidden state.

property env_spec(self)

Policy environment specification.

Returns

Environment specification.

Return type

garage.EnvSpec

network_output_spec(self)

Network output spec.

Returns

List of keys (str) for the network outputs.

Return type

list[str]

build(self, *inputs, name=None)

Build a Network with the given input(s).

Note: Do not call tf.compat.v1.global_variables_initializer() after building a model, as it will reassign random weights to the model. The parameters inside a model are initialized when build() is called.

It uses the same, fixed variable scope for all Networks, to ensure parameter sharing. Each Network must have a unique name.

Parameters
  • inputs (list[tf.Tensor]) – Tensor input(s), recommended to be positional arguments, for example, def build(self, state_input, action_input, name=None).

  • name (str) – Name of the model, which is also the name scope of the model.

Raises

ValueError – When a Network with the same name is already built.

Returns

Output tensors of the model with the given inputs.

Return type

list[tf.Tensor]
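
The build() contract described above (parameters created once and shared, one Network per unique name, ValueError on a repeated name) can be mimicked with a toy class. ToyModel is purely hypothetical: it uses a single scalar weight in place of TensorFlow variables and is not how garage implements build().

```python
class ToyModel:
    """Toy stand-in for a model whose networks share parameters."""

    def __init__(self, name):
        self.name = name
        self._networks = {}
        self._parameters = None

    def build(self, *inputs, name=None):
        """Build a network on the given inputs under a unique name."""
        network_name = name or 'default'
        if network_name in self._networks:
            raise ValueError(
                'Network {} already exists!'.format(network_name))
        if self._parameters is None:
            # Parameters are created on the first build and then reused.
            self._parameters = {'w': 2.0}
        outputs = [self._parameters['w'] * x for x in inputs]
        self._networks[network_name] = outputs
        return outputs
```

Building twice under different names reuses the same weight, while building twice under the same name raises ValueError.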

network_input_spec(self)

Network input spec.

Returns

List of keys (str) for the network inputs.

Return type

list[str]

property parameters(self)

Parameters of the model.

Returns

Parameters of the model.

Return type

np.ndarray

property name(self)

Name (str) of the model.

This is also the variable scope of the model.

Returns

Name of the model.

Return type

str

property input(self)

Default input of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the input of the network.

Returns

Default input of the model.

Return type

tf.Tensor

property output(self)

Default output of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the output of the network.

Returns

Default output of the model.

Return type

tf.Tensor

property inputs(self)

Default inputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the inputs of the network.

Returns

Default inputs of the model.

Return type

list[tf.Tensor]

property outputs(self)

Default outputs of the model.

When the model is built the first time, by default it creates the ‘default’ network. This property creates a reference to the outputs of the network.

Returns

Default outputs of the model.

Return type

list[tf.Tensor]

reset(self, do_resets=None)

Reset the module.

This is effective only for recurrent modules. do_resets is effective only for vectorized modules.

For a vectorized module, do_resets is an array of booleans indicating which internal states to reset. The length of do_resets should equal the length of inputs.

Parameters

do_resets (numpy.ndarray) – Bool array indicating which states to reset.
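
A minimal sketch of the do_resets semantics for a vectorized module follows. The function name, the use of plain lists instead of numpy arrays, and the choice to reset every state when do_resets is None are all assumptions made for illustration.

```python
def reset_states(states, do_resets=None, initial_state=0.0):
    """Return a copy of states with the selected entries reset."""
    if do_resets is None:
        # Assumption: with no mask given, every state is reset.
        do_resets = [True] * len(states)
    if len(do_resets) != len(states):
        raise ValueError('do_resets must have one entry per state')
    return [
        initial_state if do_reset else state
        for state, do_reset in zip(states, do_resets)
    ]
```

Only the entries flagged True are replaced, so environments that are still mid-episode keep their internal state.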

property state_info_specs(self)

State info specification.

Returns

Keys and shapes for the information related to the module’s state when taking an action.

Return type

List[str]

property state_info_keys(self)

State info keys.

Returns

Keys for the information related to the module’s state when taking an input.

Return type

List[str]

terminate(self)

Clean up operation.

get_trainable_vars(self)

Get trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_global_vars(self)

Get global variables.

Returns

A list of global variables in the current variable scope.

Return type

List[tf.Variable]

get_regularizable_vars(self)

Get all network weight variables in the current scope.

Returns

A list of network weight variables in the current variable scope.

Return type

List[tf.Variable]

get_params(self)

Get the trainable variables.

Returns

A list of trainable variables in the current variable scope.

Return type

List[tf.Variable]

get_param_shapes(self)

Get parameter shapes.

Returns

A list of variable shapes.

Return type

List[tuple]

get_param_values(self)

Get param values.

Returns

Values of the parameters evaluated in the current session.

Return type

np.ndarray

set_param_values(self, param_values)

Set param values.

Parameters

param_values (np.ndarray) – A numpy array of parameter values.

flat_to_params(self, flattened_params)

Unflatten tensors according to their respective shapes.

Parameters

flattened_params (np.ndarray) – A numpy array of flattened params.

Returns

A list of parameters reshaped to the shapes specified.

Return type

List[np.ndarray]
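
The unflattening step can be sketched without TensorFlow: given one flat sequence of values and a list of shapes (such as those returned by get_param_shapes()), split the sequence into consecutive chunks and reshape each chunk. The helper names are hypothetical, nested lists stand in for numpy arrays, and row-major ordering is assumed.

```python
import math


def reshape(flat, shape):
    """Reshape a flat list into nested lists with the given shape."""
    if not shape:
        return flat[0]  # A scalar parameter.
    if len(shape) == 1:
        return list(flat)
    step = math.prod(shape[1:])
    return [
        reshape(flat[i * step:(i + 1) * step], shape[1:])
        for i in range(shape[0])
    ]


def unflatten(flattened_params, shapes):
    """Split a flat parameter list into one nested list per shape."""
    params, offset = [], 0
    for shape in shapes:
        size = math.prod(shape)
        params.append(reshape(flattened_params[offset:offset + size], shape))
        offset += size
    if offset != len(flattened_params):
        raise ValueError('shapes do not account for every parameter value')
    return params
```

For example, eight values with shapes (2, 3) and (2,) yield a 2 x 3 weight matrix followed by a length-2 bias vector.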