garage.sampler._dtypes

Datatypes used by multiple Samplers or Workers.

class InProgressEpisode(env, initial_observation=None, episode_info=None)

An in-progress episode.

Compared to EpisodeBatch, this datatype does less checking, contains only one episode, and uses lists instead of numpy arrays to make stepping faster.

Parameters
  • env (Environment) – The environment the episode is being collected in.

  • initial_observation (np.ndarray) – The first observation. If None, the environment will be reset to generate this observation.

  • episode_info (dict[str, np.ndarray]) – Info for this episode.

Raises

ValueError – If only one of initial_observation and episode_info is passed in. Either both or neither should be passed in.
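
The both-or-neither rule for the constructor arguments can be sketched as follows. This is a minimal illustration, not garage's implementation: ToyEnv and EpisodeSketch are hypothetical stand-ins, and the sketch assumes that reset() returns the first observation together with the episode info.

```python
class ToyEnv:
    """Hypothetical stand-in environment; not part of garage."""

    def reset(self):
        # Assumption: reset() yields (first observation, episode info).
        return 0.0, {"seed": 0}


class EpisodeSketch:
    """Illustrative stand-in for the constructor contract described above."""

    def __init__(self, env, initial_observation=None, episode_info=None):
        # Exactly one argument being None is an error: both or neither.
        if (initial_observation is None) != (episode_info is None):
            raise ValueError(
                "initial_observation and episode_info must be passed "
                "together or not at all")
        if initial_observation is None:
            # Neither was given, so reset the environment to obtain them.
            initial_observation, episode_info = env.reset()
        self.env = env
        self.episode_info = episode_info
        self.observations = [initial_observation]


fresh = EpisodeSketch(ToyEnv())                      # environment is reset
resumed = EpisodeSketch(ToyEnv(), 1.0, {"seed": 7})  # caller supplies both
```

Passing only initial_observation, or only episode_info, raises ValueError in this sketch.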

step(self, action, agent_info)

Step the episode using an action from an agent.

Parameters
  • action (np.ndarray) – The action taken by the agent.

  • agent_info (dict[str, np.ndarray]) – Extra agent information.

Returns

The new observation from the environment.

Return type

np.ndarray
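
The list-based stepping described above might look roughly like this. It is a sketch under stated assumptions rather than the library's code: CountingEnv and SteppingSketch are hypothetical, and the toy step() returns a plain (observation, reward) pair instead of whatever the real Environment returns.

```python
class CountingEnv:
    """Hypothetical environment: the observation is a running sum of actions."""

    def __init__(self):
        self.total = 0

    def reset(self):
        self.total = 0
        return self.total

    def step(self, action):
        self.total += action
        return self.total, float(action)  # (next observation, reward)


class SteppingSketch:
    """Illustrative episode that accumulates each step in Python lists."""

    def __init__(self, env):
        self.env = env
        self.observations = [env.reset()]
        self.actions = []
        self.rewards = []
        self.agent_infos = {}  # key -> list of per-step values

    def step(self, action, agent_info):
        obs, reward = self.env.step(action)
        # Appending to lists is cheap; no array is reallocated per step.
        self.actions.append(action)
        self.rewards.append(reward)
        for key, value in agent_info.items():
            self.agent_infos.setdefault(key, []).append(value)
        self.observations.append(obs)
        return obs

    @property
    def last_obs(self):
        return self.observations[-1]


episode = SteppingSketch(CountingEnv())
first = episode.step(2, {"logp": -0.5})
second = episode.step(3, {"logp": -0.1})
```

In this sketch the value returned by step() is always the same object that last_obs reports afterwards.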

to_batch(self)

Convert this in-progress episode into an EpisodeBatch.

Returns

This episode as a batch.

Return type

EpisodeBatch

Raises

AssertionError – If this episode contains no time steps.
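
The conversion from per-step lists to arrays can be illustrated with a small helper. This is a hedged sketch only: lists_to_batch and its dictionary keys are hypothetical, and a plain dict of numpy arrays stands in for a real EpisodeBatch.

```python
import numpy as np


def lists_to_batch(observations, actions, rewards):
    """Hypothetical helper: turn per-step lists into numpy arrays.

    `observations` holds one more entry than `actions` and `rewards`,
    because it also contains the observation seen after the final step.
    """
    # Mirror the documented behavior: an empty episode cannot be batched.
    assert len(rewards) > 0, "episode contains no time steps"
    return {
        "observations": np.asarray(observations[:-1]),
        "last_observation": np.asarray(observations[-1]),
        "actions": np.asarray(actions),
        "rewards": np.asarray(rewards),
        "lengths": np.asarray([len(rewards)]),
    }


batch = lists_to_batch(observations=[0, 2, 5], actions=[2, 3],
                       rewards=[2.0, 3.0])
```

Deferring the array construction to the end keeps each step cheap and pays the conversion cost once per episode.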

property last_obs(self)

np.ndarray: The last observation in the episode.