Multi-Task Soft Actor-Critic (MT-SAC)
|  |  |
|---|---|
| Action Space | Continuous |
| Framework(s) | PyTorch |
| API Reference | |
| Code | |
| Examples | mtsac_metaworld_ml1_pick_place, mtsac_metaworld_mt10, mtsac_metaworld_mt50 |
The Multi-Task Soft Actor-Critic (MT-SAC) algorithm is identical to the Soft Actor-Critic (SAC) algorithm except for one small change, called "disentangled alphas". Alpha is the entropy coefficient used to control the exploration of the agent/policy. Disentangling the alphas means maintaining a separate alpha coefficient for each task the policy learns. The alpha for a given task is selected using a one-hot encoding of the id assigned to that task.
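The per-task alpha selection can be sketched as follows. This is a minimal, illustrative example, not the library's implementation; the names `log_alphas`, `num_tasks`, and `task_id` are assumptions made for the sketch. Keeping the coefficients in log-space guarantees each alpha stays positive.

```python
import torch

# Hypothetical setup: three tasks, one learnable log-alpha per task,
# all starting from initial_log_entropy (0.0 in the defaults below).
num_tasks = 3
initial_log_entropy = 0.0
log_alphas = torch.nn.Parameter(
    torch.full((num_tasks,), initial_log_entropy))

# Each observation carries a one-hot encoding of its task id; selecting
# that task's alpha is a dot product with the one-hot vector.
task_id = torch.tensor(1)
one_hot = torch.nn.functional.one_hot(
    task_id, num_classes=num_tasks).float()
alpha = (log_alphas * one_hot).sum().exp()  # entropy coefficient for task 1
```

Because `log_alphas` is a single `Parameter` tensor, all per-task alphas can be updated by one optimizer while remaining independent of each other.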
Default Parameters
```python
initial_log_entropy=0.,
discount=0.99,
buffer_batch_size=64,
min_buffer_size=int(1e4),
target_update_tau=5e-3,
policy_lr=3e-4,
qf_lr=3e-4,
reward_scale=1.0,
optimizer=torch.optim.Adam,
steps_per_epoch=1,
num_evaluation_episodes=5,
use_deterministic_evaluation=True,
```
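The `target_update_tau` default above controls how fast the target Q-networks track the learned Q-networks via Polyak averaging, a standard piece of SAC. The sketch below shows that update in isolation, with illustrative module names rather than the library's internals.

```python
import torch

tau = 5e-3  # target_update_tau from the defaults above

# Stand-in Q-network and target Q-network (randomly, and differently,
# initialized linear layers for illustration).
qf = torch.nn.Linear(4, 1)
target_qf = torch.nn.Linear(4, 1)

# One soft update: target <- (1 - tau) * target + tau * learned.
old_target = [tp.detach().clone() for tp in target_qf.parameters()]
with torch.no_grad():
    for p, tp in zip(qf.parameters(), target_qf.parameters()):
        tp.mul_(1.0 - tau).add_(tau * p)
```

With tau this small, the target network changes slowly, which stabilizes the Q-learning targets.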