metarl.tf.policies.policy module
Base class for policies in TensorFlow.
-
class Policy(name, env_spec)
Bases: metarl.tf.models.module.Module
Base class for policies in TensorFlow.
Parameters:
- name (str) – Policy name, also used as the variable scope.
- env_spec (metarl.envs.env_spec.EnvSpec) – Environment specification.
-
action_space
Action space.
Returns: The action space of the environment.
Return type: akro.Space
-
env_spec
Policy environment specification.
Returns: Environment specification.
Return type: metarl.EnvSpec
-
get_action(observation)
Get action sampled from the policy.
Parameters: observation (np.ndarray) – Observation from the environment.
Returns: Action sampled from the policy.
Return type: np.ndarray
-
get_actions(observations)
Get actions sampled from the policy.
Parameters: observations (list[np.ndarray]) – Observations from the environment.
Returns: Actions sampled from the policy.
Return type: np.ndarray
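The difference between the single and batched calls can be sketched with a NumPy-only stand-in. This is not a metarl class: UniformRandomPolicy, its low/high/seed arguments, and the uniform sampling rule are hypothetical, it builds no TensorFlow graph, and it takes explicit bounds instead of an env_spec. It only mirrors the two method names and the return types documented above.

```python
import numpy as np


class UniformRandomPolicy:
    """Hypothetical stand-in that mirrors the documented method names.

    Not part of metarl: no TensorFlow graph, no env_spec. Actions are
    drawn uniformly between explicit bounds, ignoring the observation.
    """

    def __init__(self, name, low, high, seed=0):
        self.name = name
        self._low = np.asarray(low, dtype=np.float64)
        self._high = np.asarray(high, dtype=np.float64)
        self._rng = np.random.default_rng(seed)

    def get_action(self, observation):
        # One observation in, one action (np.ndarray) out.
        return self._rng.uniform(self._low, self._high)

    def get_actions(self, observations):
        # A list of observations in, one stacked array of actions out.
        return np.stack([self.get_action(obs) for obs in observations])


policy = UniformRandomPolicy('demo', low=[-1.0, -1.0], high=[1.0, 1.0])
single = policy.get_action(np.zeros(3))
batch = policy.get_actions([np.zeros(3), np.ones(3)])
```

Here single has the shape of one action, while batch has one row per observation. Concrete metarl policies may return additional information alongside the action; check the subclass documentation before relying on the exact return value.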
-
log_diagnostics(paths)
Log extra information per iteration based on the collected paths.
Parameters: paths (dict[numpy.ndarray]) – Sample paths.
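As a rough illustration of the kind of per-iteration summary such a hook might compute, the sketch below reduces a dictionary of arrays to a few scalars. The function name summarize_paths, the 'actions' key, and the statistic names are all assumptions made for this example; the documentation above only states that paths is a dict of arrays. The function returns the values instead of writing them to a logger.

```python
import numpy as np


def summarize_paths(paths):
    """Compute simple action statistics from sampled paths.

    Hypothetical helper: assumes `paths` has an 'actions' entry holding
    one action per row. That key is an assumption, not documented above.
    """
    actions = np.asarray(paths['actions'])
    return {
        'MeanAction': float(actions.mean()),
        'StdAction': float(actions.std()),
        'NumSamples': int(actions.shape[0]),
    }


paths = {'actions': np.array([[0.0, 1.0], [2.0, 3.0]])}
stats = summarize_paths(paths)
```

A subclass overriding log_diagnostics could compute values like these and pass them to whatever logger the training loop uses.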
-
observation_space
Observation space.
Returns: The observation space of the environment.
Return type: akro.Space
-
class StochasticPolicy(name, env_spec)
Bases: metarl.tf.policies.policy.Policy, metarl.tf.models.module.StochasticModule
Stochastic Policy.