metarl.np package
Reinforcement learning algorithms that use NumPy as a numerical backend.
paths_to_tensors(paths, max_path_length, baseline_predictions, discount)
Return processed sample data based on the collected paths.
Parameters:
Returns: Processed sample data, with the following keys:
- observations (numpy.ndarray): Padded array of the observations of the environment
- actions (numpy.ndarray): Padded array of the actions fed to the environment
- rewards (numpy.ndarray): Padded array of the acquired rewards
- agent_infos (dict): A dictionary of {stacked tensors or dictionary of stacked tensors}
- env_infos (dict): A dictionary of {stacked tensors or dictionary of stacked tensors}
- valids (numpy.ndarray): Padded array of the validity information
Return type: dict
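To illustrate what "padded array" and "validity information" can mean here, the following is a minimal sketch of padding variable-length paths to a common length with NumPy. It is not the metarl implementation: the helpers `pad_to_length` and `pad_paths`, the key name `valids` for the validity mask, and the layout of each path as a dict of arrays are all assumptions made for this example. It also leaves out `agent_infos`, `env_infos`, `baseline_predictions`, and `discount`.

```python
import numpy as np


def pad_to_length(array, max_length):
    """Zero-pad `array` along its first axis up to `max_length` rows.

    Hypothetical helper, not part of metarl.
    """
    array = np.asarray(array)
    pad_rows = max_length - array.shape[0]
    if pad_rows < 0:
        raise ValueError('path is longer than max_length')
    # Pad only the time axis; leave all feature axes untouched.
    pad_width = [(0, pad_rows)] + [(0, 0)] * (array.ndim - 1)
    return np.pad(array, pad_width, mode='constant')


def pad_paths(paths, max_path_length):
    """Stack variable-length paths into padded arrays plus a validity mask.

    Hypothetical helper; assumes each path is a dict with the keys
    'observations', 'actions' and 'rewards'.
    """
    observations = np.stack(
        [pad_to_length(p['observations'], max_path_length) for p in paths])
    actions = np.stack(
        [pad_to_length(p['actions'], max_path_length) for p in paths])
    rewards = np.stack(
        [pad_to_length(p['rewards'], max_path_length) for p in paths])
    # 1.0 marks a real time step, 0.0 marks padding.
    valids = np.stack([
        pad_to_length(np.ones(len(p['rewards'])), max_path_length)
        for p in paths
    ])
    return dict(observations=observations, actions=actions,
                rewards=rewards, valids=valids)


# Two made-up paths of lengths 2 and 3 with 2-dimensional observations.
paths = [
    dict(observations=np.ones((2, 2)), actions=np.zeros((2, 1)),
         rewards=np.array([1.0, 2.0])),
    dict(observations=np.ones((3, 2)), actions=np.zeros((3, 1)),
         rewards=np.array([1.0, 1.0, 1.0])),
]
samples = pad_paths(paths, max_path_length=4)
print(samples['observations'].shape)  # (2, 4, 2)
print(samples['valids'])
```

The mask lets downstream code ignore padded time steps, for example by multiplying per-step losses by `samples['valids']` before averaging.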