metarl.np.algos.nop module
NOP (no optimization performed) policy search algorithm.
-
class NOP
Bases: metarl.np.algos.rl_algorithm.RLAlgorithm
NOP (no optimization performed) policy search algorithm.
-
optimize_policy(paths)
Optimize the policy using the samples.
Parameters: paths (list[dict]) – A list of collected paths.
-
train(runner)
Obtain samplers and start the actual training for each epoch.
Parameters: runner (LocalRunner) – The runner, passed to give the algorithm access to runner.step_epochs(), which provides services such as snapshotting and sampler control.
-
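The interface described above can be illustrated with a minimal, self-contained sketch. It does not import metarl and is not the library's implementation: the `RLAlgorithm` base class and `FakeRunner` are hypothetical stand-ins written for this example, the only runner method used is the `step_epochs()` named in the documentation, and the epoch count returned by `train` is an assumption added so the behaviour is observable.

```python
import abc


class RLAlgorithm(abc.ABC):
    """Stand-in for metarl.np.algos.rl_algorithm.RLAlgorithm (assumed interface)."""

    @abc.abstractmethod
    def train(self, runner):
        """Run training using the services provided by the runner."""


class NOP(RLAlgorithm):
    """Sketch of a no-optimization algorithm: it accepts samples but updates nothing."""

    def optimize_policy(self, paths):
        # Intentionally ignores the collected paths; no policy update happens.
        return None

    def train(self, runner):
        # Walk through the epochs supplied by the runner without optimizing.
        # Returning the epoch count is an assumption made for this example.
        epochs_seen = 0
        for _ in runner.step_epochs():
            self.optimize_policy(paths=[])
            epochs_seen += 1
        return epochs_seen


class FakeRunner:
    """Hypothetical runner exposing only step_epochs(), for illustration."""

    def __init__(self, n_epochs):
        self.n_epochs = n_epochs

    def step_epochs(self):
        # Yield one value per epoch, mimicking an epoch-stepping generator.
        for epoch in range(self.n_epochs):
            yield epoch


algo = NOP()
epochs_run = algo.train(FakeRunner(n_epochs=3))
print(epochs_run)
```

An algorithm of this shape can serve as a placeholder when only the sampling and snapshotting machinery needs to be exercised, since the runner loop runs to completion while the policy stays unchanged.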