ray.rllib.evaluation.rollout_worker.RolloutWorker.sample_and_learn

RolloutWorker.sample_and_learn(expected_batch_size: int, num_sgd_iter: int, sgd_minibatch_size: str, standardize_fields: List[str]) → Tuple[dict, int]

Sample a batch and learn on it.

This is typically used in combination with distributed allreduce.

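Parameters:

expected_batch_size – The expected number of samples to learn on.
num_sgd_iter – The number of SGD iterations.
sgd_minibatch_size – The SGD minibatch size.
standardize_fields – The list of sample fields to normalize.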

Returns:

A tuple consisting of a dictionary of extra metadata returned from the policies' learn_on_batch() and the number of samples learned on.
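
To make the parameters and return value concrete, here is a minimal, library-free sketch of the flow this method describes: sample a batch, normalize the requested fields, run several SGD passes over minibatches, and return the last metadata dictionary together with the sample count. It is not RLlib's implementation. The names `standardize`, `sample_and_learn_sketch`, `sample_fn`, and `learn_fn` are hypothetical, and the batch-size check, the per-iteration shuffling, and the handling of a non-positive minibatch size are assumptions made for illustration.

```python
import random
import statistics
from typing import Callable, Dict, List, Tuple

Batch = Dict[str, List[float]]


def standardize(values: List[float]) -> List[float]:
    """Shift to zero mean and scale to (roughly) unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Guard against division by zero for constant fields (assumed epsilon).
    return [(v - mean) / max(std, 1e-4) for v in values]


def sample_and_learn_sketch(
    sample_fn: Callable[[], Batch],          # hypothetical stand-in for sampling
    learn_fn: Callable[[Batch], dict],       # hypothetical stand-in for learn_on_batch()
    expected_batch_size: int,
    num_sgd_iter: int,
    sgd_minibatch_size: int,
    standardize_fields: List[str],
    seed: int = 0,
) -> Tuple[dict, int]:
    batch = dict(sample_fn())
    count = len(next(iter(batch.values())))
    if count != expected_batch_size:
        raise ValueError(f"expected {expected_batch_size} samples, got {count}")

    # Normalize only the fields the caller asked for.
    for field in standardize_fields:
        batch[field] = standardize(batch[field])

    # Assumption: a non-positive minibatch size means "use the whole batch".
    mb_size = sgd_minibatch_size if sgd_minibatch_size > 0 else count
    rng = random.Random(seed)
    info: dict = {}
    for _ in range(num_sgd_iter):
        indices = list(range(count))
        rng.shuffle(indices)
        for start in range(0, count, mb_size):
            chosen = indices[start:start + mb_size]
            minibatch = {k: [v[i] for i in chosen] for k, v in batch.items()}
            info = learn_fn(minibatch)
    return info, count


# Toy usage: 8 samples, 2 SGD iterations, minibatches of 2.
calls: List[int] = []


def toy_sample() -> Batch:
    return {
        "obs": [float(i) for i in range(8)],
        "advantages": [float(i + 1) for i in range(8)],
    }


def toy_learn(minibatch: Batch) -> dict:
    calls.append(len(minibatch["advantages"]))
    return {"minibatch_size": len(minibatch["advantages"])}


info, num_learned = sample_and_learn_sketch(
    toy_sample, toy_learn,
    expected_batch_size=8,
    num_sgd_iter=2,
    sgd_minibatch_size=2,
    standardize_fields=["advantages"],
)
```

In this toy run the learner is called once per minibatch per iteration, and the second element of the returned tuple is the number of samples in the batch, mirroring the `Tuple[dict, int]` return type documented above.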