ray.rllib.evaluation.rollout_worker.RolloutWorker.sample_with_count

RolloutWorker.sample_with_count() → Tuple[SampleBatch | MultiAgentBatch | Dict[str, Any], int]

Same as sample(), but returns the count as a separate value.

Returns:

A batch of experiences (e.g., tensors) and the size of the collected batch.

import gymnasium as gym
from ray.rllib.algorithms.ppo.ppo_tf_policy import PPOTF1Policy
from ray.rllib.evaluation.rollout_worker import RolloutWorker

# Build a worker that samples CartPole-v1 with the TF1 PPO policy.
worker = RolloutWorker(
    env_creator=lambda _: gym.make("CartPole-v1"),
    default_policy_class=PPOTF1Policy,
)
# sample_with_count() returns a (batch, count) tuple.
print(worker.sample_with_count())
# Example output:
# (SampleBatch({"obs": [...], "action": [...], ...}), 3)
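
The return contract described above (a batch plus its size as a separate value) can be illustrated without Ray installed. The following is a minimal sketch, not RLlib's implementation: ToyBatch and ToyWorker are hypothetical stand-ins for SampleBatch and RolloutWorker, and it assumes the returned count is simply the batch's own count attribute.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyBatch:
    """Hypothetical stand-in for SampleBatch: column name -> list of values."""

    data: Dict[str, List[float]] = field(default_factory=dict)

    @property
    def count(self) -> int:
        # Number of timesteps = length of any column (0 if there are none).
        return len(next(iter(self.data.values()), []))


class ToyWorker:
    """Hypothetical stand-in for RolloutWorker; no environment is involved."""

    def sample(self) -> ToyBatch:
        # Fixed three-step batch, used for illustration only.
        return ToyBatch({"obs": [0.1, 0.2, 0.3], "actions": [0, 1, 0]})

    def sample_with_count(self) -> Tuple[ToyBatch, int]:
        # Assumption: the count is read from the batch returned by sample().
        batch = self.sample()
        return batch, batch.count


batch, count = ToyWorker().sample_with_count()
print(count)
```

Returning the count alongside the batch lets a caller track how many steps were collected without inspecting the batch itself.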