ray.rllib.utils.replay_buffers.replay_buffer.ReplayBuffer.sample

ReplayBuffer.sample(num_items: int | None = None, **kwargs) → SampleBatch | MultiAgentBatch | Dict[str, Any] | None

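Samples num_items items from this buffer.

What counts as an item depends on the buffer's storage_unit. The same item may appear more than once in a sample.

Examples of sampling results: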

1) If storage unit ‘timesteps’ has been chosen and batches of size 5 have been added, sample(5) will yield a concatenated batch of 5 timesteps, because each stored item is a single timestep.

2) If storage unit ‘sequences’ has been chosen and sequences of different lengths have been added, sample(5) will yield a concatenated batch with a number of timesteps equal to the sum of timesteps in the 5 sampled sequences.

3) If storage unit ‘episodes’ has been chosen and episodes of different lengths have been added, sample(5) will yield a concatenated batch with a number of timesteps equal to the sum of timesteps in the 5 sampled episodes.
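The three cases above can be sketched with a small pure-Python model. This is an illustrative stand-in only, not RLlib's implementation: ToyBuffer, concat, and num_timesteps are hypothetical names, batches are plain dicts of lists, and the sketch assumes that in ‘timesteps’ mode an added batch is split into single-timestep items while in the other modes each added batch is stored as one item.

```python
import random
from typing import Dict, List

# A "batch" here is just a dict of equally long columns (hypothetical stand-in
# for RLlib's SampleBatch).
Batch = Dict[str, List[int]]


def concat(items: List[Batch]) -> Batch:
    """Concatenate a list of batches column by column."""
    out: Batch = {}
    for item in items:
        for key, values in item.items():
            out.setdefault(key, []).extend(values)
    return out


def num_timesteps(batch: Batch) -> int:
    """Number of timesteps in a batch (length of its 't' column)."""
    return len(batch["t"])


class ToyBuffer:
    """Hypothetical toy model of storage-unit semantics, not RLlib's class."""

    def __init__(self, storage_unit: str, seed: int = 0) -> None:
        if storage_unit not in ("timesteps", "sequences", "episodes"):
            raise ValueError(f"unknown storage_unit: {storage_unit}")
        self.storage_unit = storage_unit
        self._items: List[Batch] = []
        self._rng = random.Random(seed)

    def add(self, batch: Batch) -> None:
        n = len(next(iter(batch.values())))
        if self.storage_unit == "timesteps":
            # Assumption: each timestep becomes its own stored item.
            for t in range(n):
                self._items.append({k: [v[t]] for k, v in batch.items()})
        else:
            # Toy simplification: the whole batch is one sequence or episode.
            self._items.append({k: list(v) for k, v in batch.items()})

    def sample(self, num_items: int) -> Batch:
        # Sampling with replacement, so the same item may be drawn twice.
        picks = [self._rng.choice(self._items) for _ in range(num_items)]
        return concat(picks)


# Case 1: three batches of 5 timesteps each, stored as 15 single-step items.
ts_buffer = ToyBuffer("timesteps")
for start in (0, 5, 10):
    ts_buffer.add({"t": list(range(start, start + 5))})
ts_sample = ts_buffer.sample(5)

# Case 3: episodes of lengths 2, 3 and 7, each stored as one item.
ep_buffer = ToyBuffer("episodes")
for length in (2, 3, 7):
    ep_buffer.add({"t": list(range(length))})
ep_sample = ep_buffer.sample(5)
```

In this model, ts_sample always holds exactly 5 timesteps, while the size of ep_sample is the sum of the lengths of the 5 drawn episodes and therefore varies with the draw.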

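Parameters:
  • num_items – Number of items to sample from this buffer.

  • **kwargs – Keyword arguments accepted for forward compatibility.

Returns:

Concatenated batch of items.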