ray.rllib.policy.sample_batch.SampleBatch.split_by_episode
- SampleBatch.split_by_episode(key: str | None = None) -> List[SampleBatch]
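Splits the batch by the eps_id column and returns a list of new batches. If eps_id is not present, splits by dones instead.
- Parameters:
key – If specified, overrides the default and splits by this key.
- Returns:
A list of batches, one per distinct episode.
- Raises:
KeyError – If neither the eps_id nor the dones column is present.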
from ray.rllib.policy.sample_batch import SampleBatch

# "eps_id" is present
batch = SampleBatch(
    {"a": [1, 2, 3], "eps_id": [0, 0, 1]})
print(batch.split_by_episode())

# "eps_id" not present, split by "dones" instead
batch = SampleBatch(
    {"a": [1, 2, 3, 4, 5], "dones": [0, 0, 1, 0, 1]})
print(batch.split_by_episode())

# The last episode is appended even if it does not end with done
batch = SampleBatch(
    {"a": [1, 2, 3, 4, 5], "dones": [0, 0, 1, 0, 0]})
print(batch.split_by_episode())

# No done flag at all: the whole batch is returned as a single episode
batch = SampleBatch(
    {"a": [1, 2, 3, 4, 5], "dones": [0, 0, 0, 0, 0]})
print(batch.split_by_episode())
[{"a": [1, 2], "eps_id": [0, 0]}, {"a": [3], "eps_id": [1]}]
[{"a": [1, 2, 3], "dones": [0, 0, 1]}, {"a": [4, 5], "dones": [0, 1]}]
[{"a": [1, 2, 3], "dones": [0, 0, 1]}, {"a": [4, 5], "dones": [0, 0]}]
[{"a": [1, 2, 3, 4, 5], "dones": [0, 0, 0, 0, 0]}]
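The splitting rule described above can be mimicked in plain Python, which may help when reasoning about where episode boundaries fall. The following is only a sketch on dicts of lists, not RLlib's implementation: the function name split_by_episode_sketch is hypothetical, and treating any key other than "eps_id" as a done-flag column is an assumption of this sketch.

```python
from typing import Dict, List, Optional


def split_by_episode_sketch(
    batch: Dict[str, list], key: Optional[str] = None
) -> List[Dict[str, list]]:
    """Hypothetical stand-in for the documented behaviour, on dicts of lists."""
    if key is None:
        # Prefer "eps_id"; fall back to "dones", as the description states.
        if "eps_id" in batch:
            key = "eps_id"
        elif "dones" in batch:
            key = "dones"
        else:
            raise KeyError("Neither 'eps_id' nor 'dones' is present.")

    column = batch[key]  # raises KeyError if an explicit key is missing
    n = len(column)
    bounds = []  # (start, end) index pairs, end exclusive
    start = 0
    if key == "eps_id":
        # A new episode starts wherever the episode id changes.
        for i in range(1, n):
            if column[i] != column[i - 1]:
                bounds.append((start, i))
                start = i
    else:
        # Assumption: any other key is a done flag that closes an episode.
        for i in range(n):
            if column[i]:
                bounds.append((start, i + 1))
                start = i + 1
    # Keep the trailing episode even if it does not end with a done flag.
    if start < n:
        bounds.append((start, n))
    return [{k: v[a:b] for k, v in batch.items()} for a, b in bounds]


print(split_by_episode_sketch({"a": [1, 2, 3], "eps_id": [0, 0, 1]}))
print(split_by_episode_sketch({"a": [1, 2, 3, 4, 5], "dones": [0, 0, 1, 0, 0]}))
```

Collecting (start, end) index pairs first and slicing every column once at the end keeps all columns aligned with each other.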