ray.rllib.policy.sample_batch.SampleBatch.split_by_episode

SampleBatch.split_by_episode(key: str | None = None) → List[SampleBatch]

Splits the batch by the eps_id column and returns a list of new batches. If eps_id is not present, splits by dones instead.

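Parameters: key – If specified, overrides the default and splits by this key.
Returns: A list of batches, one per distinct episode.
Raises: KeyError – If neither the eps_id nor the dones column is present.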

from ray.rllib.policy.sample_batch import SampleBatch
# "eps_id" is present
batch = SampleBatch(
    {"a": [1, 2, 3], "eps_id": [0, 0, 1]})
print(batch.split_by_episode())

# "eps_id" not present, split by "dones" instead
batch = SampleBatch(
    {"a": [1, 2, 3, 4, 5], "dones": [0, 0, 1, 0, 1]})
print(batch.split_by_episode())

# The last episode is appended even if it does not end with done
batch = SampleBatch(
    {"a": [1, 2, 3, 4, 5], "dones": [0, 0, 1, 0, 0]})
print(batch.split_by_episode())

# No done flag anywhere: the whole batch comes back as a single episode
batch = SampleBatch(
    {"a": [1, 2, 3, 4, 5], "dones": [0, 0, 0, 0, 0]})
print(batch.split_by_episode())

[{"a": [1, 2], "eps_id": [0, 0]}, {"a": [3], "eps_id": [1]}]
[{"a": [1, 2, 3], "dones": [0, 0, 1]}, {"a": [4, 5], "dones": [0, 1]}]
[{"a": [1, 2, 3], "dones": [0, 0, 1]}, {"a": [4, 5], "dones": [0, 0]}]
[{"a": [1, 2, 3, 4, 5], "dones": [0, 0, 0, 0, 0]}]
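
The splitting rule described above can be illustrated without Ray installed. The following is a minimal sketch on plain dictionaries of lists, not RLlib's actual implementation; the helper name `split_columns_by_episode` is hypothetical and exists only for this illustration. It assumes a new slice starts wherever eps_id changes, or right after each truthy dones entry when eps_id is absent.

```python
from typing import Dict, List, Sequence


def split_columns_by_episode(columns: Dict[str, Sequence]) -> List[Dict[str, list]]:
    """Hypothetical helper: mimics the documented splitting rule on plain dicts."""
    if "eps_id" in columns:
        ids = columns["eps_id"]
        # A new slice starts wherever the episode ID differs from the previous row.
        starts = [0] + [i for i in range(1, len(ids)) if ids[i] != ids[i - 1]]
    elif "dones" in columns:
        dones = columns["dones"]
        # A new slice starts right after each done flag (the last row is excluded
        # because nothing follows it).
        starts = [0] + [i + 1 for i, done in enumerate(dones[:-1]) if done]
    else:
        raise KeyError("Neither 'eps_id' nor 'dones' is present")

    # Number of rows, taken from any column (all columns are assumed equal length).
    n_rows = len(next(iter(columns.values())))
    bounds = starts + [n_rows]

    # Cut every column at the same boundaries; the trailing slice is always kept,
    # even if it does not end with a done flag.
    return [
        {name: list(values[lo:hi]) for name, values in columns.items()}
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]


print(split_columns_by_episode({"a": [1, 2, 3], "eps_id": [0, 0, 1]}))
print(split_columns_by_episode({"a": [1, 2, 3, 4, 5], "dones": [0, 0, 1, 0, 0]}))
```

On the inputs from the example above, this sketch yields the same groupings as the printed output, which makes it a convenient way to reason about where slice boundaries fall.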
