data.buffer.middleware.priority¶

priority¶

PriorityExperienceReplay¶

class ding.data.buffer.middleware.priority.PriorityExperienceReplay(buffer: Buffer, IS_weight: bool = True, priority_power_factor: float = 0.6, IS_weight_power_factor: float = 0.4, IS_weight_anneal_train_iter: int = 100000)[source]¶

Overview:: The middleware that implements priority experience replay (PER).

__init__(buffer: Buffer, IS_weight: bool = True, priority_power_factor: float = 0.6, IS_weight_power_factor: float = 0.4, IS_weight_anneal_train_iter: int = 100000) → None[source]¶

Arguments:

buffer (Buffer): The buffer to use PER.
IS_weight (bool): Whether use importance sampling or not.
priority_power_factor (float): The factor that adjust the sensitivity between the sampling probability and the priority level.
IS_weight_power_factor (float): The factor that adjust the sensitivity between the sample rarity and sampling probability in importance sampling.
IS_weight_anneal_train_iter (float): The factor that controls the increasing of IS_weight_power_factor during training.