framework.middleware.functional.enhancer

enhancer

reward_estimator

ding.framework.middleware.functional.enhancer.reward_estimator(cfg: EasyDict, reward_model: BaseRewardModel) -> Callable
Overview:

Estimate the reward of train_data using reward_model.

Arguments:
  • cfg (EasyDict): Config.

  • reward_model (BaseRewardModel): Reward model.
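
The signature suggests a factory that returns a callable middleware. The following is a minimal, self-contained sketch of that pattern and not the library's implementation. ToyRewardModel, toy_reward_estimator, the estimate method name, the in-place update, and the ctx.train_data attribute are all assumptions made for illustration.

```python
from types import SimpleNamespace
from typing import Callable, Dict, List


class ToyRewardModel:
    """Hypothetical stand-in for a BaseRewardModel subclass."""

    def estimate(self, data: List[Dict]) -> None:
        # Assumption: the reward model rewrites each reward in place.
        for item in data:
            item["reward"] = float(sum(item["obs"]))


def toy_reward_estimator(cfg, reward_model) -> Callable:
    """Hypothetical factory: builds a middleware that acts on a context."""

    def _enhance(ctx) -> None:
        # Assumption: the training batch lives on ctx.train_data.
        reward_model.estimate(ctx.train_data)

    return _enhance


cfg = SimpleNamespace()  # unused in this sketch
ctx = SimpleNamespace(
    train_data=[
        {"obs": [1, 2], "reward": 0.0},
        {"obs": [3, 4], "reward": 0.0},
    ]
)
middleware = toy_reward_estimator(cfg, ToyRewardModel())
middleware(ctx)
rewards = [item["reward"] for item in ctx.train_data]
```

Returning a closure keeps the configuration and the reward model bound once, so the pipeline only has to pass the context on each call.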

her_data_enhancer

ding.framework.middleware.functional.enhancer.her_data_enhancer(cfg: EasyDict, buffer_: Buffer, her_reward_model: HerRewardModel) -> Callable
Overview:

Fetch a batch of episodes from buffer_, then use her_reward_model to produce HER-processed episodes from the original episodes.

Arguments:
  • cfg (EasyDict): Config. If her_reward_model.episode_size is None, it must contain the key cfg.policy.learn.batch_size.

  • buffer_ (Buffer): Buffer to sample data from.

  • her_reward_model (HerRewardModel): Hindsight Experience Replay (HER) model which is used to process episodes.
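
The arguments above imply a fallback for the sample size: use her_reward_model.episode_size when it is set, otherwise cfg.policy.learn.batch_size. The sketch below illustrates that rule with stand-in objects only. ToyBuffer, ToyHerModel, toy_her_data_enhancer, the sample and estimate method names, the relabeling rule, and the ctx.train_data attribute are assumptions, and the real model may return data in a different shape.

```python
from types import SimpleNamespace
from typing import Callable, Dict, List, Optional

Episode = List[Dict]


class ToyBuffer:
    """Hypothetical stand-in for a Buffer holding whole episodes."""

    def __init__(self, episodes: List[Episode]) -> None:
        self._episodes = list(episodes)

    def sample(self, size: int) -> List[Episode]:
        if size > len(self._episodes):
            raise ValueError("not enough episodes in the buffer")
        return self._episodes[:size]


class ToyHerModel:
    """Hypothetical stand-in for HerRewardModel."""

    def __init__(self, episode_size: Optional[int] = None) -> None:
        self.episode_size = episode_size

    def estimate(self, episode: Episode) -> Episode:
        # Toy hindsight relabeling: treat the final state as the goal
        # and give reward 1.0 to the steps whose state equals it.
        goal = episode[-1]["state"]
        return [
            {"state": t["state"], "goal": goal,
             "reward": float(t["state"] == goal)}
            for t in episode
        ]


def toy_her_data_enhancer(cfg, buffer_, her_reward_model) -> Callable:
    def _fetch_and_enhance(ctx) -> None:
        size = her_reward_model.episode_size
        if size is None:
            # Fallback described in the Arguments list.
            size = cfg.policy.learn.batch_size
        episodes = buffer_.sample(size)
        ctx.train_data = [her_reward_model.estimate(e) for e in episodes]

    return _fetch_and_enhance


cfg = SimpleNamespace(
    policy=SimpleNamespace(learn=SimpleNamespace(batch_size=2))
)
episodes = [
    [{"state": s} for s in (0, 1, 2)],
    [{"state": s} for s in (5, 6)],
    [{"state": s} for s in (7, 8)],
]
ctx = SimpleNamespace(train_data=None)

# episode_size is None, so the batch size from cfg is used.
toy_her_data_enhancer(cfg, ToyBuffer(episodes), ToyHerModel())(ctx)
fallback_count = len(ctx.train_data)

# An explicit episode_size takes precedence over cfg.
toy_her_data_enhancer(cfg, ToyBuffer(episodes), ToyHerModel(episode_size=3))(ctx)
explicit_count = len(ctx.train_data)
```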