framework.middleware.functional.enhancer
enhancer
reward_estimator
- ding.framework.middleware.functional.enhancer.reward_estimator(cfg: EasyDict, reward_model: BaseRewardModel) -> Callable
- Overview:
Estimate the reward of train_data using reward_model.
- Arguments:
cfg (EasyDict): Config.
reward_model (BaseRewardModel): Reward model.
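The signature above returns a Callable, which suggests a factory that builds a middleware step. The following is a minimal sketch of that pattern, not the library's implementation. It assumes the reward model exposes an `estimate` method that rewrites rewards in place and that the context object carries the batch in `ctx.train_data`. `ToyRewardModel` and `reward_estimator_sketch` are hypothetical names introduced only for illustration, and `SimpleNamespace` stands in for the real context type.

```python
from types import SimpleNamespace
from typing import Callable, List


class ToyRewardModel:
    """Hypothetical stand-in for a BaseRewardModel subclass."""

    def estimate(self, data: List[dict]) -> None:
        # Assumption: rewards are overwritten in place on each transition.
        for item in data:
            item["reward"] = float(sum(item["obs"]))


def reward_estimator_sketch(cfg, reward_model) -> Callable:
    """Build a middleware step that re-estimates rewards for ctx.train_data."""

    def _enhance(ctx) -> None:
        reward_model.estimate(ctx.train_data)

    return _enhance


# Usage: the returned callable is applied to a context holding training data.
ctx = SimpleNamespace(
    train_data=[
        {"obs": [1.0, 2.0], "reward": 0.0},
        {"obs": [0.5, 0.5], "reward": 0.0},
    ]
)
middleware = reward_estimator_sketch(cfg=None, reward_model=ToyRewardModel())
middleware(ctx)
```

In this sketch `cfg` is accepted but unused; it is kept only so the factory mirrors the documented argument list.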
her_data_enhancer
- ding.framework.middleware.functional.enhancer.her_data_enhancer(cfg: EasyDict, buffer_: Buffer, her_reward_model: HerRewardModel) -> Callable
- Overview:
Sample a batch of episodes from buffer_, then use her_reward_model to produce HER-processed episodes from the original episodes.
- Arguments:
cfg (EasyDict): Config. Must contain cfg.policy.learn.batch_size if her_reward_model.episode_size is None.
buffer_ (Buffer): Buffer to sample data from.
her_reward_model (HerRewardModel): Hindsight Experience Replay (HER) model used to process episodes.
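The fallback described for cfg (use cfg.policy.learn.batch_size only when her_reward_model.episode_size is None) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the library's code: `ToyBuffer`, `ToyHerModel`, and `her_data_enhancer_sketch` are hypothetical, the buffer is assumed to offer `sample(size)`, the HER model is assumed to offer `estimate(episode)`, and writing the result to `ctx.train_data` is also an assumption. The toy relabelling rule (treat the final state of each episode as the goal) is a simplified form of hindsight relabelling chosen for brevity.

```python
import random
from types import SimpleNamespace
from typing import Callable, List, Optional


class ToyBuffer:
    """Hypothetical buffer holding whole episodes."""

    def __init__(self, episodes: List[list]) -> None:
        self._episodes = list(episodes)

    def sample(self, size: int) -> List[list]:
        return random.sample(self._episodes, size)


class ToyHerModel:
    """Hypothetical HER model: relabels each step with the episode's final state as goal."""

    def __init__(self, episode_size: Optional[int] = None) -> None:
        self.episode_size = episode_size

    def estimate(self, episode: List[dict]) -> List[dict]:
        goal = episode[-1]["state"]
        return [dict(step, goal=goal, reward=float(step["state"] == goal)) for step in episode]


def her_data_enhancer_sketch(cfg, buffer_, her_reward_model) -> Callable:
    def _fetch_and_enhance(ctx) -> None:
        # Prefer the model's own episode_size; fall back to the config value.
        size = her_reward_model.episode_size
        if size is None:
            size = cfg.policy.learn.batch_size
        episodes = buffer_.sample(size)
        ctx.train_data = [her_reward_model.estimate(e) for e in episodes]

    return _fetch_and_enhance


# SimpleNamespace stands in for EasyDict-style attribute access.
cfg = SimpleNamespace(policy=SimpleNamespace(learn=SimpleNamespace(batch_size=2)))
episodes = [[{"state": i}, {"state": i + 1}] for i in range(4)]

ctx = SimpleNamespace(train_data=None)
her_data_enhancer_sketch(cfg, ToyBuffer(episodes), ToyHerModel())(ctx)

ctx_fixed = SimpleNamespace(train_data=None)
her_data_enhancer_sketch(cfg, ToyBuffer(episodes), ToyHerModel(episode_size=3))(ctx_fixed)
```

Here `ctx` receives two episodes because `episode_size` is None and the config batch size applies, while `ctx_fixed` receives three because the model's own `episode_size` takes precedence.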