grl.datasets
QGPOD4RLDataset
- class grl.datasets.QGPOD4RLDataset(env_id)
- Overview:
Dataset for the QGPO algorithm. QGPO is trained with contrastive energy prediction, which requires both true actions and fake actions: true actions are sampled from the dataset, and fake actions are sampled from the action support generated by the behaviour policy.
- Interface:
__init__, __getitem__, __len__
QGPODataset
- class grl.datasets.QGPODataset
- Overview:
Dataset for the QGPO algorithm. QGPO is trained with contrastive energy prediction, which requires both true actions and fake actions: true actions are sampled from the dataset, and fake actions are sampled from the action support generated by the behaviour policy.
- Interface:
__init__, __getitem__, __len__
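The three-method interface listed above is the usual map-style dataset protocol. The following is a minimal sketch of that protocol for data containing true and fake actions, not the grl implementation: the class name `ToyContrastiveDataset`, the helper `build_toy_dataset`, the field names `s`, `a`, and `fake_a`, and the use of uniform noise in place of behaviour-policy samples are all assumptions made for illustration.

```python
import random


class ToyContrastiveDataset:
    """Hypothetical stand-in for a dataset of true and fake actions."""

    def __init__(self, states, actions, fake_actions):
        # Each state needs one true action and one list of fake actions.
        if not (len(states) == len(actions) == len(fake_actions)):
            raise ValueError("all fields must have the same length")
        self.states = states
        self.actions = actions
        self.fake_actions = fake_actions

    def __getitem__(self, index):
        # Return one sample as a dict; the key names are assumed, not grl's.
        return {
            "s": self.states[index],
            "a": self.actions[index],
            "fake_a": self.fake_actions[index],
        }

    def __len__(self):
        return len(self.states)


def build_toy_dataset(num_samples=4, num_fake=3, seed=0):
    """Fill the toy dataset with random 2-D states and 1-D actions."""
    rng = random.Random(seed)
    states = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(num_samples)]
    actions = [[rng.uniform(-1, 1)] for _ in range(num_samples)]
    # Placeholder for behaviour-policy samples: uniform noise in [-1, 1].
    fake_actions = [
        [[rng.uniform(-1, 1)] for _ in range(num_fake)]
        for _ in range(num_samples)
    ]
    return ToyContrastiveDataset(states, actions, fake_actions)
```

Because the class defines `__getitem__` and `__len__`, it can be indexed and measured like a list (`ds[0]`, `len(ds)`), which is what generic data loaders typically rely on.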
GPD4RLDataset
GPDataset
- class grl.datasets.GPDataset
- Overview:
Dataset for the Generative Policy algorithm. Training sometimes requires both true actions and fake actions: true actions are sampled from the dataset, and fake actions are sampled from the behaviour policy as a form of data augmentation.
- Interface:
__init__, __getitem__, __len__
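The augmentation step described in the overview can be sketched as follows. This is an illustrative assumption rather than grl code: `augment_with_fake_actions` and `toy_behaviour_policy` are hypothetical names, and the toy policy (mean of the state plus Gaussian noise, clipped to [-1, 1]) simply stands in for a trained behaviour policy.

```python
import random


def augment_with_fake_actions(states, behaviour_policy, num_fake, seed=0):
    """Draw `num_fake` fake actions per state from a behaviour policy.

    `behaviour_policy` is any callable taking (state, rng) and returning
    one sampled action; this signature is assumed for the sketch.
    """
    rng = random.Random(seed)
    return [
        [behaviour_policy(state, rng) for _ in range(num_fake)]
        for state in states
    ]


def toy_behaviour_policy(state, rng):
    # Hypothetical policy: state mean plus small Gaussian noise, clipped.
    mean = sum(state) / len(state)
    return max(-1.0, min(1.0, mean + rng.gauss(0.0, 0.1)))
```

The resulting nested list has one entry per state, each holding `num_fake` sampled actions, and could be stored alongside the true actions taken from the dataset.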