lightrft.trainer.replay_buffer_vl
- class lightrft.trainer.replay_buffer_vl.NaiveReplayBufferVL(sample_batch_size: int, limit: int = 0, cpu_offload: bool = True, packing_samples: bool = False)
Bases: ABC
Naive replay buffer class for Vision-Language models. It stores experience samples.
- Parameters:
sample_batch_size (int) – Batch size when sampling (train_micro_batch_size).
limit (int) – Maximum number of experience samples to keep. A value <= 0 means unlimited. Defaults to 0.
cpu_offload (bool) – Whether to offload experience to CPU when sampling. Defaults to True.
packing_samples (bool) – Whether to use the packed samples format. Defaults to False.
- append(experience: ExperienceVL) → None
Append experience to the replay buffer.
- Parameters:
experience (ExperienceVL) – Experience batch to append.
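The interaction between append, limit, and sample_batch_size can be illustrated with a minimal sketch. ToyReplayBuffer below is a hypothetical stand-in written for this example, not the lightrft class. It uses plain Python values instead of ExperienceVL objects, and it assumes that the oldest items are discarded first once a positive limit is exceeded and that sampling is uniform without replacement. Neither assumption is confirmed by this page.

```python
import random
from typing import Any, List


class ToyReplayBuffer:
    """Hypothetical stand-in for illustration only; not the lightrft class."""

    def __init__(self, sample_batch_size: int, limit: int = 0) -> None:
        self.sample_batch_size = sample_batch_size
        self.limit = limit  # a value <= 0 means unlimited
        self.items: List[Any] = []

    def append(self, batch: List[Any]) -> None:
        # Store every sample of the incoming batch as its own buffer item.
        self.items.extend(batch)
        # Assumption: once a positive limit is exceeded, the oldest items are dropped.
        if self.limit > 0 and len(self.items) > self.limit:
            self.items = self.items[-self.limit:]

    def sample(self) -> List[Any]:
        # Assumption: draw sample_batch_size distinct items uniformly at random.
        return random.sample(self.items, self.sample_batch_size)


bounded = ToyReplayBuffer(sample_batch_size=2, limit=3)
bounded.append([1, 2])
bounded.append([3, 4])  # the buffer now holds 4 items, so the oldest one is dropped

unbounded = ToyReplayBuffer(sample_batch_size=2, limit=0)
unbounded.append([1, 2, 3, 4, 5])  # limit <= 0, so nothing is dropped

batch = bounded.sample()
```

In this sketch the bounded buffer keeps only the three most recent items, while the unbounded one keeps all five.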
- collate_fn(batch) → ExperienceVL
Collate function for DataLoader.
- Parameters:
batch (List[BufferItemVL]) – Batch of buffer items.
- Returns:
Batched experience.
- Return type:
ExperienceVL
- normalize(attribute: str, strategy) → None
Normalize a specified attribute across all items in the buffer.
This method computes the mean and standard deviation of the specified attribute across all items and normalizes them. Currently only supports “advantages”.
- Parameters:
attribute (str) – Name of the attribute to normalize (currently only “advantages” is supported).
strategy (Strategy) – Distributed training strategy for all_reduce operations.
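The normalization described above can be sketched in plain Python for a single process. The function name normalize_advantages, the eps floor on the variance, and the use of the population variance are assumptions made for this example, not the library's implementation. In a distributed run, the sums and counts would presumably be combined across ranks through the strategy's all_reduce before the mean and standard deviation are computed.

```python
import math
from typing import List


def normalize_advantages(
    advantages: List[List[float]], eps: float = 1e-8
) -> List[List[float]]:
    """Normalize per-item advantage lists with one global mean and std.

    Hypothetical helper for illustration; items may have different lengths.
    """
    # Flatten so the statistics are taken over every value in the buffer.
    flat = [a for item in advantages for a in item]
    count = len(flat)
    mean = sum(flat) / count
    # Population variance, floored at eps to avoid division by zero (assumption).
    var = sum((a - mean) ** 2 for a in flat) / count
    rstd = 1.0 / math.sqrt(max(var, eps))
    # Apply the same shift and scale to every item, preserving item boundaries.
    return [[(a - mean) * rstd for a in item] for item in advantages]


normalized = normalize_advantages([[1.0, 2.0], [3.0]])
```

After the call, the values taken together have approximately zero mean and unit variance, and each item keeps its original length.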
- sample() → ExperienceVL
Sample a batch of experiences from the replay buffer.
- Returns:
Batch of sampled experiences.
- Return type:
ExperienceVL