lightrft.trainer.replay_buffer_vl

class lightrft.trainer.replay_buffer_vl.NaiveReplayBufferVL(sample_batch_size: int, limit: int = 0, cpu_offload: bool = True, packing_samples: bool = False)

Bases: ABC

Naive replay buffer class for Vision-Language models. It stores experience samples.

Parameters:
  • sample_batch_size (int) – Batch size when sampling (train_micro_batch_size).

  • limit (int) – Maximum number of experience samples kept in the buffer. A value <= 0 means unlimited. Defaults to 0.

  • cpu_offload (bool) – Whether to offload experience to CPU when sampling, defaults to True.

  • packing_samples (bool) – Whether to use packed samples format, defaults to False.

append(experience: ExperienceVL) → None

Append experience to the replay buffer.

Parameters:

experience (ExperienceVL) – Experience batch to append.
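
The interaction between append and the limit parameter can be illustrated with a minimal sketch. ToyBuffer below is a hypothetical stand-in, not the lightrft class, and dropping the oldest items first once the limit is exceeded is an assumption; the reference above only states that a value <= 0 means unlimited.

```python
from typing import Any, List


class ToyBuffer:
    """Hypothetical stand-in for a bounded replay buffer (not the lightrft class)."""

    def __init__(self, limit: int = 0) -> None:
        self.limit = limit
        self.items: List[Any] = []

    def append(self, batch: List[Any]) -> None:
        # Store every sample of the incoming batch as its own buffer item.
        self.items.extend(batch)
        # Assumption: once a positive limit is exceeded, the oldest items are dropped.
        if self.limit > 0 and len(self.items) > self.limit:
            self.items = self.items[-self.limit:]


bounded = ToyBuffer(limit=3)
bounded.append(["a", "b"])
bounded.append(["c", "d"])  # exceeds the limit, so "a" is evicted in this sketch

unbounded = ToyBuffer(limit=0)  # limit <= 0 keeps everything
unbounded.append(["a", "b", "c", "d"])
```

In this sketch the bounded buffer ends up holding the three most recent samples, while the unbounded one keeps all four.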

clear() → None

Clear all items from the replay buffer.

collate_fn(batch) → ExperienceVL

Collate function for DataLoader.

Parameters:

batch (List[BufferItemVL]) – Batch of buffer items.

Returns:

Batched experience.

Return type:

ExperienceVL
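
As a rough illustration of what collating per-sample items into one batch can look like, the sketch below right-pads variable-length sequences and builds a matching mask. ToyItem, ToyBatch, toy_collate and pad_id are hypothetical names, and the padding scheme is an assumption; the actual fields of BufferItemVL and ExperienceVL, and the behaviour with packing_samples enabled, are not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyItem:
    """Hypothetical per-sample item (not BufferItemVL)."""
    sequence: List[int]
    advantage: float


@dataclass
class ToyBatch:
    """Hypothetical batched container (not ExperienceVL)."""
    sequences: List[List[int]]
    attention_mask: List[List[int]]
    advantages: List[float]


def toy_collate(batch: List[ToyItem], pad_id: int = 0) -> ToyBatch:
    # Pad every sequence on the right up to the longest one in the batch.
    max_len = max(len(item.sequence) for item in batch)
    sequences, masks = [], []
    for item in batch:
        pad = max_len - len(item.sequence)
        sequences.append(item.sequence + [pad_id] * pad)
        # 1 marks a real token, 0 marks padding.
        masks.append([1] * len(item.sequence) + [0] * pad)
    return ToyBatch(sequences, masks, [item.advantage for item in batch])


toy_batch = toy_collate([ToyItem([5, 6, 7], 0.5), ToyItem([8], -1.0)])
```

A function of this shape is what a DataLoader expects for its collate_fn argument: it receives a list of items and returns one batched object.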

normalize(attribute: str, strategy) → None

Normalize a specified attribute across all items in the buffer.

This method computes the mean and standard deviation of the specified attribute across all items in the buffer, then normalizes each item's values with those statistics. Currently only “advantages” is supported.

Parameters:
  • attribute (str) – Name of the attribute to normalize (currently only “advantages” is supported).

  • strategy (Strategy) – Distributed training strategy for all_reduce operations.
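
The normalization described above can be sketched for a single process as follows. This is a minimal illustration, not the lightrft implementation: normalize_advantages and eps are hypothetical names, plain Python lists stand in for tensors, and the distributed step (summing statistics across ranks with all_reduce through the strategy) is omitted.

```python
import math
from typing import List


def normalize_advantages(per_item: List[List[float]], eps: float = 1e-8) -> List[List[float]]:
    # Pool the values from every item so the statistics are buffer-wide.
    flat = [v for item in per_item for v in item]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    # Clamp the variance so a constant buffer does not divide by zero.
    rstd = 1.0 / math.sqrt(max(var, eps))
    # Apply the shared mean and std while keeping each item's own length.
    return [[(v - mean) * rstd for v in item] for item in per_item]


normalized = normalize_advantages([[1.0, 2.0], [3.0, 4.0, 5.0]])
```

After this step the pooled values have approximately zero mean and unit variance, while each item keeps its original number of entries.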

sample() → ExperienceVL

Sample a batch of experiences from the replay buffer.

Returns:

Batch of sampled experiences.

Return type:

ExperienceVL
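
A minimal sketch of drawing one batch from stored items is shown below. toy_sample and seed are hypothetical names, and uniform sampling without replacement is an assumption; the reference above only states that sample returns a batch of sample_batch_size experiences.

```python
import random
from typing import Any, List


def toy_sample(items: List[Any], sample_batch_size: int, seed: int = 0) -> List[Any]:
    # Assumption: items are drawn uniformly at random without replacement.
    rng = random.Random(seed)
    return rng.sample(items, sample_batch_size)


picked = toy_sample(list(range(10)), sample_batch_size=4)
```

In the real buffer the drawn items would then be collated into a single ExperienceVL batch.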