lightrft.datasets.grm_dataset
- class lightrft.datasets.grm_dataset.GRMDataset(*args: Any, **kwargs: Any)
Bases: Dataset

Dataset for Generative Reward Model (GRM) training.
GRMDataset supports multiple data sources through pluggable Data Handlers and covers both understanding tasks (image-to-text, video-to-text) and generation tasks (text-to-image, text-to-video).
- Parameters:
  - dataset_paths (List[str]) – List of dataset file paths or directories, in the format source:path, where the handler is determined by the source keyword such as hpdv3, imagegen-cot-reward, or omnirewardbench.
  - processor (transformers.AutoProcessor) – Multimodal processor used for tokenization and visual processing.
  - tokenizer (transformers.AutoTokenizer) – Tokenizer used for text tokenization (provides eos_token and pad_token_id attributes).
  - strategy (Any) – Optional data loading strategy.
  - max_length (int) – Maximum sequence length for tokenization/truncation.
  - config (Dict[str, Any]) – Additional configuration options. Supported keys include:
    - task_instruction (str): Instruction for the evaluation task.
    - system_prompt_template (str): Template for the system prompt with a {prompt} placeholder.
  - is_training (bool) – Whether the dataset is used for training (returns labels) or evaluation (no labels returned).
Example:
dataset = GRMDataset(
    ['imagegen-cot-reward-5k:/data/imagegen-cot-reward-5k/train.json'],
    processor=proc,
    tokenizer=tok,
    max_length=4096,
    is_training=True,
)
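To illustrate the source:path convention used by dataset_paths, here is a minimal sketch of how such an entry can be split into its handler keyword and file path. The helper name split_dataset_path is hypothetical and is not part of GRMDataset's API; the actual parsing inside the library may differ.

```python
def split_dataset_path(entry: str) -> tuple:
    """Split a 'source:path' entry into (source, path).

    Hypothetical helper for illustration only. Splitting on the first
    ':' keeps absolute paths (which contain no further prefix) intact.
    """
    source, path = entry.split(":", 1)
    return source, path


# Each source keyword selects the data handler for that file.
entries = [
    "hpdv3:/data/hpdv3/train.json",
    "imagegen-cot-reward-5k:/data/imagegen-cot-reward-5k/train.json",
]
pairs = [split_dataset_path(e) for e in entries]
```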
- collate_fn(batch: List[Tuple]) → Tuple | None
Collate a batch of items into a single batch for model processing.
- Parameters:
batch (List[Tuple]) – A list of items returned by __getitem__
- Returns:
A tuple containing batched input_ids, attention_mask, pixel_values, grid_sizes, labels, and extras.
- Return type:
Optional[Tuple]
Example:
batch = dataset.collate_fn([dataset[i] for i in range(4)])
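The core of any such collate function is padding variable-length sequences to a common length while building a matching attention mask. The sketch below shows only that padding step on plain Python lists; it is an illustration, not GRMDataset's implementation, which also batches pixel_values, grid_sizes, labels, and extras as tensors.

```python
def pad_batch(input_ids_list, pad_token_id=0):
    """Pad token-id sequences to the batch maximum length.

    Illustrative stand-in for the padding done inside a collate_fn;
    returns (input_ids, attention_mask) as nested lists, where mask
    entries are 1 for real tokens and 0 for padding.
    """
    max_len = max(len(ids) for ids in input_ids_list)
    input_ids, attention_mask = [], []
    for ids in input_ids_list:
        n_pad = max_len - len(ids)
        input_ids.append(list(ids) + [pad_token_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return input_ids, attention_mask
```

In practice the padded lists would be converted to tensors and the pad_token_id would come from the tokenizer passed to the dataset.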