lightrft.utils.remote_rm_utils
- lightrft.utils.remote_rm_utils.remote_rm_fn(api_url: str, queries: List[str], prompts: List[str], labels: List[Any] | None = None, references: List[str] | None = None, raw_images: List[Image | List[Image] | None] | None = None, score_key: str = 'rewards') → torch.Tensor
Remote reward model API function for scoring text and image inputs.
This function prepares data and sends requests to a remote reward model API, supporting both text-only and multimodal (text + image) scoring scenarios.
- Parameters:
api_url (str) – Reward model API endpoint URL
queries (List[str]) – List of query strings with response templates
prompts (List[str]) – List of prompt strings for context
labels (Optional[List[Any]]) – Optional list of labels for supervised scoring (currently unused)
references (Optional[List[str]]) – Optional list of reference responses for comparison scoring
raw_images (Optional[List[Optional[Union[Image.Image, List[Image.Image]]]]]) – Optional list of PIL Image objects or lists of PIL Image objects for multimodal scoring. Each element can be None, a single image, or a list of images.
score_key (str) – Key in the API response that contains the reward scores
- Returns:
Tensor of reward scores for all input samples
- Return type:
torch.Tensor
- Raises:
Exception – When API requests fail after maximum retry attempts
Example:
from PIL import Image
from lightrft.utils.remote_rm_utils import remote_rm_fn

# Text-only scoring
scores = remote_rm_fn(
    api_url="http://localhost:8000/score",
    queries=["What is 2+2?"],
    prompts=["Calculate the following:"],
    score_key="rewards",
)

# Multimodal scoring with images
scores = remote_rm_fn(
    api_url="http://localhost:8000/score",
    queries=["Describe this image"],
    prompts=["Please analyze the image:"],
    raw_images=[Image.open("image.jpg")],
    score_key="rewards",
)
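Each entry of raw_images may be None, a single image, or a list of images, so callers assembling this argument from mixed data may want to normalize every entry to a list first. The helper below is a minimal sketch of that idea, not part of lightrft: the name normalize_images and the num_samples argument are hypothetical, and it assumes any value that is neither None nor a list/tuple is a single image.

```python
from typing import Any, List, Optional, Sequence


def normalize_images(
    raw_images: Optional[Sequence[Any]], num_samples: int
) -> List[List[Any]]:
    """Return one list of images per sample (hypothetical helper, not a lightrft API)."""
    if raw_images is None:
        # No images supplied at all: every sample is text-only.
        return [[] for _ in range(num_samples)]
    if len(raw_images) != num_samples:
        raise ValueError(
            f"expected {num_samples} image entries, got {len(raw_images)}"
        )
    normalized: List[List[Any]] = []
    for item in raw_images:
        if item is None:
            normalized.append([])          # text-only sample
        elif isinstance(item, (list, tuple)):
            normalized.append(list(item))  # several images for one sample
        else:
            normalized.append([item])      # assumed to be a single image
    return normalized
```

With entries in this uniform shape, a text-only sample and a multi-image sample can be handled by the same loop before the batch is passed on.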
- lightrft.utils.remote_rm_utils.request_api_wrapper(url: str, data: Dict[str, Any], score_key: str = 'rewards', try_max_times: int = 5) → float | List[float]
Synchronous request API wrapper for reward model scoring.
This function makes HTTP POST requests to a reward model API endpoint and handles retries with exponential backoff for failed requests.
- Parameters:
url (str) – The API endpoint URL to send requests to
data (Dict[str, Any]) – The request payload data as a dictionary
score_key (str) – The key in the response JSON that contains the reward scores
try_max_times (int) – Maximum number of retry attempts for failed requests
- Returns:
Reward scores extracted from the API response, either as a single float or a list of floats depending on the API response structure
- Return type:
Union[float, List[float]]
- Raises:
Exception – When the request still fails after try_max_times attempts
Example:
score = request_api_wrapper(
    url="http://localhost:8000/score",
    data={"text": "Hello world"},
    score_key="rewards",
    try_max_times=5,
)
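The description of request_api_wrapper mentions retries with exponential backoff and a score_key lookup in the response JSON. The sketch below illustrates that general pattern only; it is not lightrft's implementation. The names post_with_backoff, send, base_delay, and sleep are hypothetical, the delay schedule (base_delay doubled after each failure) is an assumption, and the sender is injected so the example runs without a network.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Union

Score = Union[float, List[float]]


def post_with_backoff(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    data: Dict[str, Any],
    score_key: str = "rewards",
    try_max_times: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Score:
    """Call send(data) until it succeeds or try_max_times attempts are used up.

    Hypothetical sketch: `send` stands in for an HTTP POST that returns
    the decoded JSON body as a dict.
    """
    last_error: Optional[Exception] = None
    for attempt in range(try_max_times):
        try:
            response = send(data)
            # A missing score_key raises KeyError and is treated as a failure.
            return response[score_key]
        except Exception as exc:
            last_error = exc
            if attempt < try_max_times - 1:
                # Assumed schedule: base_delay, 2*base_delay, 4*base_delay, ...
                sleep(base_delay * (2 ** attempt))
    raise Exception(
        f"Request failed after {try_max_times} attempts"
    ) from last_error
```

Assuming the requests library is installed, a real sender could be something like `lambda payload: requests.post(url, json=payload, timeout=30).json()`. Injecting the sender and the sleep function keeps the retry logic testable on its own.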