league.payoff¶
shared_payoff¶
BattleRecordDict¶
BattleSharedPayoff¶
- class ding.league.shared_payoff.BattleSharedPayoff(cfg: EasyDict)[source]¶
- Overview:
Payoff data structure that records historical match results. It is shared among all players and uses LockContext to ensure thread safety, since players from all threads can access and modify it.
- Interface:
__getitem__, add_player, update, get_key
- Property:
players
- __getitem__(players: tuple) → ndarray[source]¶
- Overview:
Get win rates between home players and away players one by one
- Arguments:
players (tuple): A tuple of (home, away), where each element is a player or a list of players.
- Returns:
win_rates (np.ndarray): Win rates (squeezed, see Shape for more details) between each home player and each away player.
- Shape:
win_rates: Assume there are m home players and n away players (m, n > 0).
m != 1 and n != 1: shape is (m, n)
m == 1: shape is (n)
n == 1: shape is (m)
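The shape rule above can be sketched in plain Python. This is only an illustration of the documented rule, not the library's code: `win_rate_shape` is a hypothetical helper, and treating m == n == 1 as shape (1,) is an assumption that follows from applying the m == 1 rule first.

```python
from typing import Tuple


def win_rate_shape(m: int, n: int) -> Tuple[int, ...]:
    """Return the documented shape of win_rates for m home and n away players.

    Hypothetical helper: it mirrors the Shape rule in the docs and does not
    touch any real payoff object.
    """
    if m < 1 or n < 1:
        raise ValueError("m and n must both be positive")
    if m == 1:
        return (n,)   # a single home player: one entry per away player
    if n == 1:
        return (m,)   # a single away player: one entry per home player
    return (m, n)     # general case: full home-by-away table


print(win_rate_shape(2, 3))  # (2, 3)
print(win_rate_shape(1, 3))  # (3,)
print(win_rate_shape(2, 1))  # (2,)
```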
- add_player(player: Player) → None[source]¶
- Overview:
Add a player to the shared payoff.
- Arguments:
player (Player): The player to be added. Usually it is also new to the league.
- get_key(home: str, away: str) → Tuple[str, bool][source]¶
- Overview:
Join the home player id and the away player id in alphabetical order.
- Arguments:
home (str): Home player id.
away (str): Away player id.
- Returns:
key (str): The two ids sorted in alphabetical order and joined by '-'.
reverse (bool): Whether the two player ids are reordered.
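A minimal sketch of the key construction described above, assuming plain string comparison is what "alphabetical order" means here. `sketch_get_key` and the player ids are hypothetical names used for illustration; this is not the library's implementation.

```python
from typing import Tuple


def sketch_get_key(home: str, away: str) -> Tuple[str, bool]:
    """Join two player ids in sorted order and report whether they were swapped."""
    # Assumption: ordering is Python's default string comparison.
    reverse = home > away
    first, second = (away, home) if reverse else (home, away)
    return f"{first}-{second}", reverse


print(sketch_get_key("main_player_0", "exploiter_1"))  # ('exploiter_1-main_player_0', True)
print(sketch_get_key("exploiter_1", "main_player_0"))  # ('exploiter_1-main_player_0', False)
```

Both call orders yield the same key, so a single record can serve a pair of players regardless of which one is the home side; the reverse flag tells the caller whether the stored result has to be read from the opposite perspective.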
- update(job_info: dict) → bool[source]¶
- Overview:
Update the payoff with job_info when a job is about to finish. Return True if the update succeeds; if an exception is raised during the update, handle it and return False.
- Arguments:
job_info (dict): A dict containing job result information.
- Returns:
result (bool): Whether the update is successful.
Note
job_info has at least 5 keys: ['launch_player', 'player_id', 'env_num', 'episode_num', 'result']. The value of key player_id is a tuple of (home_id, away_id). The value of key result is a two-layer list whose outer length is episode_num and whose inner length is env_num.
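To make the layout in the note concrete, here is a hedged sketch of a job_info dict together with a small structural check. `check_job_info`, the player ids, and the result labels ("wins", "losses") are assumptions made for illustration; the note only fixes the key names and the nesting of result.

```python
REQUIRED_KEYS = ("launch_player", "player_id", "env_num", "episode_num", "result")


def check_job_info(job_info: dict) -> bool:
    """Hypothetical validator for the structure described in the note."""
    if any(key not in job_info for key in REQUIRED_KEYS):
        return False
    if len(job_info["player_id"]) != 2:  # expected (home_id, away_id)
        return False
    result = job_info["result"]
    if len(result) != job_info["episode_num"]:  # outer layer: one entry per episode
        return False
    # inner layer: one entry per environment
    return all(len(row) == job_info["env_num"] for row in result)


job_info = {
    "launch_player": "main_player_0",             # assumed: id of the launching player
    "player_id": ("main_player_0", "exploiter_1"),
    "env_num": 2,
    "episode_num": 3,
    # Placeholder outcome labels; the real value set is not stated in the note.
    "result": [["wins", "losses"], ["wins", "wins"], ["losses", "wins"]],
}

print(check_job_info(job_info))  # True
```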
create_payoff¶
- Overview:
Create a new payoff instance for the given key (payoff type). The supported keys are ['solo', 'battle']; if the key is not found in payoff_mapping, raise a KeyError.
- Arguments:
cfg (EasyDict): Payoff config containing at least one key, 'type'.
- Returns:
payoff (BattleSharedPayoff or SoloSharedPayoff): The newly created payoff, which should be an instance of one of payoff_mapping's values.
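The mapping-based dispatch described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the two stub classes stand in for the real payoff classes, `create_payoff_sketch` is a hypothetical name, and `SimpleNamespace` stands in for an EasyDict so the example needs only the standard library.

```python
from types import SimpleNamespace


class SoloPayoffStub:
    """Stand-in for a solo payoff class (hypothetical)."""

    def __init__(self, cfg):
        self.cfg = cfg


class BattlePayoffStub:
    """Stand-in for a battle payoff class (hypothetical)."""

    def __init__(self, cfg):
        self.cfg = cfg


# Assumed layout: payoff type string -> payoff class.
payoff_mapping = {"solo": SoloPayoffStub, "battle": BattlePayoffStub}


def create_payoff_sketch(cfg):
    """Look up cfg.type in payoff_mapping and instantiate the matching class."""
    if cfg.type not in payoff_mapping:
        raise KeyError(f"unsupported payoff type: {cfg.type}")
    return payoff_mapping[cfg.type](cfg)


payoff = create_payoff_sketch(SimpleNamespace(type="battle"))
print(type(payoff).__name__)  # BattlePayoffStub
```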