Tree Search
- class lzero.mcts.tree_search.mcts_ctree.MuZeroMCTSCtree(cfg: EasyDict = None)[source]
Bases:
object
- Overview:
MCTSCtree for MuZero. The core functions batch_traverse and batch_backpropagate are implemented in C++.
- Interfaces:
__init__, roots, search
- __init__(cfg: EasyDict = None) None [source]
- Overview:
Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.
- config = {'env_type': 'not_board_games', 'pb_c_base': 19652, 'pb_c_init': 1.25, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
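The pb_c_base and pb_c_init entries above are the usual constants of the pUCT selection rule. As a minimal sketch, assuming the rule follows the form given in the MuZero paper's pseudocode (whether the C++ ctree uses exactly this expression is an assumption, and ucb_score is a hypothetical helper, not part of lzero):

```python
import math


def ucb_score(parent_visits: int, child_visits: int, child_prior: float,
              child_value: float, pb_c_base: float = 19652,
              pb_c_init: float = 1.25) -> float:
    """pUCT score of a child node (sketch; child_value is assumed normalized)."""
    # Exploration coefficient grows slowly with the parent's visit count.
    pb_c = math.log((parent_visits + pb_c_base + 1) / pb_c_base) + pb_c_init
    # Scale by how rarely this child was visited relative to its parent.
    pb_c *= math.sqrt(parent_visits) / (child_visits + 1)
    return child_value + pb_c * child_prior
```

With these defaults the logarithmic term stays small until the parent visit count approaches pb_c_base, so pb_c_init dominates the exploration bonus in short searches.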
- classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) mz_ctree [source]
- Overview:
Initializes CRoots from the number of roots and the legal action lists.
- Parameters:
active_collect_env_num (-) – the number of roots in the batch.
legal_actions (-) – the legal actions of each root, one list per root.
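As an illustration of the expected argument shapes, the sketch below builds one legal-action list per root from 0/1 action masks. The mask format and the helper name legal_actions_from_masks are assumptions made for this example; only the roots signature above comes from the documentation.

```python
from typing import List


def legal_actions_from_masks(action_masks: List[List[int]]) -> List[List[int]]:
    """Return, for each root, the indices whose mask entry equals 1."""
    return [[a for a, ok in enumerate(mask) if ok == 1] for mask in action_masks]


# Two environments with three actions each (hypothetical data).
action_masks = [[1, 0, 1], [1, 1, 1]]
legal_actions = legal_actions_from_masks(action_masks)
active_collect_env_num = len(action_masks)
```

The resulting pair (active_collect_env_num, legal_actions) has the shape that the roots signature suggests: one integer and one list of action lists.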
- search(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any]) None [source]
- Overview:
Runs MCTS for a batch of root nodes. Model inference is parallelized across the batch, and the tree search uses the C++ ctree.
- Parameters:
roots (-) – a batch of expanded root nodes.
model (-) – the model used for inference.
latent_state_roots (-) – the hidden states of the roots.
to_play_batch (-) – the to_play_batch list used in self-play-mode board games.
- search_with_reuse(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any], true_action_list=None, reuse_value_list=None) None [source]
- Overview:
Perform Monte Carlo Tree Search (MCTS) for the root nodes in parallel. Utilizes the cpp ctree for efficiency. Please refer to https://arxiv.org/abs/2404.16364 for more details.
- Parameters:
roots (-) – A batch of expanded root nodes.
model (-) – The neural network model.
latent_state_roots (-) – The hidden states of the root nodes.
to_play_batch (-) – The list or batch indicator for players in self-play mode.
true_action_list (-) – A list of true actions for reuse.
reuse_value_list (-) – A list of values for reuse.
- class lzero.mcts.tree_search.mcts_ctree.EfficientZeroMCTSCtree(cfg: EasyDict = None)[source]
Bases:
object
- Overview:
The C++ implementation of MCTS (batch format) for EfficientZero. It implements the roots and search methods by calling functions in the module ctree_efficientzero, which are implemented in C++.
- Interfaces:
__init__, roots, search
Note
The benefit of searching a batch of nodes at the same time is that model inference can be parallelized across the batch, which saves time.
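The batching benefit can be illustrated with a toy comparison. This is only a sketch: CountingModel is a hypothetical stand-in for a neural network, and the real search also traverses and backpropagates through the trees between model calls.

```python
from typing import List


class CountingModel:
    """Hypothetical stand-in for a network; counts forward calls."""

    def __init__(self) -> None:
        self.calls = 0

    def forward(self, states: List[float]) -> List[float]:
        self.calls += 1
        return [0.5 * s for s in states]


def search_one_by_one(model: CountingModel, leaf_states: List[float],
                      num_simulations: int) -> List[float]:
    """One model call per root and per simulation."""
    values: List[float] = []
    for _ in range(num_simulations):
        values = [model.forward([s])[0] for s in leaf_states]
    return values


def search_batched(model: CountingModel, leaf_states: List[float],
                   num_simulations: int) -> List[float]:
    """One model call per simulation, covering every root at once."""
    values: List[float] = []
    for _ in range(num_simulations):
        values = model.forward(leaf_states)
    return values
```

Both functions return the same values, but for B roots and S simulations the first issues B * S forward calls and the second only S.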
- __init__(cfg: EasyDict = None) None [source]
- Overview:
Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.
- Parameters:
cfg (-) – The configuration passed in by the user.
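The override behaviour described above can be sketched with plain dictionaries, using the default values listed for this class. The helper merge_config is hypothetical, and keeping user keys that are absent from the defaults is an assumption of this sketch rather than documented behaviour.

```python
import copy
from typing import Optional

DEFAULT_CONFIG = {
    'pb_c_base': 19652,
    'pb_c_init': 1.25,
    'root_dirichlet_alpha': 0.3,
    'root_noise_weight': 0.25,
    'value_delta_max': 0.01,
}


def merge_config(user_cfg: Optional[dict] = None) -> dict:
    """Start from the defaults and let user-provided keys override them."""
    cfg = copy.deepcopy(DEFAULT_CONFIG)  # never mutate the shared defaults
    if user_cfg:
        cfg.update(user_cfg)
    return cfg
```

For example, merge_config({'pb_c_init': 2.0}) changes only pb_c_init and leaves the other four defaults in place.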
- config = {'pb_c_base': 19652, 'pb_c_init': 1.25, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
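The page does not define value_delta_max. A common reading, assumed here, is that it sets a floor on the value range used for min-max normalization of node values during search, so that a very narrow range is not stretched to [0, 1]. The MinMaxStats class below is a hypothetical Python sketch of that idea, not the C++ code.

```python
class MinMaxStats:
    """Tracks the value range seen in a tree and normalizes values (sketch)."""

    def __init__(self, value_delta_max: float = 0.01) -> None:
        self.value_delta_max = value_delta_max
        self.minimum = float('inf')
        self.maximum = float('-inf')

    def update(self, value: float) -> None:
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def normalize(self, value: float) -> float:
        delta = self.maximum - self.minimum
        if delta > 0:
            # Assumed behaviour: never divide by a range below value_delta_max.
            return (value - self.minimum) / max(delta, self.value_delta_max)
        return value  # no usable range yet, leave the value unchanged
```

With a wide range the result is ordinary min-max scaling. With a range narrower than value_delta_max the normalized values stay close together instead of spanning the full unit interval.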
- classmethod default_config() EasyDict [source]
- Overview:
A class method that returns a default configuration in the form of an EasyDict object.
- Returns:
The dict of the default configuration.
- Return type:
cfg (EasyDict)
- classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) ez_ctree.Roots [source]
- Overview:
Initializes a batch of roots to be searched in parallel later.
- Parameters:
active_collect_env_num (-) – the number of roots in the batch.
legal_actions (-) – the legal actions of each root, one list per root.
Note
The initialization is done by the Roots class from the ctree_efficientzero module.
- search(roots: Any, model: Module, latent_state_roots: List[Any], reward_hidden_state_roots: List[Any], to_play_batch: int | List[Any]) None [source]
- Overview:
Runs MCTS for a batch of roots. Model inference is parallelized across the batch, and the tree search is implemented in C++.
- Parameters:
roots (-) – a batch of expanded root nodes.
model (-) – the model used for inference.
latent_state_roots (-) – the hidden states of the roots.
reward_hidden_state_roots (-) – the value prefix hidden states in the LSTM of the roots.
to_play_batch (-) – the to_play_batch list used in self-play-mode board games.
Note
The core functions batch_traverse and batch_backpropagate are implemented in C++.
- search_with_reuse(roots: Any, model: Module, latent_state_roots: List[Any], reward_hidden_state_roots: List[Any], to_play_batch: int | List[Any], true_action_list=None, reuse_value_list=None) None [source]
- Overview:
Perform Monte Carlo Tree Search (MCTS) for the root nodes in parallel, with model inference batched across the roots. This method uses the cpp ctree for efficiency. Please refer to https://arxiv.org/abs/2404.16364 for more details.
- Parameters:
roots (-) – A batch of expanded root nodes.
model (-) – The model to use for inference.
latent_state_roots (-) – The hidden states of the root nodes.
reward_hidden_state_roots (-) – The value prefix hidden states in the LSTM of the roots.
to_play_batch (-) – The to_play_batch list used in self-play-mode board games.
true_action_list (-) – List of true actions for reuse.
reuse_value_list (-) – List of values for reuse.
- Returns:
None
- class lzero.mcts.tree_search.mcts_ctree.GumbelMuZeroMCTSCtree(cfg: EasyDict = None)[source]
Bases:
object
- Overview:
The C++ implementation of MCTS (batch format) for Gumbel MuZero. It implements the roots and search methods by calling functions in the module ctree_gumbel_muzero, which are implemented in C++.
- Interfaces:
__init__, roots, search
Note
The benefit of searching a batch of nodes at the same time is that model inference can be parallelized across the batch, which saves time.
- __init__(cfg: EasyDict = None) None [source]
- Overview:
Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.
- Parameters:
cfg (-) – The configuration passed in by the user.
- config = {'num_simulations': 50, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
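The root_dirichlet_alpha and root_noise_weight entries are the usual parameters for mixing Dirichlet exploration noise into the root priors in AlphaZero-style search. The sketch below assumes that standard mixing rule. Both helpers are hypothetical, and how the Gumbel MuZero ctree actually consumes these values is not stated on this page.

```python
import random
from typing import List, Optional


def sample_dirichlet(alpha: float, size: int, rng: random.Random) -> List[float]:
    """Draw from a symmetric Dirichlet by normalizing Gamma(alpha, 1) samples."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    if total == 0.0:  # degenerate underflow case, fall back to uniform
        return [1.0 / size] * size
    return [d / total for d in draws]


def add_root_noise(priors: List[float], root_dirichlet_alpha: float = 0.3,
                   root_noise_weight: float = 0.25,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Blend the prior policy with Dirichlet noise at the root."""
    rng = rng or random.Random()
    noise = sample_dirichlet(root_dirichlet_alpha, len(priors), rng)
    w = root_noise_weight
    return [(1 - w) * p + w * n for p, n in zip(priors, noise)]
```

Because both the priors and the noise sum to one, the blended priors also sum to one, and each action keeps at least (1 - root_noise_weight) of its original prior.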
- classmethod default_config() EasyDict [source]
- Overview:
A class method that returns a default configuration in the form of an EasyDict object.
- Returns:
The dict of the default configuration.
- Return type:
cfg (EasyDict)
- classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) gmz_ctree [source]
- Overview:
Initializes a batch of roots to be searched in parallel later.
- Parameters:
active_collect_env_num (-) – the number of roots in the batch.
legal_actions (-) – the legal actions of each root, one list per root.
Note
The initialization is done by the Roots class from the ctree_gumbel_muzero module.
- search(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any]) None [source]
- Overview:
Runs MCTS for a batch of roots. Model inference is parallelized across the batch, and the tree search is implemented in C++.
- Parameters:
roots (-) – a batch of expanded root nodes.
model (-) – the model used for inference.
latent_state_roots (-) – the hidden states of the roots.
to_play_batch (-) – the to_play_batch list used in self-play-mode board games.
Note
The core functions batch_traverse and batch_backpropagate are implemented in C++.