Tree Search

class lzero.mcts.tree_search.mcts_ctree.MuZeroMCTSCtree(cfg: EasyDict = None)[source]

Bases: object

Overview:: MCTSCtree for MuZero. The core batch_traverse and batch_backpropagate functions are implemented in C++.
Interfaces:: __init__, roots, search

__init__(cfg: EasyDict = None) → None[source]

Overview:: Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.

config = {'env_type': 'not_board_games', 'pb_c_base': 19652, 'pb_c_init': 1.25, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
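The override behaviour described above can be sketched with plain dictionaries. This is a minimal illustration rather than the library's code: `merge_config` is a hypothetical helper, and the real class works with EasyDict objects instead of built-in dicts.

```python
import copy

# Defaults copied from the `config` attribute documented above.
DEFAULT_CONFIG = {
    'env_type': 'not_board_games',
    'pb_c_base': 19652,
    'pb_c_init': 1.25,
    'root_dirichlet_alpha': 0.3,
    'root_noise_weight': 0.25,
    'value_delta_max': 0.01,
}


def merge_config(user_cfg=None):
    """Hypothetical helper: start from the defaults, then let user keys override them."""
    merged = copy.deepcopy(DEFAULT_CONFIG)
    if user_cfg:
        merged.update(user_cfg)
    return merged


cfg = merge_config({'pb_c_init': 1.5})
print(cfg['pb_c_init'])  # the user-provided value
print(cfg['pb_c_base'])  # the untouched default
```

Copying the defaults before updating keeps the class-level dictionary unchanged, so two instances built with different cfg values do not affect each other.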

classmethod default_config() → EasyDict[source]

classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) → mz_ctree[source]

Overview:: Initializes the CRoots with the number of roots and their legal action lists.

Parameters:

active_collect_env_num (-) – the number of roots in the batch (one per active collecting environment).
legal_actions (-) – the list of legal actions for each root.
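The `root_dirichlet_alpha` and `root_noise_weight` config keys suggest AlphaZero-style exploration noise at the roots. The sketch below shows that standard mixing rule in pure Python, under the assumption that the library applies something equivalent when roots are prepared. `sample_dirichlet` and `add_root_noise` are illustrative names, not lzero functions.

```python
import random


def sample_dirichlet(alpha, size, rng):
    """Sample a symmetric Dirichlet vector by normalising Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def add_root_noise(priors, alpha=0.3, noise_weight=0.25, rng=None):
    """Blend the policy priors with Dirichlet noise: (1 - w) * prior + w * noise."""
    rng = rng or random.Random()
    noise = sample_dirichlet(alpha, len(priors), rng)
    return [(1 - noise_weight) * p + noise_weight * n for p, n in zip(priors, noise)]


priors = [0.7, 0.2, 0.1]  # one prior per legal action of a single root
noisy = add_root_noise(priors, rng=random.Random(0))
print(noisy)
```

Because both the priors and the noise sum to one, the blended vector is still a probability distribution over the legal actions.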

search(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any]) → None[source]

Overview:: Runs MCTS for a batch of root nodes. Model inference is batched across the roots, and the tree operations use the C++ ctree.

Parameters:

roots (-) – a batch of expanded root nodes.
model (-) – the model used for inference.
latent_state_roots (-) – the hidden states of the roots.
to_play_batch (-) – the to_play_batch list used in self-play-mode board games.

search_with_reuse(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any], true_action_list=None, reuse_value_list=None) → None[source]

Overview:: Perform Monte Carlo Tree Search (MCTS) for the root nodes in parallel. Utilizes the cpp ctree for efficiency. Please refer to https://arxiv.org/abs/2404.16364 for more details.

Parameters:

roots (-) – A batch of expanded root nodes.
model (-) – The neural network model.
latent_state_roots (-) – The hidden states of the root nodes.
to_play_batch (-) – The list or batch indicator for players in self-play mode.
true_action_list (-) – A list of true actions for reuse.
reuse_value_list (-) – A list of values for reuse.

class lzero.mcts.tree_search.mcts_ctree.EfficientZeroMCTSCtree(cfg: EasyDict = None)[source]

Bases: object

Overview:: The C++ implementation of MCTS (batch format) for EfficientZero. It completes the roots and search methods by calling functions in module ctree_efficientzero, which are implemented in C++.
Interfaces:: __init__, roots, search

Note

Searching a batch of nodes at once lets model inference run in parallel across the batch, which saves time.

__init__(cfg: EasyDict = None) → None[source]

Overview:: Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.

Parameters:: cfg (-) – The configuration passed in by the user.

config = {'pb_c_base': 19652, 'pb_c_init': 1.25, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
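The `pb_c_base` and `pb_c_init` entries match the exploration constants of the pUCT rule in the MuZero paper's pseudocode. The sketch below writes out that published formula in pure Python. It is an assumption that the C++ tree uses the same form, and `ucb_score` with its argument names is illustrative, not part of lzero.

```python
import math


def ucb_score(parent_visits, child_visits, child_prior, child_value_norm,
              pb_c_base=19652, pb_c_init=1.25):
    """pUCT score following the MuZero paper's pseudocode (illustrative helper)."""
    # The exploration coefficient grows slowly with the parent's visit count.
    pb_c = math.log((parent_visits + pb_c_base + 1) / pb_c_base) + pb_c_init
    # It shrinks as the child itself accumulates visits.
    pb_c *= math.sqrt(parent_visits) / (child_visits + 1)
    prior_score = pb_c * child_prior
    # child_value_norm is assumed to be already normalised to a comparable scale.
    return prior_score + child_value_norm


# With equal priors and values, the less-visited child gets the higher score.
print(ucb_score(10, 0, 0.5, 0.0) > ucb_score(10, 5, 0.5, 0.0))
```

With the default `pb_c_base` of 19652 the logarithmic term stays close to zero for realistic visit counts, so `pb_c_init` dominates the exploration strength.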

classmethod default_config() → EasyDict[source]

Overview:: A class method that returns a default configuration in the form of an EasyDict object.

Returns:

The dict of the default configuration.

Return type:

cfg (EasyDict)

classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) → ez_ctree.Roots[source]

Overview:: Initializes a batch of roots to be searched in parallel later.

Parameters:

active_collect_env_num (-) – the number of roots in the batch.
legal_actions (-) – the list of legal actions for each root.

Note

The initialization is performed by the Roots class from the ctree_efficientzero module.

search(roots: Any, model: Module, latent_state_roots: List[Any], reward_hidden_state_roots: List[Any], to_play_batch: int | List[Any]) → None[source]

Overview:: Runs MCTS for a batch of roots. Model inference is batched across the roots, and the tree search itself is implemented in C++.

Parameters:

roots (-) – a batch of expanded root nodes.
latent_state_roots (-) – the hidden states of the roots.
reward_hidden_state_roots (-) – the value prefix hidden states in LSTM of the roots.
model (-) – The model used for inference.
to_play_batch (-) – the to_play_batch list used in self-play-mode board games.

Note

The core functions batch_traverse and batch_backpropagate are implemented in C++.
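To make the traverse / infer / backpropagate cycle concrete, here is a toy pure-Python version of a batched search. It is a sketch under strong simplifications (single player, no rewards, no value normalisation, a simplified selection score) and does not reproduce the C++ implementation. `Node`, `toy_model`, and `batch_search` are all hypothetical names.

```python
import math


class Node:
    """Minimal search-tree node for the toy example."""

    def __init__(self, prior):
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.state = None
        self.children = {}  # action -> Node; empty means "not expanded yet"

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def expand(node, state, priors):
    node.state = state
    node.children = {action: Node(p) for action, p in enumerate(priors)}


def select_child(node):
    """Pick the (action, child) pair with the best simplified pUCT score."""
    def score(item):
        child = item[1]
        explore = 1.25 * child.prior * math.sqrt(node.visits) / (child.visits + 1)
        return child.value() + explore
    return max(node.children.items(), key=score)


def toy_model(batch):
    """Stand-in for batched model inference: one call handles every leaf."""
    outputs = []
    for state, action in batch:
        next_state = state + action + 1
        value = 1.0 if next_state % 2 == 0 else 0.0
        outputs.append((next_state, value, [0.5, 0.5]))
    return outputs


def batch_search(roots, num_simulations, model, discount=0.997):
    model_calls = 0
    for _ in range(num_simulations):
        paths, batch = [], []
        # Traverse: walk each tree from its root down to an unexpanded leaf.
        for root in roots:
            node, path = root, [root]
            while node.children:
                parent = node
                action, node = select_child(parent)
                path.append(node)
            paths.append(path)
            batch.append((parent.state, action))
        # Inference: a single model call covers the leaves of all trees.
        outputs = model(batch)
        model_calls += 1
        # Backpropagate: expand each leaf and update statistics along its path.
        for path, (state, value, priors) in zip(paths, outputs):
            expand(path[-1], state, priors)
            for node in reversed(path):
                node.visits += 1
                node.value_sum += value
                value *= discount
    return model_calls


roots = [Node(prior=1.0) for _ in range(3)]
for i, root in enumerate(roots):
    expand(root, state=i, priors=[0.5, 0.5])  # roots must be expanded up front

calls = batch_search(roots, num_simulations=8, model=toy_model)
print(calls, [root.visits for root in roots])  # 8 [8, 8, 8]
```

The model is called once per simulation regardless of how many roots are in the batch, which is where the time saving of batched search comes from.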

search_with_reuse(roots: Any, model: Module, latent_state_roots: List[Any], reward_hidden_state_roots: List[Any], to_play_batch: int | List[Any], true_action_list=None, reuse_value_list=None) → None[source]

Overview:: Perform Monte Carlo Tree Search (MCTS) for the root nodes in parallel, with batched model inference. This method uses the cpp ctree for efficiency. Please refer to https://arxiv.org/abs/2404.16364 for more details.

Parameters:

roots (-) – A batch of expanded root nodes.
model (-) – The model to use for inference.
latent_state_roots (-) – The hidden states of the root nodes.
reward_hidden_state_roots (-) – The value prefix hidden states in the LSTM of the roots.
to_play_batch (-) – The to_play_batch list used in self-play-mode board games.
true_action_list (-) – List of true actions for reuse.
reuse_value_list (-) – List of values for reuse.

Returns:

None

class lzero.mcts.tree_search.mcts_ctree.GumbelMuZeroMCTSCtree(cfg: EasyDict = None)[source]

Bases: object

Overview:: The C++ implementation of MCTS (batch format) for Gumbel MuZero. It completes the roots and search methods by calling functions in module ctree_gumbel_muzero, which are implemented in C++.
Interfaces:: __init__, roots, search

Note

Searching a batch of nodes at once lets model inference run in parallel across the batch, which saves time.

__init__(cfg: EasyDict = None) → None[source]

Overview:: Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.

Parameters:: cfg (-) – The configuration passed in by the user.

config = {'num_simulations': 50, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
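The `value_delta_max` key is commonly used as a floor on the range of a min-max value normaliser, so that nearly identical values in a tree are not stretched across the whole [0, 1] interval. The sketch below assumes that EfficientZero-style behaviour. `MinMaxStats` here is a simplified illustrative class, not the one shipped in the C++ module.

```python
class MinMaxStats:
    """Tracks the value range seen in a tree and normalises values into it."""

    def __init__(self, value_delta_max=0.01):
        self.value_delta_max = value_delta_max
        self.minimum = float('inf')
        self.maximum = float('-inf')

    def update(self, value):
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def normalize(self, value):
        delta = self.maximum - self.minimum
        if delta > 0:
            # Never divide by a range smaller than value_delta_max.
            return (value - self.minimum) / max(delta, self.value_delta_max)
        # No usable range yet: return the value unchanged.
        return value


narrow = MinMaxStats()
for v in (1.0, 1.002):
    narrow.update(v)
print(narrow.normalize(1.002))  # well below 1.0 because the range is floored

wide = MinMaxStats()
for v in (0.0, 10.0):
    wide.update(v)
print(wide.normalize(5.0))
```

Without the floor, the narrow case would map 1.002 to 1.0 and exaggerate a difference of 0.002 between two nodes.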

classmethod default_config() → EasyDict[source]

Overview:: A class method that returns a default configuration in the form of an EasyDict object.

Returns:

The dict of the default configuration.

Return type:

cfg (EasyDict)

classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) → gmz_ctree[source]

Overview:: Initializes a batch of roots to be searched in parallel later.

Parameters:

active_collect_env_num (-) – the number of roots in the batch.
legal_actions (-) – the list of legal actions for each root.

Note

The initialization is performed by the Roots class from the ctree_gumbel_muzero module.

search(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any]) → None[source]

Overview:: Runs MCTS for a batch of roots. Model inference is batched across the roots, and the tree search itself is implemented in C++.

Parameters:

roots (-) – a batch of expanded root nodes.
latent_state_roots (-) – the hidden states of the roots.
model (-) – The model used for inference.
to_play_batch (-) – the to_play_batch list used in self-play-mode board games.

Note

The core functions batch_traverse and batch_backpropagate are implemented in C++.