LightZero

Tree Search

class lzero.mcts.tree_search.mcts_ctree.MuZeroMCTSCtree(cfg: EasyDict = None)[source]

Bases: object

Overview:

MCTSCtree for MuZero. The core batch_traverse and batch_backpropagate functions are implemented in C++.

Interfaces:

__init__, roots, search, search_with_reuse

__init__(cfg: EasyDict = None) → None[source]
Overview:

Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.

config = {'env_type': 'not_board_games', 'pb_c_base': 19652, 'pb_c_init': 1.25, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
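The override behaviour described above can be sketched with plain dictionaries. This is a minimal illustration, not LightZero's implementation: the helper name merge_with_defaults is hypothetical, a plain dict stands in for EasyDict, and keeping user keys that are absent from the defaults is an assumption the text above does not settle.

```python
import copy

# Default values copied from the `config` attribute documented for MuZeroMCTSCtree.
DEFAULT_CONFIG = {
    'env_type': 'not_board_games',
    'pb_c_base': 19652,
    'pb_c_init': 1.25,
    'root_dirichlet_alpha': 0.3,
    'root_noise_weight': 0.25,
    'value_delta_max': 0.01,
}


def merge_with_defaults(user_cfg=None):
    """Hypothetical helper: start from the defaults, then let user keys override them."""
    merged = copy.deepcopy(DEFAULT_CONFIG)
    if user_cfg:
        # Assumption: keys missing from the defaults are simply kept as well.
        merged.update(user_cfg)
    return merged


cfg = merge_with_defaults({'pb_c_init': 2.0})
print(cfg['pb_c_init'], cfg['pb_c_base'])
```

Only pb_c_init changes here; every other key keeps its documented default.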
classmethod default_config() → EasyDict[source]
classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) → mz_ctree[source]
Overview:

Initialize CRoots with the number of roots and the legal action list of each root.

Parameters:
  • active_collect_env_num (int) – the number of roots in the batch.

  • legal_actions (List[Any]) – the legal actions of each root.
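A sketch of how the two arguments of roots might be prepared. The helper legal_actions_from_masks and the 0/1 action-mask input are assumptions made for illustration; only the roots(active_collect_env_num, legal_actions) signature comes from this page, and the call itself is left as a comment because it needs LightZero installed.

```python
from typing import List


def legal_actions_from_masks(action_masks: List[List[int]]) -> List[List[int]]:
    """Hypothetical helper: turn 0/1 action masks into lists of legal action indices."""
    return [[a for a, allowed in enumerate(mask) if allowed == 1] for mask in action_masks]


# Hypothetical masks for three active environments with a 3-action space.
action_masks = [[1, 1, 1], [1, 0, 1], [0, 1, 0]]
legal_actions = legal_actions_from_masks(action_masks)
active_collect_env_num = len(legal_actions)

# With LightZero installed, these would be passed to the documented classmethod:
# roots = MuZeroMCTSCtree.roots(active_collect_env_num, legal_actions)
print(active_collect_env_num, legal_actions)
```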

search(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any]) → None[source]
Overview:

Run MCTS on a batch of root nodes. Model inference is batched across the roots, and the tree search itself uses the C++ ctree.

Parameters:
  • roots (Any) – a batch of expanded root nodes.

  • model (Module) – the model used for inference.

  • latent_state_roots (List[Any]) – the hidden states of the roots.

  • to_play_batch (int | List[Any]) – the to_play_batch list used in self-play-mode board games.

search_with_reuse(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any], true_action_list=None, reuse_value_list=None) → None[source]
Overview:

Perform Monte Carlo Tree Search (MCTS) for the root nodes in parallel. Utilizes the cpp ctree for efficiency. Please refer to https://arxiv.org/abs/2404.16364 for more details.

Parameters:
  • roots (Any) – A batch of expanded root nodes.

  • model (Module) – The neural network model.

  • latent_state_roots (List[Any]) – The hidden states of the root nodes.

  • to_play_batch (int | List[Any]) – The list or batch indicator for players in self-play mode.

  • true_action_list (optional, default None) – A list of true actions for reuse.

  • reuse_value_list (optional, default None) – A list of values for reuse.

class lzero.mcts.tree_search.mcts_ctree.EfficientZeroMCTSCtree(cfg: EasyDict = None)[source]

Bases: object

Overview:

The C++ implementation of MCTS (batch format) for EfficientZero. It implements the roots and search methods by calling functions in the ctree_efficientzero module, which are implemented in C++.

Interfaces:

__init__, roots, search, search_with_reuse

Note

The benefit of searching for a batch of nodes at the same time is that it can be parallelized during model inference, thus saving time.
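The saving can be illustrated by counting forward passes in a toy setting. Everything in this sketch (CountingModel, search_one_by_one, search_batched) is hypothetical and stands in for a real network and tree; it only shows that evaluating one leaf per root in a single batched call needs one model call per simulation instead of one per root per simulation.

```python
from typing import List


class CountingModel:
    """Toy stand-in for a neural network that counts its forward passes."""

    def __init__(self) -> None:
        self.calls = 0

    def infer(self, latent_batch: List[List[float]]) -> List[float]:
        self.calls += 1
        # Dummy "value" per latent state; a real model would run a network here.
        return [sum(latent) for latent in latent_batch]


def search_one_by_one(model: CountingModel, leaves: List[List[float]], num_simulations: int) -> None:
    """Evaluate each root's leaf with its own model call."""
    for _ in range(num_simulations):
        for leaf in leaves:
            model.infer([leaf])


def search_batched(model: CountingModel, leaves: List[List[float]], num_simulations: int) -> None:
    """Evaluate the leaves of all roots with one model call per simulation."""
    for _ in range(num_simulations):
        model.infer(leaves)


leaves = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]  # one leaf per root
sequential_model, batched_model = CountingModel(), CountingModel()
search_one_by_one(sequential_model, leaves, num_simulations=50)
search_batched(batched_model, leaves, num_simulations=50)
print(sequential_model.calls, batched_model.calls)  # 200 50
```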

__init__(cfg: EasyDict = None) → None[source]
Overview:

Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.

Parameters:

cfg (EasyDict) – The configuration passed in by the user.

config = {'pb_c_base': 19652, 'pb_c_init': 1.25, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
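For orientation, pb_c_base and pb_c_init are the names used for the pUCT exploration constants in the published MuZero pseudocode. The sketch below follows that published formula; whether the C++ tree on this page computes exactly this expression (for example, how it normalizes values) is an assumption, and ucb_score is an illustrative name rather than a LightZero function.

```python
import math


def ucb_score(parent_visits: int, child_visits: int, child_prior: float,
              child_value: float, pb_c_base: float = 19652, pb_c_init: float = 1.25) -> float:
    """pUCT score in the style of the MuZero pseudocode (illustrative only)."""
    # The exploration coefficient grows slowly with the parent's visit count.
    pb_c = math.log((parent_visits + pb_c_base + 1) / pb_c_base) + pb_c_init
    # Children that have been visited more often receive a smaller bonus.
    pb_c *= math.sqrt(parent_visits) / (child_visits + 1)
    return child_value + pb_c * child_prior


# An unvisited child with a larger prior gets a larger score.
print(ucb_score(10, 0, 0.6, 0.0) > ucb_score(10, 0, 0.2, 0.0))
```

With the defaults shown above, the logarithmic term stays close to zero until the parent has thousands of visits, so pb_c_init dominates the exploration bonus early in the search.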
classmethod default_config() → EasyDict[source]
Overview:

A class method that returns a default configuration in the form of an EasyDict object.

Returns:

The dict of the default configuration.

Return type:

  • cfg (EasyDict)

classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) → ez_ctree.Roots[source]
Overview:

Initializes a batch of roots to be searched in parallel later.

Parameters:
  • active_collect_env_num (int) – the number of roots in the batch.

  • legal_actions (List[Any]) – the legal actions of each root.

Note

The initialization is achieved by the Roots class from the ctree_efficientzero module.

search(roots: Any, model: Module, latent_state_roots: List[Any], reward_hidden_state_roots: List[Any], to_play_batch: int | List[Any]) → None[source]
Overview:

Run MCTS on a batch of roots. Model inference is batched across the roots, and the tree search is implemented in C++.

Parameters:
  • roots (Any) – a batch of expanded root nodes.

  • latent_state_roots (List[Any]) – the hidden states of the roots.

  • reward_hidden_state_roots (List[Any]) – the value prefix hidden states in LSTM of the roots.

  • model (Module) – The model used for inference.

  • to_play_batch (int | List[Any]) – the to_play_batch list used in self-play-mode board games.

Note

The core functions batch_traverse and batch_backpropagate are implemented in C++.

search_with_reuse(roots: Any, model: Module, latent_state_roots: List[Any], reward_hidden_state_roots: List[Any], to_play_batch: int | List[Any], true_action_list=None, reuse_value_list=None) → None[source]

Overview:

Perform Monte Carlo Tree Search (MCTS) on the root nodes in parallel, with batched model inference. This method uses the C++ ctree for efficiency. Please refer to https://arxiv.org/abs/2404.16364 for more details.

Parameters:
  • roots (Any) – A batch of expanded root nodes.

  • model (Module) – The model to use for inference.

  • latent_state_roots (List[Any]) – The hidden states of the root nodes.

  • reward_hidden_state_roots (List[Any]) – The value prefix hidden states in the LSTM of the roots.

  • to_play_batch (int | List[Any]) – The to_play_batch list used in self-play-mode board games.

  • true_action_list (optional, default None) – List of true actions for reuse.

  • reuse_value_list (optional, default None) – List of values for reuse.

Returns:

None

class lzero.mcts.tree_search.mcts_ctree.GumbelMuZeroMCTSCtree(cfg: EasyDict = None)[source]

Bases: object

Overview:

The C++ implementation of MCTS (batch format) for Gumbel MuZero. It implements the roots and search methods by calling functions in the ctree_gumbel_muzero module, which are implemented in C++.

Interfaces:

__init__, roots, search

Note

The benefit of searching for a batch of nodes at the same time is that it can be parallelized during model inference, thus saving time.

__init__(cfg: EasyDict = None) → None[source]
Overview:

Use the default configuration mechanism. If a user passes in a cfg with a key that matches an existing key in the default configuration, the user-provided value will override the default configuration. Otherwise, the default configuration will be used.

Parameters:

cfg (EasyDict) – The configuration passed in by the user.

config = {'num_simulations': 50, 'root_dirichlet_alpha': 0.3, 'root_noise_weight': 0.25, 'value_delta_max': 0.01}
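A parameter named value_delta_max commonly appears in EfficientZero-style min-max value normalization, where it acts as a floor on the normalization range so that a very narrow range of observed values is not stretched across the whole [0, 1] interval. The sketch below shows that usage as an assumption; it is not taken from the ctree_gumbel_muzero source, and normalize_value is an illustrative name.

```python
def normalize_value(value: float, minimum: float, maximum: float,
                    value_delta_max: float = 0.01) -> float:
    """Min-max normalization with a floor on the range (illustrative only)."""
    delta = maximum - minimum
    if delta <= 0:
        # No usable range has been observed yet, so leave the value unchanged.
        return value
    # Assumption: a range narrower than value_delta_max is replaced by value_delta_max.
    return (value - minimum) / max(delta, value_delta_max)


print(normalize_value(0.5, 0.0, 1.0))      # ordinary min-max scaling
print(normalize_value(0.004, 0.0, 0.008))  # narrow range, divided by value_delta_max
```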
classmethod default_config() → EasyDict[source]
Overview:

A class method that returns a default configuration in the form of an EasyDict object.

Returns:

The dict of the default configuration.

Return type:

  • cfg (EasyDict)

classmethod roots(active_collect_env_num: int, legal_actions: List[Any]) → gmz_ctree[source]
Overview:

Initializes a batch of roots to be searched in parallel later.

Parameters:
  • active_collect_env_num (int) – the number of roots in the batch.

  • legal_actions (List[Any]) – the legal actions of each root.

Note

The initialization is achieved by the Roots class from the ctree_gumbel_muzero module.

search(roots: Any, model: Module, latent_state_roots: List[Any], to_play_batch: int | List[Any]) → None[source]
Overview:

Run MCTS on a batch of roots. Model inference is batched across the roots, and the tree search is implemented in C++.

Parameters:
  • roots (Any) – a batch of expanded root nodes.

  • latent_state_roots (List[Any]) – the hidden states of the roots.

  • model (Module) – The model used for inference.

  • to_play_batch (int | List[Any]) – the to_play_batch list used in self-play-mode board games.

Note

The core functions batch_traverse and batch_backpropagate are implemented in C++.


© Copyright 2023, OpenDILab Contributors.
