Reinforcement Learning Algorithm Taxonomy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


.. toctree::
    :maxdepth: 3

    model_based_rl_zh
    imitation_learning_zh
    exploration_rl_zh
    multi_agent_cooperation_rl_zh
    offline_rl_zh
    safe_rl_zh
    distributed_rl_zh
    league_zh