Quick Start

SUMO entries

DI-smartcross supports DQN, Off-policy PPO and Rainbow DQN RL methods with multi-discrete actions for each crossing. A set of default DI-engine configs is provided for each policy. You can check the document of DI-engine to get detailed instructions on these configs.

Train RL Policies

The type of policy can be automatically parsed from the config file.

usage: sumo_train [-h] -d DING_CFG -e ENV_CFG [-s SEED] [--dynamic-flow]
                  [-cn COLLECT_ENV_NUM] [-en EVALUATE_ENV_NUM]
                  [--exp-name EXP_NAME]

DI-smartcross training script

optional arguments:
-h, --help            show this help message and exit
-d DING_CFG, --ding-cfg DING_CFG
                        DI-engine configuration path
-e ENV_CFG, --env-cfg ENV_CFG
                        sumo environment configuration path
-s SEED, --seed SEED  random seed for sumo
--dynamic-flow        use dynamic route flow
-cn COLLECT_ENV_NUM, --collect-env-num COLLECT_ENV_NUM
                        collector sumo env num for training
-en EVALUATE_ENV_NUM, --evaluate-env-num EVALUATE_ENV_NUM
                        evaluator sumo env num for training
--exp-name EXP_NAME   experiment name to save log and ckpt

Example of running DQN in wj3 env with default config.

Note

Running with dynamic flow is only supported for arterial7 env currently.

sumo_train -e smartcross/envs/sumo_wj3_default_config.yaml -d entry/config/sumo_wj3_dqn_default_config.py

Evaluate Existing Policies

We provide two eval policies: random and fixed-time. You can choose one to evaluate as comparison. It is suggested to use the eval_default_config for each env.

usage: sumo_eval [-h] [-d DING_CFG] -e ENV_CFG [-s SEED]
                 [-p {random,fix,dqn,rainbow,ppo}] [--dynamic-flow]
                 [-n ENV_NUM] [--gui] [-c CKPT_PATH]

DI-smartcross testing script

optional arguments:
-h, --help            show this help message and exit
-d DING_CFG, --ding-cfg DING_CFG
                        DI-engine configuration path
-e ENV_CFG, --env-cfg ENV_CFG
                        sumo environment configuration path
-s SEED, --seed SEED  random seed for sumo
-p {random,fix,dqn,rainbow,ppo}, --policy-type {random,fix,dqn,rainbow,ppo}
                        RL policy type
--dynamic-flow        use dynamic route flow
-n ENV_NUM, --env-num ENV_NUM
                        sumo env num for evaluation
--gui                 open gui for visualize
-c CKPT_PATH, --ckpt-path CKPT_PATH
                        model ckpt path

Example of running random policy in wj3 env.

sumo_eval -p random -e smartcross/envs/sumo_wj3_default_config.yaml

CityFlow Entries

DI-smartcross provides a simple DQN and Off-policy PPO demo for CityFlow env. Each policy comes with a default DI-engine configs are provided for each policy. You can check the document of DI-engine to get detailed instructions on these configs.

Train RL Policies

usage: cityflow_train [-h] -d DING_CFG -e ENV_CFG [-s SEED]
                      [-cn COLLECT_ENV_NUM] [-en EVALUATE_ENV_NUM]
                      [--exp-name EXP_NAME]

DI-smartcross training script

optional arguments:
-h, --help            show this help message and exit
-d DING_CFG, --ding-cfg DING_CFG
                        DI-engine configuration path
-e ENV_CFG, --env-cfg ENV_CFG
                        cityflow json configuration path
-s SEED, --seed SEED  random seed
-cn COLLECT_ENV_NUM, --collect-env-num COLLECT_ENV_NUM
                        collector env num for training
-en EVALUATE_ENV_NUM, --evaluate-env-num EVALUATE_ENV_NUM
                        evaluator env num for training
--exp-name EXP_NAME   experiment name to save log and ckpt

Evaluate Existing Policies

Note that CityFlow will run in fixed-time mode by default when not in rl mode. So the fix policy runs with an auto_config.json.

usage: cityflow_eval [-h] [-d DING_CFG] -e ENV_CFG [-s SEED]
                     [-p {fix,dqn,ppo}] [-n ENV_NUM] [-c CKPT_PATH]

DI-smartcross training script

optional arguments:
-h, --help            show this help message and exit
-d DING_CFG, --ding-cfg DING_CFG
                        DI-engine configuration path
-e ENV_CFG, --env-cfg ENV_CFG
                        sumo environment configuration path
-s SEED, --seed SEED  random seed for sumo
-p {fix,dqn,ppo}, --policy-type {fix,dqn,ppo}
                        RL policy type
-n ENV_NUM, --env-num ENV_NUM
                        sumo env num for evaluation
-c CKPT_PATH, --ckpt-path CKPT_PATH
                        model ckpt path