Carla Benchmark Evaluation¶
DI-drive provides ready-to-use benchmark environment settings [1] for the Carla simulator. The benchmark consists of several suites. Each suite specifies a town map, several kinds of weather, the number of other vehicles and pedestrians, and a .txt file that lists the start and target waypoints of each route. The standard benchmark evaluation provides three kinds of routes in ‘town1’ and ‘town2’: ‘straight’, ‘turn’ and ‘full’. For example, FullTown01-v2 refers to an evaluation suite that uses ‘full’ routes in town1. Other settings, such as weather and NPCs, are determined by the version number. Some suites are aliased for convenience; for example, town1 represents all commonly used suites in ‘town1’.
The Carla benchmark setting is widely used in the literature as a training and evaluation standard.
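The naming convention described above can be illustrated with a small parser. This is a minimal sketch, not part of DI-drive: the function `parse_suite_name`, the `SuiteName` tuple and the regular expression are hypothetical and only assume that names follow the `<RouteType>Town<NN>-v<N>` shape seen in examples such as FullTown01-v2.

```python
import re
from typing import NamedTuple


class SuiteName(NamedTuple):
    route_type: str  # e.g. "full", "straight", "turn"
    town: int        # town map number, e.g. 1 for Town01
    version: int     # version number that selects weather/NPC settings


# Pattern assumed from names such as "FullTown01-v2"; not taken from DI-drive.
_SUITE_PATTERN = re.compile(r"^([A-Za-z]+?)Town(\d+)-v(\d+)$")


def parse_suite_name(name: str) -> SuiteName:
    """Split a suite name into route type, town number and version."""
    match = _SUITE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised suite name: {name!r}")
    route_type, town, version = match.groups()
    return SuiteName(route_type.lower(), int(town), int(version))


print(parse_suite_name("FullTown01-v2"))
```

Aliases such as town1 do not follow this shape, so the sketch rejects them with a `ValueError` rather than guessing how they expand.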
DI-drive allows users to quickly build an environment with a benchmark suite setting by applying BenchmarkEnvWrapper. Once an environment is wrapped, it ignores the arguments passed to its reset method and chooses a route from the suite’s route list according to the configuration. The wrapped environment can be used in RL training or to sample data under benchmark settings.
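The behaviour of a wrapped environment can be sketched with a toy example. The classes below are stand-ins written for illustration and are not DI-drive's BenchmarkEnvWrapper; the parameter keys (`start`, `end`, `weather`) and the use of a seeded random choice are assumptions, since the actual selection depends on the wrapper's configuration.

```python
import random
from typing import Any, Dict, List


class ToyEnv:
    """Stand-in environment; reset() just records the parameters it received."""

    def reset(self, **params: Any) -> Dict[str, Any]:
        self.last_params = params
        return params


class ToySuiteWrapper:
    """Illustrative wrapper, NOT DI-drive's BenchmarkEnvWrapper."""

    def __init__(self, env: ToyEnv, reset_params: List[Dict[str, Any]], seed: int = 0) -> None:
        self._env = env
        self._reset_params = reset_params
        self._rng = random.Random(seed)

    def reset(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        # Caller-supplied arguments are deliberately ignored;
        # the route always comes from the suite's own list.
        params = self._rng.choice(self._reset_params)
        return self._env.reset(**params)


suite = [
    {"start": 0, "end": 5, "weather": 1},
    {"start": 3, "end": 9, "weather": 1},
]
env = ToySuiteWrapper(ToyEnv(), suite)
obs = env.reset(start=42, end=43)  # these values are discarded
print(obs in suite)
```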
DI-drive also provides CarlaBenchmarkEvaluator and CarlaBenchmarkCollector, which run a policy with an EnvManager from DI-engine that can run several environments in parallel. They parse the reset parameters of the provided suite into a list and run all the episodes in order.
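As a rough illustration of turning a suite into an ordered list of reset parameters, the sketch below expands routes and weathers into one dictionary per episode. The function `build_reset_params` and every key name are hypothetical; the real parameters are defined by the suite and by the DI-drive modules.

```python
from itertools import product
from typing import Dict, List, Tuple


def build_reset_params(
    routes: List[Tuple[int, int]], weathers: List[int], n_vehicles: int, n_pedestrians: int
) -> List[Dict[str, int]]:
    """Expand a suite into one reset-parameter dict per (weather, route) pair."""
    return [
        {
            "start": start,
            "end": end,
            "weather": weather,
            "n_vehicles": n_vehicles,
            "n_pedestrians": n_pedestrians,
        }
        for weather, (start, end) in product(weathers, routes)
    ]


params_list = build_reset_params(
    routes=[(0, 5), (3, 9)], weathers=[1, 3], n_vehicles=20, n_pedestrians=50
)
for episode, params in enumerate(params_list):
    # A real evaluator would reset an environment with these parameters
    # and roll out the policy here.
    print(episode, params)
```

Building the full list up front makes the episode order deterministic, which is convenient when episodes are spread over several parallel environments.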
The Evaluator is used to evaluate a policy by running part or all of a benchmark suite, for example running 50 or 100 routes in the FullTown02-v2 suite to measure the policy’s success rate. The Collector is used to sample episodes to build IL datasets.
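The success rate mentioned above is simply the number of successful episodes divided by the number of episodes run. The sketch below shows that calculation with a dummy rollout; `evaluate_success_rate` and the `route_id` key are hypothetical names, and the lambda stands in for a real policy rollout.

```python
from typing import Callable, Dict, List


def evaluate_success_rate(
    run_episode: Callable[[Dict[str, int]], bool],
    reset_params: List[Dict[str, int]],
    episode_count: int,
) -> float:
    """Run the first `episode_count` routes and return the fraction that succeeded."""
    if not 0 < episode_count <= len(reset_params):
        raise ValueError("episode_count must be between 1 and the number of routes")
    successes = sum(run_episode(params) for params in reset_params[:episode_count])
    return successes / episode_count


# Dummy rollout: every route succeeds except those whose index is divisible by 10.
routes = [{"route_id": i} for i in range(50)]
rate = evaluate_success_rate(lambda p: p["route_id"] % 10 != 0, routes, episode_count=50)
print(f"success rate: {rate:.0%}")  # 45 of 50 routes succeed, so this prints 90%
```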
Note
See the API documentation of these modules for details.
Sample images¶
You can check the routes of each benchmark suite in its file in benchmark and find the start and end waypoints. Here we show the benchmark settings for ‘Town01’ and ‘Town02’ [2].
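To inspect the routes programmatically, a suite file can be read into (start, end) pairs. The sketch below assumes, without confirmation from this page, that each non-empty line holds two whitespace-separated integer waypoint indices; `parse_routes` is a hypothetical helper and the sample values are made up for illustration.

```python
from typing import List, Tuple


def parse_routes(text: str) -> List[Tuple[int, int]]:
    """Parse lines of 'start end' waypoint indices into (start, end) pairs."""
    routes = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {line_number}: expected 2 fields, got {len(fields)}")
        routes.append((int(fields[0]), int(fields[1])))
    return routes


# Made-up sample content; a real suite file would be read from disk instead.
sample = "36 40\n39 35\n\n110 114\n"
print(parse_routes(sample))
```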
Performance Metrics¶
CILRS
Suite | Success rate | Total run | Seed |
---|---|---|---|
FullTown01-v1 | 70% | 70/100 | 0 |
FullTown02-v1 | 99% | 99/100 | 0 |
Implicit Affordance (single lane)
Suite | Success rate | Total run | Seed |
---|---|---|---|
FullTown01-v1 | 100% | 50/50 | 0 |
FullTown01-v2 | 100% | 50/50 | 0 |
FullTown01-v3 | 90% | 45/50 | 0 |
FullTown01-v4 | 86% | 43/50 | 0 |
Implicit Affordance (multi lane)
Suite | Success rate | Total run | Seed |
---|---|---|---|
FullTown04-v1 | 42% | 21/50 | 0 |
FullTown04-v2 | 42% | 21/50 | 0 |
FullTown04-v3 | 42% | 21/50 | 0 |
FullTown04-v4 | 50% | 25/50 | 0 |
ChangeLaneTown04-v1 | 92% | 46/50 | 0 |
ChangeLaneTown04-v2 | 92% | 46/50 | 0 |
Simple RL DQN
Suite | Success rate | Total run | Seed |
---|---|---|---|
FullTown01-v1 | 92% | 23/25 | 0 |
FullTown01-v2 | 88% | 22/25 | 0 |
FullTown02-v1 | 80% | 20/25 | 0 |
FullTown02-v2 | 72% | 18/25 | 0 |
Simple RL PPO
Suite | Success rate | Total run | Seed |
---|---|---|---|
FullTown01-v1 | 80% | 20/25 | 0 |
FullTown02-v1 | 96% | 24/25 | 0 |