Datasets¶
Benchmark Datasets¶
DI-drive defines a unified benchmark dataset format that makes collecting and loading data easy for users.
We recommend saving datasets with BenchmarkDatasetSaver.
It automatically creates the folders and saves all sensor data and measurements into the dataset in the expected form.
General structure¶
A dataset directory should follow the structure below. The following sections explain each of its components.
<dataset_name>
│ dataset_metadata.json
│
└───episode_00000
│ │ episode_metadata.json
│ │ <Camera1_name>_00000.png
│ │ ...
│ │ <Camera2_name>_00000.png
│ │ ...
│ │ <Lidar_name>_00000.png
│ │ ...
│ │ measurements_00000.lmdb
│ │ ...
└───episode_00001
│ ...
│
...
A dataset contains dataset metadata, episode metadata, sensor data and measurements.
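To make the layout concrete, here is a minimal sketch that walks a dataset directory and groups each episode's files by kind. It relies only on the file names shown in the tree above; the function name `index_dataset` is hypothetical and not part of DI-drive.

```python
from pathlib import Path


def index_dataset(root):
    """Group the files of every episode folder by kind.

    Hypothetical helper: it only assumes the naming shown in the
    directory tree (episode_*, *.png, measurements_*.lmdb).
    """
    root = Path(root)
    index = {}
    for episode_dir in sorted(root.glob("episode_*")):
        if not episode_dir.is_dir():
            continue  # ignore stray files that happen to match the pattern
        index[episode_dir.name] = {
            "metadata": episode_dir / "episode_metadata.json",
            "images": sorted(p.name for p in episode_dir.glob("*.png")),
            "measurements": sorted(
                p.name for p in episode_dir.glob("measurements_*.lmdb")
            ),
        }
    return index
```

Calling `index_dataset("<dataset_name>")` returns a dictionary keyed by episode folder name, which is a convenient starting point for checking that a collected dataset is complete.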
Dataset metadata¶
Each dataset contains a metadata file with information provided by the user. It may include the following:
Number of episodes
Collected suite
Observation image types and names
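As an illustration of what such a file might hold, the sketch below round-trips the three items above through JSON. The field names (`episode_count`, `suite`, `obs_types`) are assumptions made for this example; the actual keys written by DI-drive may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetMetadata:
    # All field names here are hypothetical, chosen only to mirror the list above.
    episode_count: int                              # number of episodes
    suite: str                                      # suite the data was collected on
    obs_types: dict = field(default_factory=dict)   # observation name -> image type

    def dumps(self):
        """Serialize to the JSON text that would go into dataset_metadata.json."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def loads(cls, text):
        """Rebuild the metadata object from JSON text."""
        return cls(**json.loads(text))
```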
Episode metadata¶
Each episode is stored in its own folder. For each collected episode, a JSON file is generated with the following general information:
Town map name.
Start and end waypoint indexes.
Number of Pedestrians: the total number of spawned pedestrians.
Number of Vehicles: the total number of spawned vehicles.
Spawn seed for pedestrians and vehicles: the random seed used for the CARLA object spawning process.
Weather: the weather of the episode.
Each episode lasts 1 to 5 minutes and is divided into simulation steps of 100 ms. For each step, we store two categories of data: sensor data, stored as PNG images, and measurement data, stored in .lmdb files.
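The episode length and step size together fix how many frames to expect per episode. A small sketch of that arithmetic (the function name is hypothetical):

```python
STEP_MS = 100  # simulation step length stated above


def steps_in_episode(minutes):
    """Number of 100 ms simulation steps in an episode of the given length."""
    return int(minutes * 60_000) // STEP_MS


print(steps_in_episode(1))  # 600
print(steps_in_episode(5))  # 3000
```

An episode therefore holds between 600 and 3000 steps, and each sensor contributes one file per step.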
Sensor data¶
All collected images are stored as PNG files. Each file name consists of the sensor's tag in the observation configuration and the frame number.
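The naming rule can be sketched as a pair of helpers that build and parse such file names. The five-digit zero padding is inferred from the `_00000` suffix in the directory tree, and both function names are hypothetical.

```python
import re

# Greedy tag match so that tags containing underscores still parse correctly.
_SENSOR_FILE_RE = re.compile(r"^(?P<tag>.+)_(?P<frame>\d+)\.png$")


def sensor_filename(tag, frame):
    """Build '<tag>_<frame>.png' with the frame number zero-padded to 5 digits."""
    return f"{tag}_{frame:05d}.png"


def parse_sensor_filename(name):
    """Split a sensor file name back into (tag, frame number)."""
    match = _SENSOR_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"not a sensor image file name: {name!r}")
    return match.group("tag"), int(match.group("frame"))
```

For example, `sensor_filename("rgb", 12)` gives `rgb_00012.png`, and parsing that name returns `("rgb", 12)`.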
Measurements¶
Measurements hold the numeric data collected at each simulation step. Each measurement arranges its contents in a fixed order and stores them in a .lmdb file. The contents are as follows:
tick (int)
timestamp (float)
forward_vector (2D)
acceleration (3D)
location (3D)
speed (float)
command (int)
steer (float)
throttle (float)
brake (float)
real_steer (float)
real_throttle (float)
real_brake (float)
tl_state (int)
tl_dis (float)
Their meanings are the same as in the observation returned by SimpleCarlaEnv.
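Because the fields follow a fixed order, a measurement can be unpacked by position. The sketch below assumes each measurement has already been read from the .lmdb file and decoded into a flat sequence of 20 numbers in the order listed above; that decoding step and the helper name `unpack_measurement` are assumptions, not DI-drive API.

```python
# (field name, number of values) in the fixed order listed above.
MEASUREMENT_LAYOUT = [
    ("tick", 1), ("timestamp", 1), ("forward_vector", 2),
    ("acceleration", 3), ("location", 3), ("speed", 1), ("command", 1),
    ("steer", 1), ("throttle", 1), ("brake", 1),
    ("real_steer", 1), ("real_throttle", 1), ("real_brake", 1),
    ("tl_state", 1), ("tl_dis", 1),
]
INT_FIELDS = {"tick", "command", "tl_state"}  # fields marked (int) in the list


def unpack_measurement(values):
    """Turn a flat sequence of 20 numbers into a dict of named fields."""
    values = list(values)
    expected = sum(size for _, size in MEASUREMENT_LAYOUT)
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    result, offset = {}, 0
    for name, size in MEASUREMENT_LAYOUT:
        chunk = values[offset:offset + size]
        offset += size
        if size > 1:
            result[name] = [float(v) for v in chunk]  # 2D / 3D vectors
        elif name in INT_FIELDS:
            result[name] = int(chunk[0])
        else:
            result[name] = float(chunk[0])
    return result
```

Keeping the layout in a single table means a change in field order or size only needs to be made in one place.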
Others¶
Users may add custom data to a dataset. Such data can be post-processed and stored under an ‘other’ key without affecting the measurements, so users are free to organize any additional information they need.