policy

BaseCarlaPolicy

class core.policy.base_carla_policy.BaseCarlaPolicy(cfg: dict, model: Any | None = None, enable_field: List[str] | None = None)

Base class for Carla policies that interact with environments. The policy follows the standard DI-engine form: it has several modes that change how it runs, and it can interact with several environments controlled by an EnvManager. It is designed to support Supervised Learning, Reinforcement Learning and other methods, as well as expert policies, each of which may have different interfaces and modes.

By default, it has 3 modes: learn, collect and eval. To put the policy into a specific mode, access policy.xxx_mode. All supported interfaces of that mode can then be defined in the _interface_xxx or _interfaces methods. For example, calling policy.collect_mode.forward is equivalent to calling policy._forward_collect. Some mode-specific interfaces may be defined separately by the user.
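The mode dispatch described above can be sketched as follows. This is a minimal illustration, not DI-engine's actual implementation: ToyPolicy, its constructor logic and the tuple outputs are all hypothetical, and only the naming pattern (policy.xxx_mode.forward aliasing policy._forward_xxx) comes from the text.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Optional


class ToyPolicy:
    """Hypothetical policy used only to illustrate mode dispatch."""

    def __init__(self, enable_field: Optional[List[str]] = None) -> None:
        # Assumption: with no enable_field, all three default modes are enabled.
        modes = enable_field or ["learn", "collect", "eval"]
        for mode in modes:
            # Expose `policy.<mode>_mode.forward` as an alias of
            # `policy._forward_<mode>`.
            interface = SimpleNamespace(forward=getattr(self, f"_forward_{mode}"))
            setattr(self, f"{mode}_mode", interface)

    def _forward_learn(self, data: Dict[int, Any]) -> Dict[int, Any]:
        return {env_id: ("learn", obs) for env_id, obs in data.items()}

    def _forward_collect(self, data: Dict[int, Any]) -> Dict[int, Any]:
        return {env_id: ("collect", obs) for env_id, obs in data.items()}

    def _forward_eval(self, data: Dict[int, Any]) -> Dict[int, Any]:
        return {env_id: ("eval", obs) for env_id, obs in data.items()}


policy = ToyPolicy(enable_field=["collect", "eval"])
obs = {0: "obs_a", 1: "obs_b"}
# Both calls go through the same underlying method.
same = policy.collect_mode.forward(obs) == policy._forward_collect(obs)
```

In this sketch a mode that is not listed in enable_field simply has no xxx_mode attribute, so accessing it raises AttributeError.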

Interfaces:

init, forward, reset, process_transition, get_train_sample

AutoPIDPolicy

class core.policy.AutoPIDPolicy(cfg: Dict)

Autonomous driving policy that follows the target waypoint in env observations. It keeps one vehicle PID controller per env, associated with that env's id. On every update, each env must pass its correct env id so that the right PID controller is used, and the controller should be reset when a new episode starts.

The policy has 2 modes: collect and eval. Their interfaces operate in the same way. The only difference is that in collect mode the forward method may add noise to the steering if this is set in the config.

Arguments:
  • cfg (Dict): Config Dict.

Interfaces:

reset, forward
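A minimal sketch of the per-env bookkeeping described above, under stated assumptions: ToyPIController, ToyWaypointPolicy, the heading_error observation key, the steer output key and the noise parameter are all hypothetical and are not the library's API. The controller here is a simple PI law rather than a full vehicle PID, and treating data_id=None as "reset every known env" is an assumption as well.

```python
import random
from typing import Dict, List, Optional


class ToyPIController:
    """Hypothetical stateful controller; the integral term is why reset matters."""

    def __init__(self, kp: float = 0.5, ki: float = 0.1) -> None:
        self.kp, self.ki = kp, ki
        self.integral = 0.0

    def step(self, error: float) -> float:
        self.integral += error
        steer = self.kp * error + self.ki * self.integral
        return max(-1.0, min(1.0, steer))


class ToyWaypointPolicy:
    """Keeps one controller per env id (illustrative only)."""

    def __init__(self, steer_noise_std: float = 0.0, seed: int = 0) -> None:
        self._controllers: Dict[int, ToyPIController] = {}
        self._steer_noise_std = steer_noise_std
        self._rng = random.Random(seed)

    def reset(self, data_id: Optional[List[int]] = None) -> None:
        # Assumption: None means "reset every env seen so far".
        ids = list(self._controllers) if data_id is None else data_id
        for env_id in ids:
            self._controllers[env_id] = ToyPIController()

    def _forward(self, data: Dict[int, dict], add_noise: bool) -> Dict[int, dict]:
        actions = {}
        for env_id, obs in data.items():
            # Each env id always maps to its own controller state.
            controller = self._controllers.setdefault(env_id, ToyPIController())
            steer = controller.step(obs["heading_error"])
            if add_noise and self._steer_noise_std > 0:
                steer += self._rng.gauss(0.0, self._steer_noise_std)
                steer = max(-1.0, min(1.0, steer))
            actions[env_id] = {"steer": steer}
        return actions

    def forward_collect(self, data: Dict[int, dict]) -> Dict[int, dict]:
        return self._forward(data, add_noise=True)

    def forward_eval(self, data: Dict[int, dict]) -> Dict[int, dict]:
        return self._forward(data, add_noise=False)


policy = ToyWaypointPolicy()
obs = {0: {"heading_error": 0.2}, 3: {"heading_error": -0.4}}
first = policy.forward_eval(obs)   # fresh controllers
second = policy.forward_eval(obs)  # integral terms have accumulated
policy.reset([0])                  # new episode for env 0 only
third = policy.forward_eval(obs)
```

After the reset, env 0 produces the same output as on its first step, while env 3 keeps accumulating state. Passing the wrong env id would silently apply another env's controller history.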

_forward_collect(data: Dict) -> Dict

Run a forward pass to get the control signals in collect mode.

Arguments:
  • data (Dict): Input dict with env ids as keys and the related observations as values.

Returns:

Dict: Output dict with env ids as keys and the control dict of each provided env as values.

_forward_eval(data: Dict) -> Dict

Run a forward pass to get the control signals in eval mode.

Arguments:
  • data (Dict): Input dict with env ids as keys and the related observations as values.

Returns:

Dict: Output dict with env ids as keys and the control dict of each provided env as values.

_reset_collect(data_id: List[int] | None = None) -> None

Reset the policy in collect mode. It resets the controllers of the provided env ids. Noise will be added to the controllers according to the config.

Arguments:
  • data_id (List[int], optional): List of env ids to reset. Defaults to None.

_reset_eval(data_id: List[int] | None = None) -> None

Reset the policy in eval mode. It resets the controllers of the provided env ids.

Arguments:
  • data_id (List[int], optional): List of env ids to reset. Defaults to None.

AutoMPCPolicy

class core.policy.AutoMPCPolicy(cfg: Dict)

Autonomous driving policy that follows the target waypoint list in env observations. It keeps one MPC controller per env, associated with that env's id. On every update, each env must pass its correct env id so that the right MPC controller is used, and the controller should be reset when a new episode starts.

The policy has 2 modes: collect and eval. Their interfaces operate in the same way. The only difference is that in collect mode the forward method may add noise to the steering if this is set in the config.

Arguments:
  • cfg (Dict): Config Dict.

Interfaces:

reset, forward
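To give a feel for what following a waypoint list with model predictive control involves, here is a deliberately simplified sketch. It is not the controller this policy uses: it swaps in a brute-force search over a few constant steering angles, rolled out with a kinematic bicycle model, in place of a real optimizer. The function names, vehicle parameters and cost function are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
State = Tuple[float, float, float]  # x, y, yaw (radians)


def rollout(state: State, steer: float, speed: float = 5.0, dt: float = 0.1,
            wheelbase: float = 2.5, steps: int = 10) -> List[Point]:
    """Predict future positions for a constant steering angle (bicycle model)."""
    x, y, yaw = state
    path = []
    for _ in range(steps):
        x += speed * math.cos(yaw) * dt
        y += speed * math.sin(yaw) * dt
        yaw += speed / wheelbase * math.tan(steer) * dt
        path.append((x, y))
    return path


def tracking_cost(path: List[Point], waypoints: List[Point]) -> float:
    """Sum of squared distances from each predicted point to its nearest waypoint."""
    return sum(
        min((px - wx) ** 2 + (py - wy) ** 2 for wx, wy in waypoints)
        for px, py in path
    )


def mpc_steer(state: State, waypoints: List[Point],
              candidates: Optional[List[float]] = None) -> float:
    """Pick the candidate steering angle whose predicted path tracks best."""
    if candidates is None:
        candidates = [i / 10 for i in range(-5, 6)]  # -0.5 .. 0.5 rad
    return min(candidates, key=lambda s: tracking_cost(rollout(state, s), waypoints))


start = (0.0, 0.0, 0.0)
straight = [(0.5 * k, 0.0) for k in range(1, 11)]
left_turn = [(0.5 * k, 0.02 * k * k) for k in range(1, 11)]
steer_straight = mpc_steer(start, straight)
steer_left = mpc_steer(start, left_turn)
```

In a receding-horizon loop only the first control of the best plan is applied before re-planning from the new state. A controller that keeps state between steps, such as a warm-started solution, is one reason it would need to be reset at the start of each episode.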

_forward_collect(data: Dict) -> Dict

Run a forward pass to get the control signals in collect mode.

Arguments:
  • data (Dict): Input dict with env ids as keys and the related observations as values.

Returns:

Dict: Output dict with env ids as keys and the control dict of each provided env as values.

_forward_eval(data: Dict) -> Dict

Run a forward pass to get the control signals in eval mode.

Arguments:
  • data (Dict): Input dict with env ids as keys and the related observations as values.

Returns:

Dict: Output dict with env ids as keys and the control dict of each provided env as values.

_reset_collect(data_id: List[int] | None = None) -> None

Reset the policy in collect mode. It resets the controllers of the provided env ids. Noise will be added to the controllers according to the config.

Arguments:
  • data_id (List[int], optional): List of env ids to reset. Defaults to None.

_reset_eval(data_id: List[int] | None = None) -> None

Reset the policy in eval mode. It resets the controllers of the provided env ids.

Arguments:
  • data_id (List[int], optional): List of env ids to reset. Defaults to None.

CILRSPolicy

class core.policy.CILRSPolicy(cfg: Dict, enable_field: List = ['eval', 'learn'])

CILRS driving policy. It has a CILRS NN model that can handle observations from several environments by collating the data into a batch. It has 2 modes: eval and learn. Learn mode computes all losses but does not back-propagate them. In eval mode, the model output is post-processed into a standard Carla control signal, and the policy can avoid stopping during the starting ticks.

Arguments:
  • cfg (Dict): Config Dict.

  • enable_field (List): Enabled policy fields. Defaults to ['eval', 'learn'].

Interfaces:

reset, forward
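The batching step mentioned above, collating per-env observations into one batch and mapping the batched output back to env ids, can be sketched in plain Python. The helper names, the observation keys (speed, speed_limit) and the toy stand-in for the network are hypothetical. A real policy would collate tensors and run the CILRS model instead.

```python
from typing import Dict, List, Tuple

Obs = Dict[str, float]
Batch = Dict[str, List[float]]


def collate(data: Dict[int, Obs]) -> Tuple[List[int], Batch]:
    """Turn {env_id: obs} into an ordered id list and a column-wise batch."""
    env_ids = sorted(data)
    keys = data[env_ids[0]].keys()
    batch = {key: [data[env_id][key] for env_id in env_ids] for key in keys}
    return env_ids, batch


def toy_model(batch: Batch) -> List[Dict[str, float]]:
    """Hypothetical stand-in for the network: one output row per batch row."""
    outputs = []
    for speed, limit in zip(batch["speed"], batch["speed_limit"]):
        too_fast = speed > limit
        outputs.append({
            "throttle": 0.0 if too_fast else 0.5,
            "brake": 1.0 if too_fast else 0.0,
        })
    return outputs


def forward_eval(data: Dict[int, Obs]) -> Dict[int, Dict[str, float]]:
    """Batch all envs, run the model once, then split results per env id."""
    if not data:
        return {}
    env_ids, batch = collate(data)
    outputs = toy_model(batch)
    # Row i of the output belongs to env_ids[i].
    return dict(zip(env_ids, outputs))


actions = forward_eval({
    4: {"speed": 3.0, "speed_limit": 8.0},
    1: {"speed": 9.0, "speed_limit": 8.0},
})
```

Keeping the id list alongside the batch is what lets the output rows be returned under the same env ids that were passed in, regardless of how many envs are active.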

LBCBirdviewPolicy

class core.policy.LBCBirdviewPolicy(cfg: dict, enable_field: List = ['eval', 'learn'])

LBC driving policy with bird's-eye view inputs. It has an LBC NN model that can handle observations from several environments by collating the data into a batch. Each environment has its own PID controller, which produces the final control signals. On every update, each env must pass its correct env id so that the right PID controller is used, and the controller should be reset when a new episode starts.

It has 2 modes: eval and learn. Learn mode computes all losses but does not back-propagate them. In eval mode, the model output is post-processed into a standard Carla control signal.

Arguments:
  • cfg (Dict): Config Dict.

  • enable_field (List): Enabled policy fields. Defaults to ['eval', 'learn'].

Interfaces:

reset, forward

_forward_eval(data: Dict) -> Dict[str, Any]

Run a forward pass to get the control signals in eval mode.

Arguments:
  • data (Dict): Input dict with env ids as keys and the related observations as values.

Returns:

Dict: Output dict with env ids as keys and the control and waypoints dict of each provided env as values.

_reset_eval(data_ids: List[int] | None = None) -> None

Reset the policy in eval mode. It switches the NN model to 'eval' mode and resets the controllers of the provided env ids.

Arguments:
  • data_ids (List[int], optional): List of env ids to reset. Defaults to None.

LBCImagePolicy

class core.policy.LBCImagePolicy(cfg: dict, enable_field: List = ['eval', 'learn'])

LBC driving policy with RGB image inputs. It has an LBC NN model that can handle observations from several environments by collating the data into a batch. Each environment has its own PID controller, which produces the final control signals. On every update, each env must pass its correct env id so that the right PID controller is used, and the controller should be reset when a new episode starts.

Arguments:
  • cfg (Dict): Config Dict.

Interfaces:

reset, forward

_forward_eval(data: Dict) -> Dict

Run a forward pass to get the control signals in eval mode.

Arguments:
  • data (Dict): Input dict with env ids as keys and the related observations as values.

Returns:

Dict: Output dict with env ids as keys and the control and waypoints dict of each provided env as values.

_reset_eval(data_ids: List[int] | None) -> None

Reset the policy in eval mode. It switches the NN model to 'eval' mode and resets the controllers of the provided env ids.

Arguments:
  • data_ids (List[int], optional): List of env ids to reset. Defaults to None.