Middleware specification

File Structure and Naming

Middleware in DI-engine falls into two categories. The first, which we call function, is an atomic operation that does one thing in a few lines of code; the train middleware, for example, executes the training of the model. The second, which we call module, may combine multiple functions to perform more complex logic. This classification follows PyTorch’s nn and nn.functional.

Both kinds are middleware, and they are used in exactly the same way.

In the directory structure, a module is placed directly in the middleware directory and named after a noun; a function is placed in the middleware/functional directory and named after a verb or a noun.

Multiple middleware of the same type can be written in one file.

ding/
  framework/
    middleware/
      functional/collect.py # Function
      collector.py # Module

Class, Function, Parameter

When writing a function, functional-style code is recommended for its brevity; when writing a module, a class is recommended.
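A minimal sketch of the two styles follows. `Context`, `train`, and `Trainer` are hypothetical stand-ins written for illustration, not DI-engine's actual classes; the only thing assumed from the text is that both styles end up as a callable that takes ctx.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Context:
    # Hypothetical stand-in for the ctx object passed between middleware.
    train_iter: int = 0


def train(step: int = 1) -> Callable[[Context], None]:
    # Function style: the outer function receives the parameters,
    # and the returned closure is what runs on every iteration.
    def _train(ctx: Context) -> None:
        ctx.train_iter += step

    return _train


class Trainer:
    # Module style: __init__ receives the parameters,
    # and __call__ is what runs on every iteration.
    def __init__(self, step: int = 1) -> None:
        # A module may be built from one or more functions.
        self._train = train(step=step)

    def __call__(self, ctx: Context) -> None:
        self._train(ctx)


ctx = Context()
for middleware in (train(step=1), Trainer(step=2)):
    middleware(ctx)  # both styles are called the same way
```

Either object can be handed to whatever drives the loop, since both expose the same single-argument call.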

Passing explicitly named arguments is recommended for all functions; passing a dict as arguments is deprecated. When there are too many parameters, we recommend using TreeTensor.

Construction Function

Most middleware has two layers of methods. The outer layer, that is, the outer function of a function or the __init__ method of a module, receives the parameters and objects the middleware needs in order to run.

The inner layer, that is, the function returned by a function or the __call__ method of a module, is the procedure called repeatedly at runtime, and it accepts ctx as its only parameter.

It is recommended to instantiate objects outside the middleware and pass them in, rather than creating them inside it, so that the middleware stays stateless and procedural.
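The recommendation can be sketched as below. `CounterEnv`, `Context`, and `collect` are hypothetical names used only for this illustration; the point is that the object carrying state is built by the caller and passed in by name.

```python
from typing import Callable, List


class Context:
    # Hypothetical stand-in for ctx.
    def __init__(self) -> None:
        self.obs: List[int] = []


class CounterEnv:
    # Hypothetical environment; all state lives here, not in the middleware.
    def __init__(self) -> None:
        self.t = 0

    def step(self) -> int:
        self.t += 1
        return self.t


def collect(env: CounterEnv, n_step: int) -> Callable[[Context], None]:
    # The middleware receives the env; it does not construct it.
    def _collect(ctx: Context) -> None:
        ctx.obs = [env.step() for _ in range(n_step)]

    return _collect


env = CounterEnv()                      # instantiated outside the middleware
collector = collect(env=env, n_step=3)  # passed in as a named argument
ctx = Context()
collector(ctx)
```

Because the caller owns `env`, the same instance can be shared with other middleware or replaced in a test without touching `collect`.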

Runtime Function

When writing the function returned by a function or the __call__ method of a module, keep the following in mind:

  1. If the method contains an infinite loop, make sure it checks task.finish so that the loop can exit.

  2. task supports two execution modes, sequential and asynchronous. The data passed through ctx may not be produced at the same point in the two modes.

    The middleware should therefore check whether the data it needs is present, and ideally support both modes.

  3. Opening multiple processes inside a middleware is not recommended, because it can cause unforeseen problems when many objects have already been instantiated or when processes end up nested several levels deep. If you need multi-process parallelism, split the logic into multiple middleware and run them with the parallel capabilities of DI-engine.
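The first two points can be sketched as follows. `FakeTask` is a hypothetical stand-in that exposes only a `finish` flag, and `Context`, `consume`, and `train` are likewise illustrative names; none of this is DI-engine's own implementation.

```python
import queue
from typing import Callable, Optional


class FakeTask:
    # Hypothetical stand-in for task; only the finish flag is modelled.
    def __init__(self) -> None:
        self.finish = False


class Context:
    # Hypothetical stand-in for ctx.
    def __init__(self) -> None:
        self.train_data: Optional[list] = None
        self.train_iter = 0


def consume(task: FakeTask, inbox: "queue.Queue[list]") -> Callable[[Context], None]:
    def _consume(ctx: Context) -> None:
        while True:
            if task.finish:  # point 1: the loop checks task.finish to exit
                break
            try:
                ctx.train_data = inbox.get(timeout=0.01)
                break
            except queue.Empty:
                continue

    return _consume


def train() -> Callable[[Context], None]:
    def _train(ctx: Context) -> None:
        if ctx.train_data is None:  # point 2: data may not have arrived yet
            return
        ctx.train_iter += 1
        ctx.train_data = None

    return _train


task, inbox, ctx = FakeTask(), queue.Queue(), Context()
inbox.put([1, 2, 3])
consume(task, inbox)(ctx)
train()(ctx)             # data was available, so one training step runs
task.finish = True
consume(task, inbox)(ctx)
train()(ctx)             # loop exits at once; train skips the missing data
```

Checking for missing data inside `train` is what lets the same middleware run unchanged whether or not the data has been produced by the time it is called.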

Event Naming Convention

When using the event mechanism in DI-engine, events are named according to the following conventions:

  1. Events that broadcast data are named (emitting position)_(data name)[_(parameter name)_(parameter value)], for example: league_job_actor_0 (data broadcast from the league to actor 0, carrying a job).

  2. Events for remote invocation are named (receiving position)_(method name), for example: league_get_job (call league’s get_job function).
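As an illustration of the two naming patterns, the helpers below build event names from their parts. `broadcast_event` and `remote_call_event` are hypothetical helpers written for this sketch; they are not part of DI-engine.

```python
from typing import Optional


def broadcast_event(
    position: str,
    data: str,
    param: Optional[str] = None,
    value: Optional[object] = None,
) -> str:
    # (emitting position)_(data name)[_(parameter name)_(parameter value)]
    name = f"{position}_{data}"
    if param is not None:
        name += f"_{param}_{value}"
    return name


def remote_call_event(position: str, method: str) -> str:
    # (receiving position)_(method name)
    return f"{position}_{method}"


job_event = broadcast_event("league", "job", param="actor", value=0)
call_event = remote_call_event("league", "get_job")
```

Building names in one place like this keeps the emitting and listening sides from drifting apart on spelling.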