lightrft.utils.timer

Timer Module for Performance Measurement

This module provides a comprehensive timing utility for measuring execution time of different code sections. It’s particularly useful for profiling machine learning training loops, evaluation processes, and any code that requires performance monitoring.

The module includes a Timer class that can be used both as a context manager and through class methods for flexible timing operations. It supports CUDA synchronization for accurate GPU timing and provides step-wise statistics reporting for monitoring performance across training iterations.

Example:

# Using as context manager
with Timer("data_loading"):
    # Your data loading code here
    pass

# Using class methods
Timer.start("forward_pass")
# Your forward pass code
Timer.stop("forward_pass")

# Print statistics and reset
Timer.step()

Timer

class lightrft.utils.timer.Timer(name: str)[source]

A simple timer class for measuring the execution time of different code sections.

This class provides functionality to start, stop, and record the elapsed time for named timers. It also supports printing timing statistics at each step, which is particularly useful for monitoring training or evaluation loops.

The Timer class maintains global state across all instances, allowing you to collect timing data from multiple parts of your code and then view aggregated statistics. It automatically handles CUDA synchronization when available to ensure accurate timing measurements on GPU operations.

Attributes:
  • _timers (defaultdict(float)) – A dictionary to store the total elapsed time for each named timer.

  • _counts (defaultdict(int)) – A dictionary to store the number of times each named timer has been called.

  • _current_times (dict) – A dictionary to store the start time of the currently active timers.

  • _current_step (int) – A counter for the current step, used for printing step-wise statistics.

Example:

# Method 1: Using as context manager
with Timer("data_processing"):
    process_data()

# Method 2: Using start/stop methods
Timer.start("model_forward")
output = model(inputs)
Timer.stop("model_forward")

# Print statistics for current step
Timer.step()
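
The class-level state listed above can be pictured with a minimal sketch. This is an illustrative reimplementation, not the actual lightrft source; it omits the CUDA synchronization and distributed rank-0 gating that the real class is documented to perform:

```python
import time
from collections import defaultdict

class MiniTimer:
    """Minimal sketch of a timer with class-level (global) state."""

    _timers = defaultdict(float)   # name -> total elapsed seconds
    _counts = defaultdict(int)     # name -> number of completed timings
    _current_times = {}            # name -> start timestamp of running timers
    _current_step = 0              # step counter for step-wise reporting

    def __init__(self, name: str):
        self.name = name

    def __enter__(self):
        MiniTimer.start(self.name)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        MiniTimer.stop(self.name)
        return False  # never suppress exceptions

    @classmethod
    def start(cls, name: str):
        # Restarting an already-running timer overwrites its start time.
        cls._current_times[name] = time.perf_counter()

    @classmethod
    def stop(cls, name: str):
        if name not in cls._current_times:
            print(f"Warning: timer '{name}' was never started")
            return
        cls._timers[name] += time.perf_counter() - cls._current_times.pop(name)
        cls._counts[name] += 1

    @classmethod
    def step(cls):
        # Print total/average/count per timer, then reset for the next step.
        for name, total in cls._timers.items():
            n = cls._counts[name]
            print(f"step {cls._current_step} | {name}: "
                  f"total={total:.4f}s avg={total / n:.4f}s calls={n}")
        cls._timers.clear()
        cls._counts.clear()
        cls._current_times.clear()
        cls._current_step += 1
```

Because the dictionaries live on the class rather than on instances, every `with` block and every classmethod call feeds the same accumulators, which is what allows statistics gathered across the codebase to be reported together by a single `step()` call.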

__enter__()[source]

Starts the timer when entering the context.

This method is called when using the Timer as a context manager with the ‘with’ statement. It starts timing for the timer name specified during initialization.

Returns:

The Timer instance itself.

Return type:

Timer

Example:

with Timer("my_operation") as timer:
    # Code to time
    pass

__exit__(exc_type, exc_val, exc_tb)[source]

Stops the timer when exiting the context.

This method is called when exiting the context manager, regardless of whether an exception occurred. It stops timing for the timer name and records the elapsed time.

Parameters:
  • exc_type – The exception type if an exception was raised, None otherwise.

  • exc_val – The exception value if an exception was raised, None otherwise.

  • exc_tb – The exception traceback if an exception was raised, None otherwise.

Returns:

False to indicate that exceptions should not be suppressed.

Return type:

bool
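
Because __exit__ returns False, an exception raised inside the with-block still propagates to the caller after the elapsed time has been recorded. A self-contained stand-in (not the lightrft class itself) demonstrating this contract:

```python
import time

class StopwatchDemo:
    """Stand-in illustrating the __exit__ contract: record time, return False."""

    def __init__(self, name):
        self.name = name
        self.elapsed = None

    def __enter__(self):
        self._t0 = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # The elapsed time is recorded even when the block raised.
        self.elapsed = time.perf_counter() - self._t0
        return False  # do not suppress the exception

try:
    with StopwatchDemo("risky") as t:
        raise ValueError("boom")
except ValueError:
    pass  # the exception propagated because __exit__ returned False

# t.elapsed is set despite the error, so no timing data is lost.
```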

__init__(name: str)[source]

Initializes a Timer instance.

Parameters:

name (str) – The name of this specific timer instance. This name will be used when entering and exiting the context manager.

Example:

timer = Timer("my_operation")
with timer:
    # Code to time
    pass

classmethod start(name: str)[source]

Starts a timer with the given name.

If a timer with the same name is already running, its start time will be overwritten. This method automatically handles CUDA synchronization if CUDA is available to ensure accurate timing measurements.

Parameters:

name (str) – The name of the timer to start.

Example:

Timer.start("data_loading")
# Your code here
Timer.stop("data_loading")
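
The CUDA synchronization mentioned above matters because CUDA kernels launch asynchronously: reading the clock without synchronizing would measure only the launch overhead, not the kernel itself. A sketch of such a guard (the helper name is hypothetical, and the real Timer performs this internally):

```python
def maybe_cuda_synchronize():
    """Wait for pending GPU kernels before reading the clock.

    Hypothetical helper: falls back to a no-op when torch is not
    installed or no CUDA device is available.
    """
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # block until queued kernels finish
    except ImportError:
        pass  # CPU-only environment: nothing to flush
```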

classmethod step()[source]

Prints the timing statistics for the current step and resets the timers.

If distributed training is used, it only prints the statistics on the main process (rank 0). The statistics include the total time, average time, and number of calls for each timer recorded since the last step. After printing, all timer data is cleared to start fresh for the next step.

This method is particularly useful in training loops where you want to monitor performance on a per-epoch or per-batch basis.

Example:

for epoch in range(num_epochs):
    with Timer("epoch"):
        for batch in dataloader:
            with Timer("forward"):
                output = model(batch)
                loss = criterion(output, targets)
            with Timer("backward"):
                loss.backward()
    Timer.step()  # Print and reset timers for this epoch
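
The rank-0 gating described above could be expressed with a helper along these lines (a hypothetical sketch assuming torch.distributed, not the actual lightrft implementation):

```python
def is_main_process():
    """Return True on rank 0, or when not running distributed.

    Hypothetical helper mirroring the gating that Timer.step() is
    documented to perform, so statistics print only once per step.
    """
    try:
        import torch.distributed as dist
    except ImportError:
        return True  # torch absent: single-process by definition
    if not dist.is_available() or not dist.is_initialized():
        return True  # no process group: treat as the main process
    return dist.get_rank() == 0
```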

classmethod stop(name: str)[source]

Stops a timer with the given name and records the elapsed time.

The elapsed time is added to the total time for this timer, and the call count is incremented. If the timer was not started, a warning message will be printed.

Parameters:

name (str) – The name of the timer to stop.

Example:

Timer.start("computation")
result = heavy_computation()
Timer.stop("computation")