lightrft.models.monkey_patch.apply¶
This module provides monkey patching for the attention mechanisms in LLaMA and Qwen2 models: it replaces the original attention forward methods with custom implementations, for example to improve performance or change behavior.
The module supports patching both the LlamaAttention and Qwen2Attention classes by replacing their forward methods with custom implementations defined in separate modules.
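The underlying technique can be sketched as follows. This is a minimal, self-contained illustration only; `DummyAttention`, `custom_forward`, and `apply_patch` are hypothetical stand-ins for `LlamaAttention`/`Qwen2Attention` and the custom forward implementations this module installs:

```python
class DummyAttention:
    """Hypothetical stand-in for an attention class such as LlamaAttention."""

    def forward(self, x):
        return x  # original behavior


def custom_forward(self, x):
    """Hypothetical replacement, standing in for llama_attn_forward."""
    return x * 2  # modified behavior


def apply_patch():
    # Replace the unbound forward on the class itself, so every instance
    # (existing or future) resolves to the custom implementation.
    DummyAttention.forward = custom_forward


apply_patch()
attn = DummyAttention()
print(attn.forward(3))  # → 6
```

Because the replacement happens at class level, calling the patch function once is sufficient; no per-instance wiring is needed.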
- lightrft.models.monkey_patch.apply.apply_monkey_patch_to_llama()[source]¶
Apply a monkey patch to the LlamaAttention class by replacing its forward method.
This function replaces the original forward method of LlamaAttention with the custom implementation defined in llama_attn_forward. This can be used to modify the attention mechanism’s behavior or improve its performance.
- Returns:
None
- lightrft.models.monkey_patch.apply.apply_monkey_patch_to_qwen2()[source]¶
Apply a monkey patch to the Qwen2Attention class by replacing its forward method.
This function replaces the original forward method of Qwen2Attention with the custom implementation defined in qwen2_attn_forward. This can be used to modify the attention mechanism’s behavior or improve its performance.
- Returns:
None
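One property of class-level patching worth noting: because attribute lookup on an instance falls through to the class, instances constructed *before* the patch is applied also pick up the new forward. A minimal sketch (the `Attention` class and method names here are hypothetical, not the real transformers classes):

```python
class Attention:
    """Hypothetical stand-in for LlamaAttention / Qwen2Attention."""

    def forward(self, hidden_states):
        return "original"


def patched_forward(self, hidden_states):
    return "patched"


# Instance constructed before the patch is applied.
pre_patch = Attention()

# Patch at class level, analogous to what apply_monkey_patch_to_llama()
# and apply_monkey_patch_to_qwen2() do for their respective classes.
Attention.forward = patched_forward

# The pre-existing instance now dispatches to the custom forward as well,
# since method lookup goes through the class.
print(pre_patch.forward(None))  # → patched
```

This means the ordering of model construction and patch application is flexible, though applying the patch before building the model keeps behavior easiest to reason about.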