LoRA Variants
AdaLoRA
QLoRA
DoRA
The main idea of DoRA (Weight-Decomposed Low-Rank Adaptation) is to decompose the pretrained weight into a magnitude component and a direction component, and to use LoRA to fine-tune the direction matrix.
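Written out in the DoRA paper's notation, with $W_0$ the frozen pretrained weight and $BA$ the low-rank update, the adapted weight is

$$
W' = m \cdot \frac{W_0 + BA}{\lVert W_0 + BA \rVert_c},
$$

where $\lVert \cdot \rVert_c$ is the column-wise L2 norm and $m$ is a learnable magnitude vector initialized to $\lVert W_0 \rVert_c$. Only $m$, $A$, and $B$ are trained; $W_0$ stays frozen.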

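The class below relies on a LoRALayer module that is not shown in this excerpt. A minimal sketch consistent with the attributes used here (A, B, and alpha), based on common LoRA reference implementations, might look like this; the initialization details are an assumption:

```python
import torch
import torch.nn as nn

class LoRALayer(nn.Module):
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        # A is initialized randomly, B to zeros, so A @ B starts at zero
        std_dev = 1 / torch.sqrt(torch.tensor(rank).float())
        self.A = nn.Parameter(torch.randn(in_dim, rank) * std_dev)
        self.B = nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x):
        # Scaled low-rank update applied to the input
        return self.alpha * (x @ self.A @ self.B)
```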
```python
import torch.nn as nn
import torch.nn.functional as F


class LinearWithDoRAMerged(nn.Module):
    def __init__(self, linear, rank, alpha):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(
            linear.in_features, linear.out_features, rank, alpha
        )
        # Magnitude vector m, initialized to the column-wise L2 norm
        # of the pretrained weight matrix
        self.m = nn.Parameter(
            self.linear.weight.norm(p=2, dim=0, keepdim=True))

    # Code loosely inspired by
    # https://github.com/catid/dora/blob/main/dora.py
    def forward(self, x):
        # Low-rank update A @ B, shape (in_features, out_features)
        lora = self.lora.A @ self.lora.B
        # Merge the scaled LoRA update into the pretrained weight
        numerator = self.linear.weight + self.lora.alpha * lora.T
        # Column-wise L2 norm; dividing by it yields the directional component
        denominator = numerator.norm(p=2, dim=0, keepdim=True)
        directional_component = numerator / denominator
        # Rescale the direction with the learnable magnitude
        new_weight = self.m * directional_component
        return F.linear(x, new_weight, self.linear.bias)
```
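A quick sanity check of the wrapper, with illustrative layer sizes and rank/alpha values: because B is initialized to zeros, the merged DoRA weight initially equals the pretrained weight, so the wrapped layer reproduces the original output before any training.

```python
torch.manual_seed(0)

base = nn.Linear(10, 2)
dora = LinearWithDoRAMerged(base, rank=4, alpha=8)

x = torch.randn(1, 10)
print(torch.allclose(base(x), dora(x)))  # True: A @ B is zero at init
```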
LoRA tends to scale the magnitude and direction of the weight update up or down together, whereas DoRA, by decomposing the pretrained weight matrix into magnitude and direction and adapting them separately, can produce updates that behave more like full fine-tuning.
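In practice this means only the magnitude vector and the LoRA matrices are updated during fine-tuning, while the pretrained linear weights stay frozen. A minimal sketch of how that could be set up for the class above (the helper name is ours, not from the original code):

```python
def mark_only_dora_as_trainable(model):
    # Freeze everything, then re-enable only the DoRA-specific parameters
    for param in model.parameters():
        param.requires_grad = False
    for module in model.modules():
        if isinstance(module, LinearWithDoRAMerged):
            module.m.requires_grad = True
            module.lora.A.requires_grad = True
            module.lora.B.requires_grad = True
```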