import math

import torch
from torch import nn


class PLoRA(nn.Linear):

    def __init__(self,
                 in_features: int,
                 out_features: int,
                 bias: bool = True,
                 device=None,
                 dtype=None,
                 lora_r=8,
                 lora_alpha=16,
                 lora_dropout=0.05,
                 lora_len=0,
                 **kwargs) -> None:
        super().__init__(in_features, out_features, bias, device, dtype)
        self.lora_r = lora_r
        self.lora_alpha = lora_alpha
        self.lora_len = lora_len
        if lora_dropout > 0.:
            self.lora_dropout = nn.Dropout(p=lora_dropout)
        else:
            self.lora_dropout = lambda x: x
        self.lora_scaling = self.lora_alpha / self.lora_r

        # Low-rank adapter: A projects down to lora_r, B projects back up.
        self.Plora_A = nn.Linear(
            in_features, self.lora_r, bias=False, device=device, dtype=dtype)
        self.Plora_B = nn.Linear(
            self.lora_r, out_features, bias=False, device=device, dtype=dtype)

        self.reset_parameters()

    def reset_parameters(self):
        # nn.Linear.__init__ also calls this, before Plora_A/Plora_B exist,
        # hence the hasattr guard. (The snippet as posted checked for
        # 'lora_A', an attribute that is never created, so the init was
        # silently skipped; corrected to the actual attribute names here.)
        super().reset_parameters()
        if hasattr(self, 'Plora_A'):
            # Initialize A the same way as the default for nn.Linear and B to zero,
            # so the adapter starts as a no-op.
            nn.init.kaiming_uniform_(self.Plora_A.weight, a=math.sqrt(5))
            nn.init.zeros_(self.Plora_B.weight)

    def forward(self, x, im_mask=None):
        res = super().forward(x)
        if im_mask is not None:
            if torch.sum(im_mask) > 0:
                # Apply the LoRA branch only to the masked (image) positions.
                part_x = x[im_mask]
                res[im_mask] += self.Plora_B(
                    self.Plora_A(
                        self.lora_dropout(part_x))) * self.lora_scaling
            else:
                # Dummy pass through the adapter (multiplied by 0) so its
                # parameters stay part of the autograd graph.
                part_x = x[:, :1]
                res[:, :1] += self.Plora_B(
                    self.Plora_A(self.lora_dropout(part_x))) * 0
        return res
I'm quite unfamiliar with PyTorch, though some parts of it appear elegant to use. Take `part_x = x[im_mask]`:
`x` is the input tensor.
`im_mask` is a tensor of True/False values.
`part_x` is a new tensor containing only the elements of `x` where the mask is True (here, a contiguous segment in the middle).
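For anyone equally new to PyTorch, here is a minimal sketch of the boolean-indexing semantics described above (toy values, nothing from the actual model): indexing with a mask gathers the True positions into a fresh tensor, and assigning through the same mask scatters values back to those positions.

```python
import torch

x = torch.tensor([10., 20., 30., 40.])
im_mask = torch.tensor([False, True, True, False])

# Gather: keeps only the positions where the mask is True.
part_x = x[im_mask]
print(part_x)        # tensor([20., 30.])

# Scatter: writes back through the mask to the same positions.
x[im_mask] += 1.0
print(x)             # tensor([10., 21., 31., 40.])
```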
Question (mask):
What is the best way to implement this in ggml, especially inside a graph?
If I know the positions beforehand, I suppose I can use a view with an offset.
But is there a clean way to produce a tensor based on a boolean mask?
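One observation that may help: when the True run is contiguous (image tokens usually are), the mask carries no more information than an (offset, length) pair, which is exactly what a ggml view takes (e.g. `ggml_view_1d`; if I read the headers correctly, the offset there is in bytes). For scattered positions, `ggml_get_rows` with an index tensor looks like the closest thing to a gather, though I haven't verified that path. A PyTorch sketch of the contiguous-case equivalence, with made-up values:

```python
import torch

x = torch.arange(8, dtype=torch.float32)
im_mask = torch.tensor(
    [False, False, True, True, True, False, False, False])

# A contiguous True run reduces to an (offset, length) pair.
idx = im_mask.nonzero().flatten()
offset, length = idx[0].item(), idx.numel()

by_mask = x[im_mask]
by_view = x.narrow(0, offset, length)  # shares storage, like a ggml view
assert torch.equal(by_mask, by_view)
```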
Question (dropout):
Does anyone know what dropout does to the tensor at inference time? Is it just a scale by 0.05?
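On the dropout point: `nn.Dropout` is a no-op in eval mode, so it can simply be dropped for inference. The scaling only happens during training, where surviving elements are multiplied by 1/(1-p), not by p. A quick demonstration:

```python
import torch
from torch import nn

drop = nn.Dropout(p=0.05)
x = torch.ones(4)

drop.train()
print(drop(x))   # random: each entry is 0 or 1/(1-0.05), depending on the draw

drop.eval()
print(drop(x))   # identity at inference: tensor([1., 1., 1., 1.])
```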
Question (LoRA):
`res[im_mask] += ...`
This is a double matmul through the two LoRA tensors plus scaling, and the result is then added into the masked part of the tensor?
I assume copying a separately created tensor into a "masked view" is the right solution here again?
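That reading matches mine: the delta can be computed out-of-place and then added back through the mask (or, in ggml terms, accumulated into a view of the output). A sketch with toy values showing the two formulations agree, using `index_put` as the explicit scatter (names and numbers here are illustrative, not from the model):

```python
import torch

res = torch.zeros(6)
mask = torch.tensor([False, True, True, True, False, False])
delta = torch.tensor([1., 2., 3.])  # stands in for B(A(x)) * scaling

# In-place masked accumulate, as in the PyTorch forward():
a = res.clone()
a[mask] += delta

# Equivalent: build the updated values separately, then scatter them
# back through the mask -- the "copy into a masked view" approach.
b = res.clone()
b = b.index_put((mask,), b[mask] + delta)

assert torch.equal(a, b)
```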
The whole class looks so simple, yet I struggle to replicate it.
In general it would be nice to have documentation for ggml, even if it just gave a short explanation of every available function/operation and maybe its Python equivalent.