
W2V2 LayerNorm location #1

Open
CaptainPrice2023 opened this issue Apr 1, 2023 · 1 comment

Comments

@CaptainPrice2023

Hi, thanks for sharing! I have a question about the adapter location in W2V2.

The W2V2 transformer encoder applies LayerNorm (LN) after attention. After adding an adapter, should the adapter computation be performed after the LN layer rather than before it?

hidden_states = self.dropout(hidden_states)
hidden_states = attn_residual + hidden_states   # residual connection after attention
# adapter branch is computed before layer_norm
if args.adapter:
    adapt_h = self.adapter(hidden_states)
hidden_states = self.layer_norm(hidden_states)
hidden_states = hidden_states + self.feed_forward(hidden_states)
# adapter output is added back after the feed-forward block
if args.adapter:
    hidden_states = hidden_states + adapt_h
hidden_states = self.final_layer_norm(hidden_states)
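
For concreteness, the placement I am asking about would look roughly like this (same variable names as the snippet above; just a sketch of the alternative, not tested code):

# Alternative sketch: adapter branch computed on the post-LN hidden states
hidden_states = self.dropout(hidden_states)
hidden_states = attn_residual + hidden_states
hidden_states = self.layer_norm(hidden_states)
if args.adapter:
    adapt_h = self.adapter(hidden_states)    # adapter now sees layer-normed features
hidden_states = hidden_states + self.feed_forward(hidden_states)
if args.adapter:
    hidden_states = hidden_states + adapt_h
hidden_states = self.final_layer_norm(hidden_states)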

@wngh1187
Owner

wngh1187 commented Apr 4, 2023

Hi. First of all, thank you very much for your interest in our research.

We positioned the adapter at the end of the attention block, so for W2V2 the order is: attention - adapter - layer norm - MLP - layer norm. The figures in the paper are based on AST, which is why they differ from the W2V2 implementation.

However, we haven't tried placing the adapter after the LN, so we don't know how it would perform.
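
To make the ordering concrete, below is a minimal, self-contained sketch of a post-LN W2V2-style encoder layer with the adapter at the end of the attention block (attention - adapter - layer norm - MLP - layer norm). The class names, dimensions, and the bottleneck Adapter module are illustrative assumptions, not the repository's exact code.

import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, plus a residual connection."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


class W2V2EncoderLayerWithAdapter(nn.Module):
    """Post-LN encoder layer: attention -> adapter -> layer norm -> MLP -> final layer norm."""

    def __init__(self, dim: int = 768, heads: int = 12, ffn_dim: int = 3072, dropout: float = 0.1):
        super().__init__()
        self.attention = nn.MultiheadAttention(dim, heads, dropout=dropout, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.adapter = Adapter(dim)
        self.layer_norm = nn.LayerNorm(dim)
        self.feed_forward = nn.Sequential(
            nn.Linear(dim, ffn_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(ffn_dim, dim),
            nn.Dropout(dropout),
        )
        self.final_layer_norm = nn.LayerNorm(dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # attention block with residual connection
        attn_residual = hidden_states
        hidden_states, _ = self.attention(hidden_states, hidden_states, hidden_states)
        hidden_states = self.dropout(hidden_states)
        hidden_states = attn_residual + hidden_states

        # adapter sits at the end of the attention block, before the first layer norm
        hidden_states = self.adapter(hidden_states)
        hidden_states = self.layer_norm(hidden_states)

        # feed-forward (MLP) block with residual, then the final layer norm
        hidden_states = hidden_states + self.feed_forward(hidden_states)
        hidden_states = self.final_layer_norm(hidden_states)
        return hidden_states

For example, W2V2EncoderLayerWithAdapter()(torch.randn(2, 50, 768)) returns a tensor of the same shape.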
