Minor questions about the paper and code #1

os-hxfan · 2021-12-07T11:51:34Z

Thanks a lot for the interesting work!
I am really enjoying reading the paper and the code.
I have a few minor questions, and I would really appreciate any hints:

I notice that, in Section 5.1, Pixelfly is applied only to the projection step of attention and to the MLP, without sparsifying the attention matrix (score matrix), whereas in T2T-ViT, Pixelfly is applied only to the attention matrix, without sparsifying the MLP or the projection. Is there a reason for this? Also, are there any experimental results with Pixelfly applied to all layers (MLP and attention matrix)?
I saw that there are many options for /model/t2tattn_cfg with T2T-ViT, such as sblocal and performer. It seems that sblocal uses sparse + low rank. May I know which one I should choose if I want to use flat butterfly + low rank?
In the experiment folder under config, it seems that only the scripts for MLP-Mixer and T2T-ViT are provided. Do you have plans to release the scripts for the other experiments, such as ViT, GPT, etc.?

abhishektyaagi · 2024-01-22T16:40:55Z

@os-hxfan
Could you share whether you were able to figure out an answer to this question:

I saw there are many options for /model/t2tattn_cfg with T2T-Vit, such as sblocal, performer. It seems like sblocal uses sparse + low rank.
