As I understand it, FLOPs are usually calculated for a complete model, but I am trying to compare the computational cost of only the LSH attention module of Reformer by feeding it random input vectors. The module switches between LSH-based attention and full dot-product attention through the use_full_attn flag (use_full_attn=False or use_full_attn=True).
The problem is that, whatever input size I set for qk and v, the reported number of FLOPs is the same in both modes.
Setting use_full_attn=False or use_full_attn=True does switch the module between LSH-based attention and full attention; I have verified this in the debug mode of the Spyder IDE.
Am I missing something?
How can I verify this? I would be grateful if someone can help me.
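Code (from the Reformer GitHub repository):
import torch
from reformer_pytorch import LSHSelfAttention

# Run on the GPU when one is available, otherwise on the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = LSHSelfAttention(
    dim = 128,
    heads = 8,
    bucket_size = 64,
    n_hashes = 16,
    causal = True,
    use_full_attn = False,  # set to True for full dot-product attention
    return_attn = False
).to(device)

# qk and v are inputs for the lower-level LSHAttention module; they are not passed to the model below
qk = torch.randn(10, 1024, 128)
v = torch.randn(10, 1024, 128)

x = torch.randn(1, 1024, 128).to(device)
y = model(x) # (1, 1024, 128)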
Code for FLOPs calculation: (https://github.com/cszn/KAIR/blob/master/utils/utils_modelsummary.py)
with torch.no_grad():
    input_dim = (1, 16384, 128) # set the input dimension
    flops = get_model_flops(model, input_dim, False)
    print('{:>16s} : {:<.4f} [G]'.format('FLOPs', flops/10**9))
Result in both cases:
FLOPs : 0.8053 [G]
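One way to sanity-check the reported figure is a back-of-the-envelope count in plain Python. The sketch below rests on assumptions that are not confirmed in this thread: that the module contains three 128-to-128 linear projections (shared query/key, value, and output), that the counter reports multiply-accumulates for those layers only, and that the attention products themselves go uncounted. The helper names are hypothetical, and the LSH estimate ignores hashing and sorting costs.

```python
# Hypothetical helpers for a rough estimate; nothing here calls reformer_pytorch.

def linear_macs(tokens, d_in, d_out):
    """Multiply-accumulates of a bias-free linear layer applied to every token."""
    return tokens * d_in * d_out

def full_attention_macs(seq_len, dim):
    """QK^T plus attention-times-V, summed over all heads (heads * dim_head == dim)."""
    return 2 * seq_len * seq_len * dim

def lsh_attention_macs(seq_len, dim, bucket_size, n_hashes):
    """Rough estimate: per hash round, each query attends to its own chunk and one neighbour."""
    return 2 * n_hashes * seq_len * (2 * bucket_size) * dim

batch, seq_len, dim = 1, 16384, 128  # same input_dim as in the FLOPs call above

# Assumption: three dim-to-dim projections (shared QK, V, output)
projections = 3 * linear_macs(batch * seq_len, dim, dim)
print('{:>16s} : {:<.4f} [G]'.format('projections', projections / 10**9))

# Attention products, which a counter that only hooks layer modules would not see
full = full_attention_macs(seq_len, dim)
lsh = lsh_attention_macs(seq_len, dim, bucket_size=64, n_hashes=16)
print('{:>16s} : {:<.4f} [G]'.format('full attention', full / 10**9))
print('{:>16s} : {:<.4f} [G]'.format('LSH attention', lsh / 10**9))
```

Under these assumptions the projections alone come to 0.8053 G, which matches the number reported in both modes. If that is indeed what is happening, the counter would be blind to the part of the computation that differs between LSH and full attention, and an operator-level counter or an analytic estimate like the one above would be needed to see the difference.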