
Commit

Update neox_args.py
These attention configuration options were missing from the docs; this commit adds them.
jahatef authored Nov 16, 2023
1 parent d8028f8 commit b18f25c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion megatron/neox_arguments/neox_args.py
@@ -175,7 +175,7 @@ class NeoXArgsModel(NeoXArgsTemplate):
     The first item in the list specifies the attention type(s), and should be a list of strings. The second item
     specifies the number of times to repeat those attention types in the full list.
-    attention type choices: [global, local, sparse_fixed, sparse_variable, bslongformer, bigbird]
+    attention type choices: [global, local, sparse_fixed, sparse_variable, bslongformer, bigbird, "gmlp", "amlp", "flash"]
     So a 12 layer network with only global attention could be specified like:
     [[[`global`], 12]]
