Is your feature request related to a problem? Please describe.
The Bert LM-Head hardcodes gelu as the activation; this has caused problems in downstream projects (e.g. NeMo).
Describe the solution you'd like
Rather than hardcoding gelu, accept the activation set in TransformerConfig.activation_func, as is done for other models.
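A minimal sketch of the proposed change, with illustrative names only (the real TransformerConfig and BertLMHead live in Megatron-LM and are torch modules; this torch-free stand-in just shows the configuration pattern):

```python
from dataclasses import dataclass
from typing import Callable
import math


def gelu(x: float) -> float:
    # Tanh-approximation GELU, a stand-in for the framework's implementation.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


@dataclass
class TransformerConfig:
    # Hypothetical slice of the real config: the activation is a
    # configurable callable, defaulting to gelu.
    activation_func: Callable[[float], float] = gelu


class BertLMHead:
    def __init__(self, config: TransformerConfig):
        # Before: self.activation = gelu  (hardcoded)
        # After: take whatever the config specifies.
        self.activation = config.activation_func

    def forward(self, x: float) -> float:
        return self.activation(x)


# Downstream projects (e.g. NeMo) can then swap in another activation:
def relu(x: float) -> float:
    return max(0.0, x)


head = BertLMHead(TransformerConfig(activation_func=relu))
```

With this shape, the LM head no longer needs to change when a model is trained with a different activation; the config carries it end to end.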
Describe alternatives you've considered
No alternatives were considered.
Proposed implementation
PR.
Additional context
Spoke with @shanmugamr1992 and used his suggested solution.