[ENHANCEMENT] Add layer name in a layer to improve code debugging #1198

Open
rybakov opened this issue Oct 4, 2024 · 0 comments

Is your feature request related to a problem? Please describe.
I am adding new features to TransformerEngine (TE) and observe model-quality issues: a gap in the loss, along with loss spikes.
I am debugging Megatron with TE by storing tensor statistics for the affected layers.
However, I have no information about each layer's name or its order (index) in the model topology.

Describe the solution you'd like
It would be great to give each layer a proper name that includes its order in the model, so that customers can use it for model debugging.

Describe alternatives you've considered
Several frameworks already support this simple feature, e.g.:
Lingvo based on TF
Praxis based on JAX

Proposed implementation
I propose adding a layer_name field: a unique name that encodes the layer hierarchy and the layer's index/order (when there are multiple layers with the same name).

Here is an example:
```python
class TransformerBlock(MegatronModule):
    """Transformer class."""

    def __init__(
        self,
        ...
        layer_name: str = "TransformerBlock",
    ):
        # Store the hierarchical name so that child layers can extend it.
        self.layer_name = layer_name
        # offset is implicit in TransformerLayer
        # Each child receives its 1-based layer number and the parent's name prefix.
        self.layers = torch.nn.ModuleList(
            [
                build_layer(layer_spec, i + 1, f"{self.layer_name}.blocks" if self.layer_name else None)
                for i, layer_spec in enumerate(self.submodules.layer_specs)
            ]
        )
```
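
To make the naming scheme concrete without depending on Megatron or torch, here is a minimal, self-contained sketch. Everything in it is hypothetical and for illustration only: `ToyBlock`, `ToyLayer`, `child_name` and `record_stats` are made-up names, and the `<prefix>.<layer number>` format is an assumption rather than the exact format used in our branch.

```python
from typing import Dict, List, Optional


def child_name(prefix: Optional[str], layer_number: int) -> Optional[str]:
    """Join a parent prefix and a 1-based layer number (assumed format).

    Returns None when no prefix is given, i.e. naming is disabled.
    """
    if not prefix:
        return None
    return f"{prefix}.{layer_number}"


class ToyLayer:
    """Hypothetical stand-in for a single transformer layer."""

    def __init__(self, layer_number: int, name_prefix: Optional[str] = None):
        self.layer_number = layer_number
        self.layer_name = child_name(name_prefix, layer_number)


class ToyBlock:
    """Hypothetical stand-in for TransformerBlock; passes its name down to children."""

    def __init__(self, num_layers: int, layer_name: Optional[str] = "TransformerBlock"):
        self.layer_name = layer_name
        prefix = f"{self.layer_name}.blocks" if self.layer_name else None
        self.layers: List[ToyLayer] = [ToyLayer(i + 1, prefix) for i in range(num_layers)]


def record_stats(stats: Dict[str, Dict[str, float]], layer: ToyLayer, values: List[float]) -> None:
    """Store simple statistics under the layer's name so they can be located later."""
    key = layer.layer_name or f"<unnamed layer {layer.layer_number}>"
    stats[key] = {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


block = ToyBlock(num_layers=3)
names = [layer.layer_name for layer in block.layers]

stats: Dict[str, Dict[str, float]] = {}
record_stats(stats, block.layers[1], [1.0, 2.0, 3.0])
```

With names like these, a loss spike can be traced back to a specific layer by looking up its key in `stats`, instead of guessing which anonymous module produced a given tensor.
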
Additional context
In our local branch, this feature is already used by multiple people.
