
Nested weight munging fine-tuning/continue training example and test for example model and geneformer. #97

Merged: jstjohn merged 55 commits into v2-main from subclassed_weight_munging_finetuning_example on Sep 4, 2024

Conversation

@jstjohn (Collaborator) commented Aug 14, 2024

Changes:

  1. The fine-tuning pattern is laid out and demonstrated in the fine-tuning example (see the sketch after this list).
  2. A more involved example in Geneformer also demonstrates how to load config parameters from the starting checkpoint in a way that does not override the parent.
  3. The ESM2 config and config ABC hierarchy are modified so that once ESM2 pretraining is merged (see Complete ESM2 pretraining #112), we can add a similar fine-tuning example/test for ESM2 as for Geneformer. See Abstract IOMixin and ability to mutate init parameters. NeMo#10239 for the NeMo-side changes that enable abstractions around modifying the saved hyper-parameters on a class.
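
Since item 1 is the core of this PR, here is a minimal sketch of the general idea behind that kind of nested weight munging, written in plain PyTorch rather than the actual BioNeMo/NeMo classes; all names here (PretrainedModel, FineTuneModel, munge_state_dict) are hypothetical illustrations, not this PR's code:

```python
# A minimal sketch of the nested weight-munging fine-tune pattern, assuming
# hypothetical module names: drop the pretraining head and re-nest the
# remaining checkpoint keys so they line up with the fine-tune model.
import torch.nn as nn


class PretrainedModel(nn.Module):
    """Stand-in for the pretrained checkpoint's module tree."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 16)
        self.lm_head = nn.Linear(16, 32)  # pretraining head, skipped at fine-tune time


class FineTuneModel(nn.Module):
    """Fine-tune model that nests the encoder under a different attribute."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)  # receives the pretrained encoder weights
        self.task_head = nn.Linear(16, 2)  # fresh head, randomly initialized


def munge_state_dict(pretrained_sd: dict) -> dict:
    """Drop the pretraining head and re-nest encoder keys under 'backbone.'."""
    return {
        k.replace("encoder.", "backbone.", 1): v
        for k, v in pretrained_sd.items()
        if not k.startswith("lm_head.")
    }


pretrained = PretrainedModel()
finetune = FineTuneModel()
# strict=False because task_head has no counterpart in the pretrained checkpoint.
missing, unexpected = finetune.load_state_dict(
    munge_state_dict(pretrained.state_dict()), strict=False
)
assert all(k.startswith("task_head.") for k in missing) and not unexpected
```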

@jstjohn (Collaborator, Author) commented Aug 14, 2024

/build-ci

@jstjohn force-pushed the subclassed_weight_munging_finetuning_example branch from 8f96b1d to b37760d on August 14, 2024 at 19:40
@jstjohn (Collaborator, Author) commented Aug 14, 2024

/build-ci

@malcolmgreaves marked this pull request as draft on August 14, 2024 at 20:17
@jstjohn (Collaborator, Author) commented Aug 14, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Aug 20, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Aug 28, 2024

/build-ci

@farhadrgh (Collaborator) left a comment


LGTM. One question: how do we expect the user to configure a fine-tuning example that skips the decoder or some of the top layers of a model? Can we override any config params within the dataclass, or do we have to change the global var OVERRIDE_BIONEMO_CONFIG_DEFAULTS?
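
For illustration, here is a hedged sketch of the first option the question raises (overriding fields on the config dataclass per run, rather than editing the global OVERRIDE_BIONEMO_CONFIG_DEFAULTS). ModelConfig and include_decoder are invented names for this sketch, not the actual BioNeMo config API:

```python
# Hypothetical sketch: restore config values from the starting checkpoint,
# then selectively override fields on the dataclass for this fine-tune run.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelConfig:
    num_layers: int = 12
    hidden_size: int = 512
    include_decoder: bool = True  # stand-in for a "skip the decoder" switch


# Values as restored from the starting checkpoint...
restored = ModelConfig()

# ...then selectively overridden, leaving every other field exactly as the
# checkpoint recorded it.
finetune_cfg = replace(restored, include_decoder=False)
```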

@jstjohn (Collaborator, Author) commented Aug 31, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@ohadmo (Member) commented Sep 3, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@ohadmo (Member) commented Sep 3, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@jstjohn enabled auto-merge (squash) on September 3, 2024 at 23:48
@jstjohn (Collaborator, Author) commented Sep 4, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 4, 2024

/build-ci

@jstjohn merged commit 0eac281 into v2-main on Sep 4, 2024. 3 checks passed.