
Nested weight munging fine-tuning/continue training example and test for example model and geneformer. #97

Merged: jstjohn merged 55 commits into v2-main from subclassed_weight_munging_finetuning_example on Sep 4, 2024

Conversation

@jstjohn (Collaborator) commented Aug 14, 2024

Changes:

  1. The fine-tuning pattern is laid out and demonstrated in the fine-tuning example (see the sketch after this list).
  2. A more involved example in Geneformer also demonstrates how to load config parameters from the starting checkpoint in a way that does not override the parent.
  3. The ESM2 config and config ABC hierarchy are modified so that once ESM2 pretraining is merged (see Complete ESM2 pretraining #112), we can add a similar fine-tuning example/test for ESM2 as for Geneformer. See Abstract IOMixin and ability to mutate init parameters. NeMo#10239 for the NeMo-side changes that enable abstractions around modifying the saved hyper-parameters on a class.
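
Since item 1 is the core of this PR, here is a minimal sketch of the general idea behind that kind of nested weight munging, written in plain PyTorch rather than the actual BioNeMo/NeMo classes; all names here (PretrainedModel, FineTuneModel, munge_state_dict) are hypothetical illustrations, not this PR's code:

```python
# A minimal sketch of the nested weight-munging fine-tune pattern, assuming
# hypothetical module names: drop the pretraining head and re-nest the
# remaining checkpoint keys so they line up with the fine-tune model.
import torch.nn as nn


class PretrainedModel(nn.Module):
    """Stand-in for the pretrained checkpoint's module tree."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 16)
        self.lm_head = nn.Linear(16, 32)  # pretraining head, skipped at fine-tune time


class FineTuneModel(nn.Module):
    """Fine-tune model that nests the encoder under a different attribute."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)  # receives the pretrained encoder weights
        self.task_head = nn.Linear(16, 2)  # fresh head, randomly initialized


def munge_state_dict(pretrained_sd: dict) -> dict:
    """Drop the pretraining head and re-nest encoder keys under 'backbone.'."""
    return {
        k.replace("encoder.", "backbone.", 1): v
        for k, v in pretrained_sd.items()
        if not k.startswith("lm_head.")
    }


pretrained = PretrainedModel()
finetune = FineTuneModel()
# strict=False because task_head has no counterpart in the pretrained checkpoint.
missing, unexpected = finetune.load_state_dict(
    munge_state_dict(pretrained.state_dict()), strict=False
)
assert all(k.startswith("task_head.") for k in missing) and not unexpected
```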

@jstjohn (Collaborator, Author) commented Aug 14, 2024

/build-ci

@jstjohn force-pushed the subclassed_weight_munging_finetuning_example branch from 8f96b1d to b37760d on August 14, 2024 at 19:40
@jstjohn (Collaborator, Author) commented Aug 14, 2024

/build-ci

@malcolmgreaves marked this pull request as draft on August 14, 2024 at 20:17
@jstjohn (Collaborator, Author) commented Aug 14, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Aug 20, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Aug 28, 2024

/build-ci

@farhadrgh (Collaborator) left a comment


LGTM. One question: how do we expect the user to configure a fine-tuning example that skips the decoder or some of the top layers of a model? Can we override any config params within the dataclass, or do we have to change the global var OVERRIDE_BIONEMO_CONFIG_DEFAULTS?
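
For illustration, here is a hedged sketch of the first option the question raises (overriding fields on the config dataclass per run, rather than editing the global OVERRIDE_BIONEMO_CONFIG_DEFAULTS). ModelConfig and include_decoder are invented names for this sketch, not the actual BioNeMo config API:

```python
# Hypothetical sketch: restore config values from the starting checkpoint,
# then selectively override fields on the dataclass for this fine-tune run.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelConfig:
    num_layers: int = 12
    hidden_size: int = 512
    include_decoder: bool = True  # stand-in for a "skip the decoder" switch


# Values as restored from the starting checkpoint...
restored = ModelConfig()

# ...then selectively overridden, leaving every other field exactly as the
# checkpoint recorded it.
finetune_cfg = replace(restored, include_decoder=False)
```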

@jstjohn (Collaborator, Author) commented Aug 31, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@ohadmo (Member) commented Sep 3, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@ohadmo (Member) commented Sep 3, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 3, 2024

/build-ci

@jstjohn enabled auto-merge (squash) on September 3, 2024 at 23:48
@jstjohn (Collaborator, Author) commented Sep 4, 2024

/build-ci

@jstjohn (Collaborator, Author) commented Sep 4, 2024

/build-ci

@jstjohn merged commit 0eac281 into v2-main on Sep 4, 2024. 3 checks passed.