add main correctness plots to readme
Quentin-Anthony committed Sep 19, 2024
1 parent 394547d commit 3756e8f
Showing 5 changed files with 23 additions and 1 deletion.
24 changes: 23 additions & 1 deletion README.md
@@ -1,7 +1,7 @@

# nanoGPT-mup

This repository is a fork of [nanoGPT](https://github.com/karpathy/nanoGPT) that provides a minimal implementation of the [maximal update parameterization](https://arxiv.org/abs/2203.03466) ([muP](https://github.com/microsoft/mup)) and acts as supplementary material for ["The Practitioner’s Guide to the Maximal Update Parameterization"](https://www.cerebras.ai/blog/the-practitioners-guide-to-the-maximal-update-parameterization). The `mup_examples` folder contains scripts to reproduce the plots in the blog post (see `mup_examples/README.md` for instructions to reproduce).
This repository is a fork of [nanoGPT](https://github.com/karpathy/nanoGPT) that provides a minimal implementation of the [maximal update parameterization](https://arxiv.org/abs/2203.03466) ([muP](https://github.com/microsoft/mup)) and acts as supplementary material for ["The Practitioner’s Guide to the Maximal Update Parameterization"](https://www.cerebras.ai/blog/the-practitioners-guide-to-the-maximal-update-parameterization). The [mup_examples](https://github.com/EleutherAI/nanoGPT-mup/tree/master/mup_examples) folder contains scripts to reproduce the plots in the blog post (see [mup_examples/README.md](https://github.com/EleutherAI/nanoGPT-mup/blob/master/mup_examples/README.md) for instructions to reproduce).

Each of the critical muP changes is marked with
```
@@ -12,6 +12,28 @@ Each of the critical muP changes are marked with
to make everything easily searchable.


## Implementation Validation

We verified the correctness of this implementation with coordinate checks and learning rate (LR) transfer experiments.

### Coordinate Checks

Standard Parameterization:

<img src="assets/coord_check_sp.png" alt="SP" width="60%">

Maximal Update Parameterization (muP):

<img src="assets/coord_check_mup.png" alt="muP" width="60%">

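A coordinate check tracks the typical size of activations (or of their change after an update) as width grows: under muP it should stay roughly constant, while under the standard parameterization it grows with width. The toy sketch below is not the script used to produce the plots above. It assumes a single square hidden weight matrix, approximates the first Adam step as `lr * sign(grad)`, and uses hypothetical names (`coord_check`, `base_width`) to compare the two parameterizations in pure Python.

```python
import math
import random


def coord_check(widths, base_width=32, lr=1e-2, use_mup=True, seed=0):
    """Return {width: mean |change in a hidden pre-activation|} after one update.

    Toy model: one square hidden matrix W (width x width) and a single
    sign-based step, which is roughly what Adam's first step looks like.
    """
    rng = random.Random(seed)
    results = {}
    for n in widths:
        h = [rng.gauss(0.0, 1.0) for _ in range(n)]      # incoming activations, O(1) per coordinate
        delta = [rng.gauss(0.0, 1.0) for _ in range(n)]  # gradient w.r.t. the pre-activations
        # Assumed muP rule for Adam: hidden-matrix LR shrinks as base_width / width.
        step_lr = lr * base_width / n if use_mup else lr
        # The gradient of W is outer(delta, h); the update is -step_lr * sign(grad).
        dW = [[-step_lr * math.copysign(1.0, d * x) for x in h] for d in delta]
        # Change in the pre-activations caused by the update: dW @ h.
        dz = [sum(w * x for w, x in zip(row, h)) for row in dW]
        results[n] = sum(abs(v) for v in dz) / n
    return results


if __name__ == "__main__":
    widths = [32, 64, 128, 256]
    for name, flag in (("SP", False), ("muP", True)):
        stats = coord_check(widths, use_mup=flag)
        print(name, {n: round(v, 3) for n, v in stats.items()})
```

In this toy setting the SP statistic grows roughly in proportion to width, while the muP statistic stays near a constant, which is the qualitative pattern a passing coordinate check should show.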

### Learning Rate muTransfer

**Tiny Shakespeare** | **OpenWebText**
:-------------------------:|:-------------------------:
<img src="assets/mutransfer_lr_shakespeare_char.png" alt="mup-shakespeare" width="100%"> | <img src="assets/mutransfer_lr_owt.png" alt="mup-owt" width="100%">

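Learning rate transfer relies on muP rescaling a few width-dependent hyperparameters so that a learning rate tuned on a small proxy model stays near-optimal at larger widths. The following is a rough sketch of the scaling rules described in the muP paper and the practitioner's guide for Adam-style optimizers. The function and key names are hypothetical, and this is not the repository's own code.

```python
import math


def mup_scaled_hparams(base_lr, base_init_std, base_width, width):
    """Scale hyperparameters tuned at base_width to a model of a different width.

    Assumes Adam-style training; all names here are illustrative only.
    """
    m = width / base_width  # width multiplier
    return {
        "embedding_lr": base_lr,                          # embedding LR is left unchanged
        "hidden_lr": base_lr / m,                         # hidden-matrix LR shrinks as 1/m
        "hidden_init_std": base_init_std / math.sqrt(m),  # init variance shrinks as 1/m
        "output_logit_multiplier": 1.0 / m,               # output logits are scaled by 1/m
    }


if __name__ == "__main__":
    # Hyperparameters tuned at width 256, transferred to width 1024 (m = 4).
    print(mup_scaled_hparams(base_lr=1e-2, base_init_std=0.02, base_width=256, width=1024))
```

Keeping the base values fixed and letting only the width multiplier change is what allows one learning rate sweep on the small model to be reused for the larger ones.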

If ["The Practitioner’s Guide to the Maximal Update Parameterization"](https://www.cerebras.ai/blog/the-practitioners-guide-to-the-maximal-update-parameterization) or this repository was useful to you, please cite:
```
@misc{cerebras2024mupguide,
Binary file added assets/coord_check_mup.png
Binary file added assets/coord_check_sp.png
Binary file added assets/mutransfer_lr_owt.png
Binary file added assets/mutransfer_lr_shakespeare_char.png
