add table mapping to code
Quentin-Anthony authored Sep 20, 2024
1 parent 915251f commit 6102669
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions README.md
@@ -11,6 +11,16 @@ Each of the critical muP changes are marked with
```
to make everything easily searchable.

| Parameterization | SP | **μP** | Code |
|------------------|----|----|----|
| Embedding Init. Var. | $\sigma_{base}^2$ | $\sigma_{base}^2$ | |
| Embedding LR | $\eta_{base}$ | $\eta_{base}$ | |
| Embedding Fwd. | $x W_{\text{emb}}$ | $\mathbf{α_{input}} · x W_{\text{emb}}$ | [Code](https://github.com/EleutherAI/nanoGPT-mup/blob/bcadbc3c7a44138525eca8a799764afba7dca2b3/model.py#L208) |
| Hidden Init. Var. | $\sigma_{base}^2$ | $\sigma_{base}^2 / \mathbf{m_d}$ | [Code](https://github.com/EleutherAI/nanoGPT-mup/blob/bcadbc3c7a44138525eca8a799764afba7dca2b3/model.py#L163-L169) |
| Hidden LR (Adam) | $\eta_{base}$ | $\eta_{base} / \mathbf{m_d}$ | [Code](https://github.com/EleutherAI/nanoGPT-mup/blob/bcadbc3c7a44138525eca8a799764afba7dca2b3/model.py#L306-L329) |
| Output Logit Fwd. | $x W_{\text{emb}}^\top$ | $\mathbf{α_{output}} · x W_{\text{emb}}^\top / \mathbf{m_d}$ | [Code](https://github.com/EleutherAI/nanoGPT-mup/blob/bcadbc3c7a44138525eca8a799764afba7dca2b3/model.py#L219) |
| Attention logits | $Q^\top K / \sqrt{d_{\text{head}}}$ | $Q^\top K / \mathbf{d_{\text{head}}}$ | [Code](https://github.com/EleutherAI/nanoGPT-mup/blob/bcadbc3c7a44138525eca8a799764afba7dca2b3/model.py#L65) |
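
As a rough illustration of how the rows of the table combine, the sketch below computes each scaled quantity from a set of base hyperparameters. It is not the code in `model.py`: the function name `mup_scales`, the `MuPScales` container, and the parameter names are hypothetical, and it assumes that $m_d$ is the width multiplier `d_model / d_base` relative to a tuned base model.

```python
from dataclasses import dataclass
import math


@dataclass
class MuPScales:
    """Scaled quantities, one field per row of the table (hypothetical container)."""
    embedding_init_var: float
    embedding_lr: float
    embedding_multiplier: float
    hidden_init_var: float
    hidden_lr: float
    output_multiplier: float
    attention_scale: float


def mup_scales(d_model, d_base, d_head, sigma_base, eta_base,
               alpha_input=1.0, alpha_output=1.0, use_mup=True):
    """Return the SP or muP column of the table for a given width.

    Assumption: m_d = d_model / d_base (width multiplier over the base model).
    """
    if not use_mup:
        # Standard parameterization: nothing depends on the width multiplier.
        return MuPScales(
            embedding_init_var=sigma_base ** 2,
            embedding_lr=eta_base,
            embedding_multiplier=1.0,
            hidden_init_var=sigma_base ** 2,
            hidden_lr=eta_base,
            output_multiplier=1.0,
            attention_scale=1.0 / math.sqrt(d_head),
        )

    m_d = d_model / d_base
    return MuPScales(
        embedding_init_var=sigma_base ** 2,        # unchanged from SP
        embedding_lr=eta_base,                     # unchanged from SP
        embedding_multiplier=alpha_input,          # alpha_input * x W_emb
        hidden_init_var=sigma_base ** 2 / m_d,     # sigma_base^2 / m_d
        hidden_lr=eta_base / m_d,                  # eta_base / m_d (Adam)
        output_multiplier=alpha_output / m_d,      # alpha_output * x W_emb^T / m_d
        attention_scale=1.0 / d_head,              # Q^T K / d_head
    )


# Example: a model 4x wider than the base model.
scales = mup_scales(d_model=1024, d_base=256, d_head=64,
                    sigma_base=0.02, eta_base=1e-3)
```

With `d_model == d_base` the hidden and output rows reduce to their base values, so only the attention scaling and the `alpha` multipliers differ from SP at the base width.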


## Implementation Validation
