
Fix readme typos #1049

Merged (4 commits) on Sep 29, 2023
14 changes: 7 additions & 7 deletions README.md
@@ -24,7 +24,7 @@ GPT-NeoX leverages many of the same features and technologies as the popular Meg

**[8/10/2023]** We have experimental support for LLaMA 2 and Flash Attention v2 supported in our [math-lm](https://github.com/EleutherAI/math-lm) project that will be upstreamed later this month.

-**[5/17/2023]** After fixing some miscellenous bugs we now fully support bf16.
+**[5/17/2023]** After fixing some miscellaneous bugs we now fully support bf16.

**[4/11/2023]** We have upgraded our Flash Attention implementation to now support Alibi positional embeddings.

@@ -125,17 +125,17 @@ With your environment properly set up and the correct configuration files you ca

`python3 deepy.py train.py /path/to/configs/my_model.yml`

-#### SLURM
+#### Slurm

-Using SLURM can be slightly more involved. Like with MPI, you must add the following to your config:
+Using Slurm can be slightly more involved. Like with MPI, you must add the following to your config:

```json
{
"launcher": "slurm",
"deepspeed_slurm": true
}
```
-If you do not have ssh access to the compute nodes in your SLURM cluster you need to add `{"no_ssh_check": true}`
+If you do not have ssh access to the compute nodes in your Slurm cluster you need to add `{"no_ssh_check": true}`
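Taken together, a Slurm launch without ssh access to the compute nodes would combine these keys in one config fragment. This is a sketch assembled from the options quoted above, not a fragment taken from this PR:

```json
{
  "launcher": "slurm",
  "deepspeed_slurm": true,
  "no_ssh_check": true
}
```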

#### (Advanced) Custom Launching

@@ -175,7 +175,7 @@ do
done
```

-`$SLURM_JOBID` and `$SLURM_NODELIST` being environment variables SLURM will create for you. See the [sbatch documentation](https://slurm.schedmd.com/sbatch.html#SECTION_OUTPUT-ENVIRONMENT-VARIABLES) for a full list of available Slurm environment variables set at job creation time.
+`$SLURM_JOBID` and `$SLURM_NODELIST` being environment variables Slurm will create for you. See the [sbatch documentation](https://slurm.schedmd.com/sbatch.html#SECTION_OUTPUT-ENVIRONMENT-VARIABLES) for a full list of available Slurm environment variables set at job creation time.
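As an illustration (not part of this PR), one common use of such variables is building a per-node hostfile. In the sketch below, `NODES`, `HOSTFILE`, and the `slots=8` count are hypothetical stand-ins; in a real job the node names would come from `$SLURM_NODELIST` (e.g. expanded with `scontrol show hostnames`) and the path could embed `$SLURM_JOBID`:

```shell
# Hypothetical sketch: write a DeepSpeed-style hostfile from a list of
# node names. In a real Slurm job, derive NODES from $SLURM_NODELIST.
NODES="node001 node002"          # stand-in for the expanded node list
HOSTFILE="hostfile.txt"          # stand-in path
: > "$HOSTFILE"                  # start with an empty hostfile
for node in $NODES; do
  echo "$node slots=8" >> "$HOSTFILE"   # slots = GPUs per node (assumed 8)
done
```

Each resulting line pairs a hostname with the number of processes to launch there, one line per node.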

#### Job Launching

@@ -505,7 +505,7 @@ Citation instructions for other pretrained models can be found [in the appropria
GPT-NeoX has been used by academic and industry researchers for a variety of high performance computing projects.

### Our Research
-EleutherAI and our colaborators have used it in the following publications:
+EleutherAI and our collaborators have used it in the following publications:
- Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, McDonell, Jason Phang, Michael Pieler, Prashanth, Shivanshu Purohit, Laria Reynolds, Jon Tow, Ben Wang, and Samuel Weinbach. "[GPT-NeoX-20B: An Open-Source Autoregressive Language Model](https://arxiv.org/abs/2204.06745)." In *Proceedings of the ACL Workshop on Challenges \& Perspectives in Creating Large Language Models* (2022).
- Stella Biderman, Hailey Schoelkopf, Quentin Gregory Anthony, Herbie Bradley, Kyle O’Brien, Eric Hallahan, Mohammad Aflah Khan et al. "[Pythia: A suite for analyzing large language models across training and scaling](https://arxiv.org/abs/2304.01373)." In _International Conference on Machine Learning_, pp. 2397-2430. PMLR (2023).
- Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf, Edward W. Ayers, Dragomir Radev, and Jeremy Avigad. "[Proofnet: Autoformalizing and formally proving undergraduate-level mathematics](https://arxiv.org/abs/2302.12433). *arXiv preprint arXiv:2302.12433* (2023).
@@ -578,4 +578,4 @@ For full terms, see the `LICENSE` file. If you have any questions, comments, or

## Acknowledgements

-We run our experiments on a Kubernetes cluster provided by [CoreWeave](https://coreweave.com/) and a SLURM cluster provided by [Stability AI](https://stability.ai). We are thankful to the DeepSpeed team for their advice and consultation.
+We run our experiments on a Kubernetes cluster provided by [CoreWeave](https://coreweave.com/) and a Slurm cluster provided by [Stability AI](https://stability.ai). We are thankful to the DeepSpeed team for their advice and consultation.
2 changes: 1 addition & 1 deletion configs/neox_arguments.md
@@ -111,7 +111,7 @@ Logging Arguments

- **git_hash**: str

-    Default = 5c4a452
+    Default = 61f2554

current git hash of repository
