Re-organize the folder
Co-authored-by: Stella Biderman <[email protected]>
Signed-off-by: Dashiell Stander <[email protected]>
dashstander and Stella Biderman committed Oct 2, 2023
1 parent 7a8569f commit 94cc945
Showing 20 changed files with 286 additions and 193 deletions.
2 changes: 1 addition & 1 deletion prepare_data.py
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

-from tools.corpora import prepare_dataset, DATA_DOWNLOADERS
+from tools.datasets.corpora import prepare_dataset, DATA_DOWNLOADERS
import argparse

TOKENIZER_CHOICES = [
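The hunk above moves the `corpora` module from `tools.corpora` to `tools.datasets.corpora`, so any script that still imports the old path will fail after this commit. As a minimal sketch (not part of the commit), a script that has to run against checkouts from both before and after the move could try the new path first and fall back to the old one. The helper name `import_first` is hypothetical and introduced here only for illustration.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def import_first(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported.

    Hypothetical helper, not part of GPT-NeoX. Only a missing candidate
    (or one of its parent packages) is skipped; a missing third-party
    dependency inside an existing module is re-raised unchanged.
    """
    last_error: Optional[ModuleNotFoundError] = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError as err:
            missing = err.name or ""
            # Skip only if the missing module is the candidate itself
            # or one of its parent packages.
            if name == missing or name.startswith(missing + "."):
                last_error = err
                continue
            raise
    raise ImportError(f"none of {list(candidates)} could be imported") from last_error


# Intended use from the repository root (assumes the `tools` package is importable):
#   corpora = import_first(["tools.datasets.corpora", "tools.corpora"])
#   prepare_dataset = corpora.prepare_dataset
#   DATA_DOWNLOADERS = corpora.DATA_DOWNLOADERS
```

For code that only targets the reorganized layout, the one-line import change shown in the diff is all that is needed; the fallback is only worth having during a transition period.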
15 changes: 15 additions & 0 deletions tools/README.md
@@ -0,0 +1,15 @@
# GPT-NeoX Auxiliary Tools

This directory contains auxiliary tools that are useful for working with GPT-NeoX but are not part of the main training code.

## Bash

This directory contains some simple, frequently used bash commands to make working on multiple machines easier.

## Checkpoints

This directory contains tools for manipulating and converting checkpoints, including changing the parallelism settings of a pretrained model, converting between GPT-NeoX and the transformers library, and updating checkpoints trained with Version 1.x of this library to be compatible with Version 2.x.

## Datasets

This directory contains tools for downloading datasets and preprocessing them into the format expected by the GPT-NeoX library.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.