This example uses ONNX Runtime Training to fine-tune the GPT2 PyTorch model maintained at https://github.com/huggingface/transformers.
You can run the training in Azure Machine Learning or in other environments.
- Clone this repo
git clone https://github.com/microsoft/onnxruntime-training-examples.git
cd onnxruntime-training-examples/huggingface-gpt2
- Clone the fine-tuning code and model from the HuggingFace transformers repo
git clone https://github.com/huggingface/transformers.git
cd transformers/
git checkout 9a0a8c1c6f4f2f0c80ff07d36713a3ada785eec5
- Update with ORT changes
git apply ../ort_addon/src_changes.patch
cp -r ../ort_addon/ort_supplement/* ./
cd ..
- Build the Docker image
Install the dependencies of the transformers examples and the modified transformers package into the base ORT Docker image:
docker build --network=host -f docker/Dockerfile . --rm --pull -t onnxruntime-gpt
The following is a minimal set of instructions to download one of the datasets used for GPT2 fine-tuning on the language modeling task.
Download the word-level WikiText-103 dataset for this sample. Refer to the README in the transformers examples for additional details.
Download the data and export its path as $DATA_DIR:
export DATA_DIR=/path/to/downloaded/data/
- TRAIN_FILE: $DATA_DIR/wiki.train.tokens
- TEST_FILE: $DATA_DIR/wiki.test.tokens
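If you need to fetch the dataset first, a minimal download sketch follows. The archive URL and the extracted directory name are assumptions based on where the word-level WikiText-103 archive has historically been hosted, so verify them before use.
# Assumed location of the word-level WikiText-103 archive; verify before use
wget https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-v1.zip
unzip wikitext-103-v1.zip
# The archive is expected to unpack into a wikitext-103 directory containing
# wiki.train.tokens, wiki.valid.tokens, and wiki.test.tokens
export DATA_DIR=$PWD/wikitext-103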
- Data Transfer
- Transfer training data to Azure blob storage
To transfer the data to Azure blob storage using the Azure CLI, run:
az storage blob upload-batch --account-name <storage-name> -d <container-name> -s $DATA_DIR
You can also use azcopy (see the sketch after this list) or Azure Storage Explorer to copy the data. We recommend downloading the data in the training environment itself, or in an environment from which data transfer to the training environment will be fast and efficient.
- Register the blob container as a data store
- Mount the data store in the compute targets used for training
Please refer to the storage guidance for details on using an Azure storage account for training in Azure Machine Learning.
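As an alternative to the Azure CLI upload command above, a rough azcopy sketch is shown below; the container URL and SAS token are placeholders for your own storage account.
# Recursively upload the local dataset directory to the blob container;
# <storage-name>, <container-name>, and <sas-token> are placeholders for your account details
azcopy copy "$DATA_DIR" "https://<storage-name>.blob.core.windows.net/<container-name>?<sas-token>" --recursive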
- Prepare the docker image for AML
- Follow the instructions in the setup section above to build a Docker image with the required dependencies installed.
- Push the image to a container registry. You can find additional details about tagging the image and pushing it to an Azure Container Registry in the Azure Container Registry documentation.
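Assuming the target is an Azure Container Registry, the push typically looks like the sketch below; <registry-name> is a placeholder and the tag is only an example.
# Authenticate Docker against the registry
az acr login --name <registry-name>
# Re-tag the locally built image with the registry login server, then push it
docker tag onnxruntime-gpt <registry-name>.azurecr.io/onnxruntime-gpt:latest
docker push <registry-name>.azurecr.io/onnxruntime-gpt:latest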
- Execute fine-tuning
The GPT2 fine-tuning job in Azure Machine Learning can be launched using either of these environments:
- Azure Machine Learning Compute Instance to run the Jupyter notebook.
- Azure Machine Learning SDK
You will need a GPU-optimized compute target, either the NCv3 or NDv2 series, to execute this fine-tuning job.
Execute the steps in the Python notebook azureml-notebooks/run-finetuning.ipynb within your environment. If you have a local setup that can run an Azure ML notebook, you can run the steps there. Otherwise, create a compute instance in Azure Machine Learning and use it to run the steps.
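If a suitable compute target does not exist yet, one way to create it from the command line is sketched below. This assumes the legacy azure-cli-ml extension is installed and that the NCv3 size shown is available in your region; the cluster name is a placeholder.
# Create an AML compute cluster backed by NCv3 GPUs
# (add -w <workspace> -g <resource-group> if no default workspace is attached)
az ml computetarget create amlcompute \
  --name gpu-cluster \
  --vm-size Standard_NC6s_v3 \
  --min-nodes 0 \
  --max-nodes 2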
We recommend running this sample on a system with at least one NVIDIA GPU.
- Check pre-requisites
- CUDA 10.1
- Docker
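A quick way to confirm the prerequisites, assuming the NVIDIA driver and the Docker CLI are already installed:
# Shows the installed NVIDIA driver and the highest CUDA version it supports
nvidia-smi
# Confirms Docker is installed and the daemon is reachable
docker --version
docker info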
- Build the docker image
Follow the instructions in the setup section above to build a Docker image with the required dependencies installed.
The base Docker image used is mcr.microsoft.com/azureml/onnxruntime-training. The Docker image is tested in the AzureML environment. To run the examples in other environments, you may need to build a new base Docker image by following the directions in the nvidia-bert sample. To build and install the onnxruntime wheel on the host machine, follow the steps here.
- Set correct paths to training data for the docker image
Edit docker/launch.sh:
...
DATA_DIR=<replace-with-path-to-training-data>
...
The directory must contain the training and validation files.
- Set the number of GPUs
Edit transformers/scripts/run_lm_gpt2.sh:
num_gpus=4
- Modify other training parameters as needed
Edit transformers/scripts/run_lm_gpt2.sh:
--model_type=gpt2 --model_name_or_path=gpt2 --tokenizer_name=gpt2 --config_name=gpt2 --per_gpu_train_batch_size=1 --per_gpu_eval_batch_size=4 --gradient_accumulation_steps=16 --block_size=1024 --weight_decay=0.01 --logging_steps=100 --num_train_epochs=5
Consult the huggingface transformers training_args for additional details.
- Launch interactive container
bash docker/launch.sh
- Launch the fine-tuning run
bash /workspace/transformers/scripts/run_lm_gpt2.sh
If you get memory errors, try reducing the batch size. You can find the recommended batch sizes for ORT here. If the flags enabling evaluation and the evaluation data file are passed, the training is followed by evaluation and the perplexity is printed.
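For reference, the evaluation-related flags used by the transformers language-modeling example of this vintage are sketched below; confirm the exact names against the script checked out above, since they have changed across transformers versions.
# Append to the python command in transformers/scripts/run_lm_gpt2.sh to run evaluation after training
# (flag names assumed from the run_language_modeling example; verify against your checkout)
--do_eval \
--eval_data_file=$DATA_DIR/wiki.test.tokens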