Repository for the code of a KubeCon NA 2023 talk about how to train Llama2 on Argo Workflows via Hera

flaviuvadan/kubecon_na_23_llama2_finetune

KubeCon NA 2023 - How to fine tune an LLM with Argo Workflows and Hera

Introduction

This repository contains the code and instructions to reproduce the demo presented in the talk "How to fine tune an LLM with Argo Workflows and Hera", given at KubeCon NA 2023 by Flaviu Vadan and JP Zivalich.

Talk

Prerequisites

  • Python 3.10.13
  • Poetry 1.6.1
  • Docker 20+

Installation

  1. Clone the repository
  2. Install the dependencies with poetry shell && poetry install
  3. Set up the following environment variables:
    1. ARGO_HOST - host of your Argo Workflows server
    2. ARGO_TOKEN - token to authenticate with the Argo Workflows server
    3. ARGO_NAMESPACE - namespace to submit the finetuning workflow to
    4. HF_TOKEN - HuggingFace authentication token
  4. Run python src/talk/finetune.py to submit the core finetuning workflow
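
Before running step 4, it can help to confirm that the four environment variables from step 3 are actually set. The snippet below is a minimal sketch of such a check and is not part of the repository: the helper names `missing_env_vars` and `require_env` are hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names taken from the installation steps above.
REQUIRED_VARS = ("ARGO_HOST", "ARGO_TOKEN", "ARGO_NAMESPACE", "HF_TOKEN")


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


def require_env(environ=None):
    """Raise RuntimeError naming every missing variable (hypothetical helper)."""
    missing = missing_env_vars(environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` at the top of a submission script would fail fast with a single message listing every unset variable, rather than failing later during authentication.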

Structure

├── src
│   ├── talk
│       ├── etcd  # provides resources to create, wait for, and delete the etcd load balancer and replica set
│       ├── finetune  # provides the main Python command for finetuning Llama2 using llama-recipes 
│       ├── ssd  # provides the storage class definition for SSD storage, which is used by etcd 
│       ├── train  # provides the core training workflow that sets up the training containers via Torch, etcd, etc.
│       ├── workflows  # light wrapper around Hera to add labels, set up auth, add GPU tolerations automatically, etc. 
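
To illustrate the kind of defaults the `workflows` wrapper is described as adding (labels and GPU tolerations), here is a rough sketch in plain Python. It does not reproduce the repository's code or call Hera. The function name `default_workflow_settings`, the `owner` label, and the `nvidia.com/gpu` taint key are all assumptions for illustration, and the taint key in particular depends on how a cluster's GPU nodes are configured.

```python
def default_workflow_settings(owner, gpu=True):
    """Build default labels and tolerations for a workflow (hypothetical helper).

    The returned dictionaries follow the Kubernetes label and toleration
    shapes; how they are passed to Hera is left to the wrapper.
    """
    settings = {
        "labels": {"owner": owner},  # assumed label key, for illustration only
        "tolerations": [],
    }
    if gpu:
        # Assumed taint key; GPU node pools are often tainted this way,
        # but the actual key is cluster-specific.
        settings["tolerations"].append(
            {"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}
        )
    return settings
```

Centralizing these defaults in one helper means each workflow definition does not have to repeat the same labels and tolerations.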

License

This repository observes, follows, and presents the Llama 2 community license agreement (the "license"), Llama 2 Version Release Date: July 18, 2023. Any use of this repository is subject to the terms and conditions of the license.
