Build docker files for both CI and User (#219)
* add docker start for user
* add docker start for user
* merge docker file
* fix
* fix deepspeed
* fix
* fix
* fix
* add ray user
* add git
* add git
* fix
* fix
* fix
* fix
* fix
* fix
* fix reademe
* fix dockerfile path
* fix re
* Update README.md

Signed-off-by: Xiaochang Wu <[email protected]>

* Update README.md

Signed-off-by: Xiaochang Wu <[email protected]>

* fix
* fix docker file
* fix docker file
* fix
* fix
* fix review
* fix md
* fix md
* fix md
* fix md
* fix rebase
* fix rebase
* fix rebase
* fix rebase
* fix

---------

Signed-off-by: Xiaochang Wu <[email protected]>
Co-authored-by: Xiaochang Wu <[email protected]>
1 parent 4a646b0 · commit cf5d32b
Showing 16 changed files with 260 additions and 11 deletions.
dev/docker/Dockerfile.user
@@ -0,0 +1,49 @@
# syntax=docker/dockerfile:1
FROM ubuntu:22.04

# Define build arguments
ARG DOCKER_NAME=default
ARG PYPJ=default
ENV LANG C.UTF-8

WORKDIR /root/

RUN --mount=type=cache,target=/var/cache/apt apt-get update -y \
    && apt-get install -y build-essential cmake wget curl git vim htop ssh net-tools \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

ENV CONDA_DIR /opt/conda
RUN wget --quiet https://github.com/conda-forge/miniforge/releases/download/23.3.1-1/Miniforge3-Linux-x86_64.sh -O ~/miniforge.sh && \
    /bin/bash ~/miniforge.sh -b -p /opt/conda
ENV PATH $CONDA_DIR/bin:$PATH

# setup env
SHELL ["/bin/bash", "--login", "-c"]

RUN --mount=type=cache,target=/opt/conda/pkgs conda init bash && \
    unset -f conda && \
    export PATH=$CONDA_DIR/bin/:${PATH} && \
    mamba config --add channels intel && \
    mamba install -y -c conda-forge python==3.9 gxx=12.3 gxx_linux-64=12.3 libxcrypt

# Used to invalidate docker build cache with --build-arg CACHEBUST=$(date +%s)
ARG CACHEBUST=1

RUN git clone https://github.com/intel/llm-on-ray.git
RUN if [ -d "llm-on-ray" ]; then echo "Clone successful"; else echo "Clone failed" && exit 1; fi
WORKDIR /root/llm-on-ray

RUN ls -la

RUN --mount=type=cache,target=/root/.cache/pip pip install -e .[${PYPJ}] --extra-index-url https://download.pytorch.org/whl/cpu \
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/cpu/us/

# Use shell scripting to conditionally install packages
RUN if [ "${DOCKER_NAME}" = ".cpu_and_deepspeed" ]; then ds_report && ./dev/scripts/install-oneapi.sh; fi
RUN if [ "${DOCKER_NAME}" = ".ipex-llm" ]; then ./dev/scripts/install-oneapi.sh; fi
RUN if [ "${DOCKER_NAME}" = ".vllm" ]; then ./dev/scripts/install-vllm-cpu.sh; fi

ENTRYPOINT ["sh", "./dev/scripts/entrypoint_user.sh"]
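A minimal sketch of building this image, mirroring the arguments the build_docker helper later in this commit passes; the image tag and the repository root as build context are assumptions, not mandated by the Dockerfile:

    # Build the CPU + DeepSpeed variant from the repository root.
    # CACHEBUST=$(date +%s) forces the 'git clone' layer to re-run.
    docker build . -f dev/docker/Dockerfile.user \
        --build-arg CACHEBUST=$(date +%s) \
        --build-arg DOCKER_NAME=".cpu_and_deepspeed" \
        --build-arg PYPJ="cpu,deepspeed" \
        -t serving.cpu_and_deepspeed:latest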
README.md
@@ -1 +1,6 @@
Dockerfiles for CI tests. There could be one Dockerfile with an ARG declared to distinguish the different pip extras. However, an ARG busts the cache of 'pip install', which usually takes a long time when building the docker image. Instead, we keep two almost identical Dockerfiles here to improve CI efficiency.
# Dockerfiles for Users

* `Dockerfile.user` to build the llm-on-ray docker image for running on Intel CPU.
* `Dockerfile.habana` to build the llm-on-ray docker image for running on the [Intel Gaudi AI accelerator](https://habana.ai/products/gaudi/).

__NOTICE:__ Dockerfiles in `ci/` are for CI tests only and are not intended for daily use.
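For example, a sketch of building the Gaudi image (assuming `Dockerfile.habana` sits next to `Dockerfile.user` under `dev/docker/` and the repository root is the build context, since the file COPYs `pyproject.toml` and `MANIFEST.in` from it):

    # Build the Gaudi image from the repository root; the tag is illustrative.
    docker build . -f dev/docker/Dockerfile.habana -t llm-on-ray:habana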
Dockerfile.habana
@@ -0,0 +1,32 @@
FROM vault.habana.ai/gaudi-docker/1.15.1/ubuntu22.04/habanalabs/pytorch-installer-2.2.0:latest

ENV LANG=en_US.UTF-8

WORKDIR /root/llm-on-ray

COPY ./pyproject.toml .
COPY ./MANIFEST.in .

# Create the llm_on_ray package directory so the following 'pip install -e' command succeeds
RUN mkdir ./llm_on_ray

RUN pip install -e . && \
    pip install --upgrade-strategy eager optimum[habana] && \
    pip install git+https://github.com/HabanaAI/[email protected]

# Optional. Comment out if you are not using the UI
COPY ./dev/scripts/install-ui.sh /tmp

RUN /tmp/install-ui.sh

RUN sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config && \
    service ssh restart

ENV no_proxy=localhost,127.0.0.1

# Required by DeepSpeed
ENV RAY_EXPERIMENTAL_NOSET_HABANA_VISIBLE_MODULES=1

ENV PT_HPU_LAZY_ACC_PAR_MODE=0

ENV PT_HPU_ENABLE_LAZY_COLLECTIVES=true
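A sketch of running the resulting image; the runtime flags below follow Habana's usual container invocation and are assumptions, not part of this commit:

    # Expose all Gaudi devices via the habana container runtime; host networking
    # suits the sshd and Ray setup configured in the image.
    docker run -it --runtime=habana \
        -e HABANA_VISIBLE_DEVICES=all \
        --cap-add=sys_nice --net=host --ipc=host \
        llm-on-ray:habana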
File renamed without changes.
File renamed without changes.
File renamed without changes.
Shell helper script: build_docker / start_docker
@@ -0,0 +1,75 @@
#!/bin/bash
set -eo pipefail

# If your model needs HF_TOKEN, please modify "model_description.config.use_auth_token"
# in the model's config file, e.g. "llm_on_ray/inference/models/llama-2-7b-chat-hf.yaml".
# Mount your own llm-on-ray directory here
code_checkout_path=$PWD
# Mount your own huggingface cache path here
model_cache_path=$HOME'/.cache/huggingface/hub'
MODEL_CACHE_PATH_LOCAL='/root/.cache/huggingface/hub'
CODE_CHECKOUT_PATH_LOCAL='/root/llm-on-ray'

build_docker() {
    local DOCKER_NAME=$1

    docker_args=()
    docker_args+=("--build-arg=CACHEBUST=1")
    if [ "$DOCKER_NAME" == "vllm" ]; then
        docker_args+=("--build-arg=DOCKER_NAME=.vllm")
        docker_args+=("--build-arg=PYPJ=vllm")
    elif [ "$DOCKER_NAME" == "ipex-llm" ]; then
        docker_args+=("--build-arg=DOCKER_NAME=.ipex-llm")
        docker_args+=("--build-arg=PYPJ=ipex-llm")
    else
        docker_args+=("--build-arg=DOCKER_NAME=.cpu_and_deepspeed")
        docker_args+=("--build-arg=PYPJ=cpu,deepspeed")
    fi

    if [ -n "$http_proxy" ]; then
        docker_args+=("--build-arg=http_proxy=$http_proxy")
    fi

    if [ -n "$https_proxy" ]; then
        docker_args+=("--build-arg=https_proxy=$https_proxy")
    fi

    echo "Build Docker image and perform cleaning operation"
    echo "docker build ./ ${docker_args[*]} -f dev/docker/Dockerfile.user -t serving${DOCKER_NAME}:latest"

    # Build Docker image and perform cleaning operation
    docker build ./ "${docker_args[@]}" -f dev/docker/Dockerfile.user -t "serving${DOCKER_NAME}:latest"
}

start_docker() {
    local DOCKER_NAME=$1
    local MODEL_NAME=$2

    docker_args=()
    docker_args+=("--name=serving${DOCKER_NAME}")
    if [ -z "$MODEL_NAME" ]; then
        echo "use default model"
    else
        docker_args+=("-e=model_name=${MODEL_NAME}")
    fi

    if [ -n "$http_proxy" ]; then
        docker_args+=("-e=http_proxy=$http_proxy")
    fi

    if [ -n "$https_proxy" ]; then
        docker_args+=("-e=https_proxy=$https_proxy")
    fi

    docker_args+=("-e=OPENAI_BASE_URL=${OPENAI_BASE_URL:-http://localhost:8000/v1}")
    docker_args+=("-e=OPENAI_API_KEY=${OPENAI_API_KEY:-not_a_real_key}")

    # If you need the modified llm-on-ray repository or the huggingface model cache
    # inside the container, keep the corresponding mount below.
    docker_args+=("-v=$code_checkout_path:${CODE_CHECKOUT_PATH_LOCAL}")
    docker_args+=("-v=${model_cache_path}:${MODEL_CACHE_PATH_LOCAL}")

    echo "docker run -ti ${docker_args[*]} serving${DOCKER_NAME}:latest"
    docker run -ti "${docker_args[@]}" "serving${DOCKER_NAME}:latest"
}
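The script only defines the two functions; a sketch of using them, assuming it is sourced into the current shell (the filename and model name below are illustrative, not taken from this commit):

    # Source the helper, then build and start the vLLM variant.
    source ./docker-helpers.sh   # hypothetical filename; not shown in this view
    build_docker vllm
    start_docker vllm llama-2-7b-chat-hf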
dev/scripts/entrypoint_user.sh
@@ -0,0 +1,24 @@
#!/bin/bash
set -e

# Default serve cmd
if ! pgrep -f 'ray'; then
    echo "Ray is not running. Starting Ray..."
    # start Ray
    ray start --head
    echo "Ray started."
else
    echo "Ray is already running."
fi

# Prepare the openai dependency (quoted so the shell does not treat '>' as a redirect)
pip install "openai>=1.0"

if [ -n "$model_name" ]; then
    echo "Using User Model: $model_name"
    llm_on_ray-serve --models "$model_name" --keep_serve_terminal
else
    echo "Using Default Model: gpt2"
    llm_on_ray-serve --config_file llm_on_ray/inference/models/gpt2.yaml --keep_serve_terminal
fi

exec /bin/bash
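Once llm_on_ray-serve is up, the OPENAI_BASE_URL default set by start_docker (http://localhost:8000/v1) points at an OpenAI-compatible endpoint; a hedged query sketch, with the port and payload shape assumed from that default:

    # Chat-completions request against the default gpt2 deployment.
    curl http://localhost:8000/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{"model": "gpt2", "messages": [{"role": "user", "content": "Hello!"}]}'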