diff --git a/.gitignore b/.gitignore
index a0ca125e..fe0af372 100644
--- a/.gitignore
+++ b/.gitignore
@@ -27,11 +27,12 @@ test_*
 !test_*.cu
 demo
 chat
+voicechat
 profile_*
 !profile_*.cc
 libtorch/
 
-transformer/chat
-transformer/output.wav
-transformer/tmpfile
-transformer/TTS
\ No newline at end of file
+llm/chat
+llm/output.wav
+llm/tmpfile
+llm/TTS
\ No newline at end of file
diff --git a/README.md b/README.md
index 2ee04d4e..657490df 100644
--- a/README.md
+++ b/README.md
@@ -34,9 +34,9 @@ Feel free to check out our [slides](assets/slides.pdf) for more details!
 
 ## News
 
+- **(2024/01)** 🔥We released TinyVoiceChat, a voice chatbot that can be deployed on your edge devices, such as MacBook and Jetson Orin Nano. Check out our [demo video](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC) and follow the [instructions](#deploy-speech-to-speech-chatbot-with-tinychatengine) to deploy it on your device!
 - **(2023/10)** We extended the support for the coding assistant [Code Llama](#download-and-deploy-models-from-our-model-zoo). Feel free to check out.
 - **(2023/10)** ⚡We released the new CUDA backend to support Nvidia GPUs with compute capability >= 6.1 for both server and edge GPUs. Its performance is also speeded up by ~40% compared to the previous version. Feel free to check out!
-- **(2023/09)** 🔥We released TinyVoiceChat, a voice chatbot that can be deployed on your edge devices, such as MacBook and Jetson Orin Nano. Check out our [demo video](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC) and [step-by-step guide](llm/application/README.md) to deploy it on your device!
 
 ## Prerequisites
 
@@ -132,6 +132,27 @@ Here, we provide step-by-step instructions to deploy LLaMA2-7B-chat with TinyCha
 ...
 ```
 
+
+## Deploy speech-to-speech chatbot with TinyChatEngine [[Demo]](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC)
+
+TinyChatEngine offers versatile capabilities suitable for various applications. Additionally, we introduce a sophisticated voice chatbot. Here, we provide very easy-to-follow instructions to deploy speech-to-speech chatbot (LLaMA2-7B-chat) with TinyChatEngine.
+
+- Follow the instructions above to setup the basic environment, i.e., [Prerequisites](#prerequisites) and [Step-by-step to Deploy LLaMA2-7B-chat with TinyChatEngine](#step-by-step-to-deploy-llama2-7b-chat-with-tinychatengine).
+
+- Run the shell script to set up the environment for speech-to-speech chatbot.
+  ```bash
+  cd llm
+  ./voicechat_setup.sh
+  ```
+
+- Start the speech-to-speech chat locally.
+  ```bash
+  ./chat -v # chat.exe -v on Windows
+  ```
+
+- If you encounter any issues or errors during setup, please explore [here](llm/application/README.md) to follow the step-by-step guide to debug.
+
+
 ## Backend Support
 
 | Precision | x86<br />(Intel/AMD CPU) | ARM<br />(Apple M1/M2 & RPi) | Nvidia GPU | Apple GPU |
@@ -364,12 +385,6 @@ make chat -j
 ```
 
 
-## Experimental Features
-
-### Voice Chatbot [[Demo]](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC)
-
-TinyChatEngine offers versatile capabilities suitable for various applications. Additionally, we introduce a sophisticated voice chatbot. Explore our step-by-step guide [here](llm/application/README.md) to seamlessly deploy a speech-to-speech chatbot locally on your device!
-
 ## Related Projects
 
 [TinyEngine: Memory-efficient and High-performance Neural Network Library for Microcontrollers](https://github.com/mit-han-lab/tinyengine)
@@ -378,6 +393,7 @@ TinyChatEngine offers versatile capabilities suitable for various applications.
 
 [AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration](https://github.com/mit-han-lab/llm-awq)
 
+
 ## Acknowledgement
 
 [llama.cpp](https://github.com/ggerganov/llama.cpp)
diff --git a/llm/voicechat_setup.sh b/llm/voicechat_setup.sh
new file mode 100755
index 00000000..6ab9263d
--- /dev/null
+++ b/llm/voicechat_setup.sh
@@ -0,0 +1,54 @@
+#!/bin/bash
+
+# Clone whisper.cpp and checkout the specific commit
+git clone https://github.com/ggerganov/whisper.cpp
+cd whisper.cpp
+git checkout a4bb2df
+
+# Determine the platform
+OS="$(uname)"
+if [ "$OS" = "Linux" ]; then
+    # Install SDL2 on Linux
+    sudo apt-get install libsdl2-dev
+elif [ "$OS" = "Darwin" ]; then
+    # Install SDL2 on Mac OS
+    brew install sdl2
+else
+    echo "Unsupported operating system: $OS"
+    exit 1
+fi
+
+# Apply patch and download model
+git apply ../application/sts_utils/clean_up.patch
+bash ./models/download-ggml-model.sh base.en
+
+# Check for NVIDIA GPU
+if lspci | grep -i nvidia > /dev/null; then
+    # Compile with CUDA support
+    WHISPER_CUBLAS=1 make -j stream
+else
+    # Compile without CUDA support
+    make -j stream
+fi
+
+# Set up TTS
+cd ../
+mkdir TTS
+cd TTS
+wget "https://github.com/rhasspy/piper/releases/download/v1.2.0/piper_arm64.tar.gz"
+tar -xvzf piper_arm64.tar.gz
+rm piper_arm64.tar.gz
+
+# Download default voice
+wget "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/amy/medium/en_US-amy-medium.onnx?download=true" -O en_US-amy-medium.onnx
+wget "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/amy/medium/en_US-amy-medium.onnx.json?download=true" -O en_US-amy-medium.onnx.json
+
+# Return to the parent directory and compile chat
+cd ../
+make clean
+make -j chat
+
+echo ""
+echo "TinyChatEngine's speech-to-speech chatbot setup completed successfully!"
+echo "Use './chat -v' on Linux/MacOS or 'chat.exe -v' on Windows."
+echo ""