diff --git a/.gitignore b/.gitignore
index a0ca125e..fe0af372 100644
--- a/.gitignore
+++ b/.gitignore
@@ -27,11 +27,12 @@ test_*
!test_*.cu
demo
chat
+voicechat
profile_*
!profile_*.cc
libtorch/
-transformer/chat
-transformer/output.wav
-transformer/tmpfile
-transformer/TTS
\ No newline at end of file
+llm/chat
+llm/output.wav
+llm/tmpfile
+llm/TTS
\ No newline at end of file
diff --git a/README.md b/README.md
index 2ee04d4e..657490df 100644
--- a/README.md
+++ b/README.md
@@ -34,9 +34,9 @@ Feel free to check out our [slides](assets/slides.pdf) for more details!
## News
+- **(2024/01)** 🔥We released TinyVoiceChat, a voice chatbot that can be deployed on your edge devices, such as MacBook and Jetson Orin Nano. Check out our [demo video](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC) and follow the [instructions](#deploy-speech-to-speech-chatbot-with-tinychatengine) to deploy it on your device!
- **(2023/10)** We extended the support for the coding assistant [Code Llama](#download-and-deploy-models-from-our-model-zoo). Feel free to check it out.
- **(2023/10)** ⚡We released the new CUDA backend to support Nvidia GPUs with compute capability >= 6.1 for both server and edge GPUs. Its performance is also sped up by ~40% compared to the previous version. Feel free to check it out!
-- **(2023/09)** 🔥We released TinyVoiceChat, a voice chatbot that can be deployed on your edge devices, such as MacBook and Jetson Orin Nano. Check out our [demo video](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC) and [step-by-step guide](llm/application/README.md) to deploy it on your device!
## Prerequisites
@@ -132,6 +132,27 @@ Here, we provide step-by-step instructions to deploy LLaMA2-7B-chat with TinyCha
...
```
+
+## Deploy speech-to-speech chatbot with TinyChatEngine [[Demo]](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC)
+
+TinyChatEngine offers versatile capabilities suitable for various applications, and it also includes a sophisticated voice chatbot. Here, we provide easy-to-follow instructions for deploying a speech-to-speech chatbot (LLaMA2-7B-chat) with TinyChatEngine.
+
+- Follow the instructions above to set up the basic environment, i.e., [Prerequisites](#prerequisites) and [Step-by-step to Deploy LLaMA2-7B-chat with TinyChatEngine](#step-by-step-to-deploy-llama2-7b-chat-with-tinychatengine).
+
+- Run the shell script to set up the environment for the speech-to-speech chatbot.
+ ```bash
+ cd llm
+ ./voicechat_setup.sh
+ ```
+
+- Start the speech-to-speech chat locally.
+ ```bash
+ ./chat -v # chat.exe -v on Windows
+ ```
+
+- If you encounter any issues or errors during setup, please refer to the step-by-step guide [here](llm/application/README.md) to debug.
+
+
## Backend Support
| Precision | x86 (Intel/AMD CPU) | ARM (Apple M1/M2 & RPi) | Nvidia GPU | Apple GPU |
@@ -364,12 +385,6 @@ make chat -j
```
-## Experimental Features
-
-### Voice Chatbot [[Demo]](https://youtu.be/Bw5Dm3aWMnA?si=CCvZDmq3HwowEQcC)
-
-TinyChatEngine offers versatile capabilities suitable for various applications. Additionally, we introduce a sophisticated voice chatbot. Explore our step-by-step guide [here](llm/application/README.md) to seamlessly deploy a speech-to-speech chatbot locally on your device!
-
## Related Projects
[TinyEngine: Memory-efficient and High-performance Neural Network Library for Microcontrollers](https://github.com/mit-han-lab/tinyengine)
@@ -378,6 +393,7 @@ TinyChatEngine offers versatile capabilities suitable for various applications.
[AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration](https://github.com/mit-han-lab/llm-awq)
+
## Acknowledgement
[llama.cpp](https://github.com/ggerganov/llama.cpp)
diff --git a/llm/voicechat_setup.sh b/llm/voicechat_setup.sh
new file mode 100755
index 00000000..6ab9263d
--- /dev/null
+++ b/llm/voicechat_setup.sh
@@ -0,0 +1,61 @@
+#!/bin/bash
+
+# Clone whisper.cpp and checkout the specific commit
+git clone https://github.com/ggerganov/whisper.cpp
+cd whisper.cpp
+git checkout a4bb2df
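+# (pinning this commit keeps the clean_up.patch applied below compatible with the whisper.cpp source)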
+
+# Determine the platform
+OS="$(uname)"
+if [ "$OS" = "Linux" ]; then
+ # Install SDL2 on Linux
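+    # (SDL2 provides the microphone capture backend used by whisper.cpp's stream example)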
+ sudo apt-get install libsdl2-dev
+elif [ "$OS" = "Darwin" ]; then
+ # Install SDL2 on Mac OS
+ brew install sdl2
+else
+ echo "Unsupported operating system: $OS"
+ exit 1
+fi
+
+# Apply patch and download model
+git apply ../application/sts_utils/clean_up.patch
+bash ./models/download-ggml-model.sh base.en
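+# (base.en is the English-only "base" Whisper model in ggml format, used here for speech recognition)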
+
+# Check for NVIDIA GPU
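+# (lspci is generally unavailable on macOS, so the check falls back to the non-CUDA build there)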
+if command -v lspci > /dev/null && lspci | grep -i nvidia > /dev/null; then
+ # Compile with CUDA support
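+    # (WHISPER_CUBLAS=1 selects whisper.cpp's cuBLAS-accelerated build at this commit)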
+ WHISPER_CUBLAS=1 make -j stream
+else
+ # Compile without CUDA support
+ make -j stream
+fi
+
+# Set up TTS
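+# (Piper provides the text-to-speech voice; piper_arm64.tar.gz is a 64-bit ARM build, so other architectures may need a different release asset of the same Piper version)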
+cd ../
+mkdir TTS
+cd TTS
+wget "https://github.com/rhasspy/piper/releases/download/v1.2.0/piper_arm64.tar.gz"
+tar -xvzf piper_arm64.tar.gz
+rm piper_arm64.tar.gz
+
+# Download default voice
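+# (en_US-amy-medium is a stock Piper voice; other voices from the rhasspy/piper-voices repository can be downloaded the same way)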
+wget "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/amy/medium/en_US-amy-medium.onnx?download=true" -O en_US-amy-medium.onnx
+wget "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/amy/medium/en_US-amy-medium.onnx.json?download=true" -O en_US-amy-medium.onnx.json
+
+# Return to the parent directory and compile chat
+cd ../
+make clean
+make -j chat
+
+echo ""
+echo "TinyChatEngine's speech-to-speech chatbot setup completed successfully!"
+echo "Use './chat -v' on Linux/MacOS or 'chat.exe -v' on Windows."
+echo ""