This project provides a Node.js server for a chat user interface (UI) that interacts with llama.cpp, allowing users to communicate with the llama.cpp application via a web-based chat interface.
Starting from version 0.24, the model is downloaded automatically (a minimal sketch of this idea follows the one-line install commands below).
GPU (CUDA) version:
git clone https://github.com/ggerganov/llama.cpp.git; cd llama.cpp; sed -i 's/-arch=native/-arch=all/g' Makefile; make clean && LLAMA_CUDA=1 make -j 6; cd ..; git clone https://github.com/dspasyuk/llama.cui; cd llama.cui; npm install; node server.js
CPU version:
git clone https://github.com/ggerganov/llama.cpp.git; cd llama.cpp; sed -i 's/-arch=native/-arch=all/g' Makefile; make clean && make -j 6; cd ..; git clone https://github.com/dspasyuk/llama.cui; cd llama.cui; npm install; node server.js
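As mentioned above, the model is fetched automatically when it is missing. Below is a minimal, illustrative Node.js sketch of that idea; it is not llama.cui's actual implementation, the URL and paths are examples only, and Node 18+ is assumed for global fetch.

```js
// Sketch only: llama.cui's real download logic may differ. URL and paths are examples.
import fs from "node:fs";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

const MODEL_URL =
  "https://huggingface.co/dspasyuk/Meta-Llama-3-8B-Instruct-Q5_K_S-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q5_K_S.gguf";
const MODEL_PATH = "./models/Meta-Llama-3-8B-Instruct-Q5_K_S.gguf";

async function ensureModel() {
  if (fs.existsSync(MODEL_PATH)) return MODEL_PATH;   // already on disk, nothing to do
  fs.mkdirSync("./models", { recursive: true });
  const res = await fetch(MODEL_URL);                 // Node 18+ global fetch
  if (!res.ok) throw new Error(`Model download failed: ${res.status}`);
  // Stream the response straight to disk instead of buffering several GB in memory.
  await pipeline(Readable.fromWeb(res.body), fs.createWriteStream(MODEL_PATH));
  return MODEL_PATH;
}
```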
Change "--n-gpu-layers" in config.js file depending on the type of architecture used and available VRAM. For the default model (Llama3-instruct) this should be equal to 35, for compatibility it is currently set to 25, you will need at least 6Gb of VRAM to run the model, so Nvidia GTX1060 and above is a must.
- Clone the repository:
git clone https://github.com/ggerganov/llama.cpp.git
- Build llama.cpp with GPU or CPU support:
cd llama.cpp
sed -i 's/-arch=native/-arch=all/g' Makefile   # could be skipped if native arch works
make clean && LLAMA_CUDA=1 make -j 4           # for GPU CUDA version
make clean && LLAMA_CUBLAS=1 make -j 4         # for GPU cuBLAS version
or
make                                           # for CPU version
- Clone llama.cui:
git clone https://github.com/dspasyuk/llama.cui
- Download an LLM model from Hugging Face in GGUF format, for example:
a. Meta-Llama-3-8B-Instruct: https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF
or https://huggingface.co/dspasyuk/Meta-Llama-3-8B-Instruct-Q5_K_S-GGUF/blob/main/Meta-Llama-3-8B-Instruct-Q5_K_S.gguf
b. Dolphin-Mistral 7B: https://huggingface.co/TheBloke/dolphin-2.1-mistral-7B-GGUF/blob/main/dolphin-2.1-mistral-7b.Q5_0.gguf
c. Einstein-v4-7B: https://huggingface.co/LoneStriker/Einstein-v4-7B-GGUF
d. Qwen2-7B-Instruct models: https://huggingface.co/Qwen/Qwen2-7B-Instruct-GGUF/tree/main (requires flash attention enabled in config.js, e.g. -fa) (Default)
- Install the project and set your configuration parameters:
`cd llama.cui
npm install`
Open config.js and change the hostname, port, path to the llama.cpp main binary, and the model name/path (an illustrative config sketch is shown below).
To run, just type:
npm start
The default login and password are specified in the config file but could easily be integrated with a user database. Login is currently set to false; to enable it, set login to true in the config file and change the password.
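For reference, a config.js excerpt covering these settings might look roughly like the sketch below. The key names here are assumptions made for illustration (only the config.piper block later in this README is quoted from the project), so check your own config.js for the real keys.

```js
// Illustrative sketch only – key names are assumptions, not the actual llama.cui schema.
config.hostname = "localhost";                      // host the chat UI is served on
config.port = 8080;                                 // example port
config.llamacpp = "../llama.cpp/llama-cli";         // path to the llama.cpp main binary (example)
config.model = "../models/Meta-Llama-3-8B-Instruct-Q5_K_S.gguf";  // model name/path
config.login = false;                               // set to true to require the credentials below
config.username = "admin";                          // change these before enabling login
config.password = "changeme";
```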
As of version 1.15, llama.cui supports Piper for text-to-speech generation. Enable it in config.js and make sure to install Piper before running llama.cui:
git clone https://github.com/rhasspy/piper.git
cd piper
make
That should build Piper and put it in "piper/install/"
Models can be found on Hugging Face:
https://huggingface.co/rhasspy/piper-voices
The default llama.cui voice model is "librits/en_US-libritts_r-medium.onnx":
https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/libritts/high
// The config below is already a default in config.js. You will only need to set "enabled" to true in piper config
config.piper = {
  enabled: true,
  rate: 20500, // depends on your model
  output_file: 'S16_LE', // Piper outputs 16-bit mono PCM buffers, so keep this value as is
  exec: "../../piper/install/piper", // set a path to your Piper installation
  model: "/home/denis/CODE/piper/models/librits/en_US-libritts_r-medium.onnx" // set a path to your voice models
};
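To show how these settings are used, here is a minimal sketch of calling Piper from Node.js by piping text to its stdin, mirroring the shell command shown later in this section. This is not llama.cui's actual code; it assumes the config.piper object above is in scope.

```js
// Sketch only: pipes text into the Piper binary configured above and waits for the WAV file.
import { spawn } from "node:child_process";

function speak(text, outFile = "welcome.wav") {
  return new Promise((resolve, reject) => {
    const piper = spawn(config.piper.exec, [
      "--model", config.piper.model,
      "--output-file", outFile,          // same flags as the command-line example below
    ]);
    piper.stdin.write(text);             // Piper reads the text to synthesize from stdin
    piper.stdin.end();
    piper.on("error", reject);
    piper.on("close", (code) =>
      code === 0 ? resolve(outFile) : reject(new Error(`piper exited with code ${code}`))
    );
  });
}
```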
On Mac, in addition to the regular Linux instructions, extra configuration steps are required to install Piper:
`bash piper_install_mac.sh`
echo 'Welcome to the world of speech synthesis!' | "$PIPER_ROOT_FOLDER/piper/install/piper" --model "$PIPER_ROOT_FOLDER/models/librits/en_US-libritts-high.onnx" --output-file welcome.wav
If it fails at any stage, try the guide below:
First, let's install Homebrew if you do not have it yet:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
/opt/homebrew/bin/brew install espeak-ng
You should now have /opt/homebrew/Cellar/espeak-ng/1.51/lib/libespeak-ng.1.dylib (your version might be different).
cd to your piper root folder
git clone https://github.com/rhasspy/piper-phonemize.git
cd piper-phonemize
make
Once the compilation process is done, you should have libpiper_phonemize.1.dylib in ./piper-phonemize/install/lib
Now let's create the necessary links to the libraries so that Piper can find them.
Add these lines to your ~/.zprofile file before 'export PATH':
`PATH="/opt/homebrew/bin:${PATH}"
export DYLD_LIBRARY_PATH=/opt/homebrew/Cellar/espeak/1.48.04_1/lib/:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PAT=/PIPER_ROOT_DEER/piper-phonemize/lib:$DYLD_LIBRARY_PATH`
Make sure you provide the correct path in place of PIPER_ROOT_FOLDER.
Do not forget to source your env file:
source ~/.zprofile
You should now be able to run Piper as follows (make sure the path to your Piper install is correct):
echo 'Welcome to the world of speech synthesis!' | ./piper/install/piper --model ./piper/models/librits/en_US-libritts-high.onnx --output-file welcome.wav
Demo video: piper_llcui.mp4
llama.cui supports embeddings from text files (see the docs folder), MongoDB, and the Web (DuckDuckGo).
You will need to delete the existing DB folder before running llama.cui. The new database will be generated on the next embedding request (select "Use database" in the bottom-left corner of the UI to generate the database).
For data format conversion, llama.cui uses the anytotext.js library. You can place doc, xlsx, docx, txt, or other text files into the "docs" directory to create your vector database. All embeddings are computed locally using the all-MiniLM-L6-v2 model.
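As a rough illustration of the local-embedding step, the sketch below computes all-MiniLM-L6-v2 embeddings in Node.js with Transformers.js and compares two texts by cosine similarity. llama.cui's own pipeline (file conversion, chunking, storage, retrieval) is more involved, and the package choice here is an assumption, not the project's dependency list.

```js
// Sketch only: local sentence embeddings with all-MiniLM-L6-v2 via Transformers.js.
import { pipeline } from "@xenova/transformers";

const embedder = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

async function embed(text) {
  const out = await embedder(text, { pooling: "mean", normalize: true });
  return Array.from(out.data);                      // 384-dimensional vector
}

// With normalized vectors, cosine similarity reduces to a plain dot product.
const dot = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);

const query = await embed("How do I enable GPU offloading?");
const doc = await embed("Change --n-gpu-layers in config.js to offload layers to the GPU.");
console.log("similarity:", dot(query, doc));
```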