[Docs] ggml: add README for the embedding example with CI job (#114)
Signed-off-by: dm4 <[email protected]>
dm4 authored Mar 12, 2024
1 parent 059bbd0 commit 8612e9f
Showing 2 changed files with 53 additions and 5 deletions.
17 changes: 12 additions & 5 deletions .github/workflows/llama.yml
@@ -132,6 +132,18 @@ jobs:
target/wasm32-wasi/release/wasmedge-ggml-multimodel.wasm \
'describe this picture please'
- name: Embedding Example
run: |
test -f ~/.wasmedge/env && source ~/.wasmedge/env
cd wasmedge-ggml/embedding
curl -LO https://huggingface.co/second-state/All-MiniLM-L6-v2-Embedding-GGUF/resolve/main/all-MiniLM-L6-v2-ggml-model-f16.gguf
cargo build --target wasm32-wasi --release
time wasmedge --dir .:. \
--nn-preload default:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
target/wasm32-wasi/release/wasmedge-ggml-llama-embedding.wasm \
default \
'hello world'
- name: Build llama-stream
run: |
cd wasmedge-ggml/llama-stream
@@ -142,11 +154,6 @@ jobs:
cd wasmedge-ggml/chatml
cargo build --target wasm32-wasi --release
- name: Build embedding
run: |
cd wasmedge-ggml/embedding
cargo build --target wasm32-wasi --release
- name: Build llava-base64-stream
run: |
cd wasmedge-ggml/llava-base64-stream
41 changes: 41 additions & 0 deletions wasmedge-ggml/embedding/README.md
@@ -0,0 +1,41 @@
# Embedding Example For WASI-NN with GGML Backend

> [!NOTE]
> Please refer to [wasmedge-ggml/README.md](../README.md) for a general introduction and for instructions on setting up the WASI-NN plugin with the GGML backend. This document focuses on the embedding example.

## Get the Model

In this example, we are going to use the pre-converted `all-MiniLM-L6-v2` model.

Download the model:

```bash
curl -LO https://huggingface.co/second-state/All-MiniLM-L6-v2-Embedding-GGUF/resolve/main/all-MiniLM-L6-v2-ggml-model-f16.gguf
```

## Parameters

> [!NOTE]
> Please check the parameters section of [wasmedge-ggml/README.md](https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml#parameters) first.

## Execute

Run the WASM module with `wasmedge`, using the named model feature to preload the model:

```console
$ wasmedge --dir .:. \
--nn-preload default:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
wasmedge-ggml-llama-embedding.wasm default

Prompt:
What's the capital of the United States?
Raw Embedding Output: {"n_embedding": 384, "embedding": [0.5426152349,-0.03840282559,-0.03644151986,0.3677068651,-0.115977712...(omitted)...,-0.003531290218]}
Interact with Embedding:
N_Embd: 384
Show the first 5 elements:
embd[0] = 0.5426152349
embd[1] = -0.03840282559
embd[2] = -0.03644151986
embd[3] = 0.3677068651
embd[4] = -0.115977712
```
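
The raw output above is a JSON object with an `n_embedding` count and an `embedding` array. As a rough illustration of how the "Interact with Embedding" lines could be derived from that string, here is a minimal, dependency-free Rust sketch. The function name `parse_embedding` and the shortened sample string are hypothetical and are not taken from the example's source; a real program would more likely use a JSON library than this hand-rolled scan.

```rust
/// Hypothetical helper: returns the numbers in the `"embedding"` array of a
/// string shaped like the raw output, or `None` if the array is missing or
/// holds something that is not a number.
fn parse_embedding(raw: &str) -> Option<Vec<f64>> {
    // Search with the quotes included so that "n_embedding" is not matched.
    let key = "\"embedding\"";
    let after_key = &raw[raw.find(key)? + key.len()..];

    // Locate the brackets that delimit the flat array of numbers.
    let open = after_key.find('[')?;
    let close = open + after_key[open..].find(']')?;
    let inner = after_key[open + 1..close].trim();

    if inner.is_empty() {
        return Some(Vec::new());
    }

    // Collecting into Option<Vec<_>> yields None if any item fails to parse.
    inner
        .split(',')
        .map(|item| item.trim().parse::<f64>().ok())
        .collect()
}

fn main() {
    // Shortened sample in the same shape as the raw output; only the first
    // five values from the run above are included here.
    let raw = r#"{"n_embedding": 5, "embedding": [0.5426152349,-0.03840282559,-0.03644151986,0.3677068651,-0.115977712]}"#;

    match parse_embedding(raw) {
        Some(embd) => {
            println!("N_Embd: {}", embd.len());
            for (i, value) in embd.iter().take(5).enumerate() {
                println!("embd[{}] = {}", i, value);
            }
        }
        None => eprintln!("could not parse the embedding output"),
    }
}
```

The scan assumes the array is flat and contains only numbers, which matches the output shown above but would not hold for arbitrary JSON.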
