sentence-transformers-server

Single-file natural language processing API server for semantic search and sentence embedding. It uses bottle as the web server and sbert (the sentence-transformers library) to compute embeddings.

Visit the official Gitlab repo or the Github mirror.

Docker installation

# within the sentence-transformers-server root folder:
docker build -t st-server:latest .
docker run -p 9012:9012 st-server:latest

Demo

Visit the demo page to confirm your server is reachable.

Installation

Without poetry

git clone https://gitlab.com/da_doomer/sentence-transformers-server.git
cd sentence-transformers-server
pip install -U bottle
pip install -U sentence-transformers
python server.py --port 3000 --model all-MiniLM-L6-v2

With poetry

If you have poetry, it can set up a virtual environment for you automatically:

git clone https://gitlab.com/da_doomer/sentence-transformers-server.git
cd sentence-transformers-server
poetry install
poetry shell
python server.py --port 3000 --model all-MiniLM-L6-v2

Use

python server.py --port PORT_N --model MODEL_ID

See the list of models available in sbert.

See the provided javascript example.

Semantic search

Send a POST request to /semantic_search with content type application/json and the following body structure:

{
	"query": "make stick",
	"documents": [
		"place wooden plank at 2 comma 2",
		"craft stick",
		"place stick"
	]
}

The response has content type application/json and contains one similarity score per document, in the same order as the documents array:

{
	"similarities": [
		0.23651659488677979,
		0.7974543571472168,
		0.5554141402244568
	]
}
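The request and response above can be exercised from Python with only the standard library. This is a minimal client sketch, assuming the server was started locally on port 3000 as in the installation examples. The helper names `build_search_request`, `semantic_search`, and `rank_documents` are illustrative and not part of the server; only the `/semantic_search` path and the `query`, `documents`, and `similarities` fields come from the description above.

```python
import json
import urllib.request

# Assumption: the server runs locally with --port 3000, as in the installation steps.
BASE_URL = "http://localhost:3000"


def build_search_request(query, documents, base_url=BASE_URL):
    """Build the POST request for /semantic_search (nothing is sent here)."""
    body = json.dumps({"query": query, "documents": documents}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/semantic_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def semantic_search(query, documents, base_url=BASE_URL):
    """Send the request and return the list of similarities."""
    request = build_search_request(query, documents, base_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["similarities"]


def rank_documents(documents, similarities):
    """Pair each document with its score, best match first."""
    return sorted(
        zip(documents, similarities), key=lambda pair: pair[1], reverse=True
    )
```

With the sample response above, passing the three documents and their similarities to `rank_documents` would put "craft stick" first, since it has the highest score.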

Sentence embedding

Send a POST request to /embedding with content type application/json and the following body structure:

{
	"documents": [
		"place wooden plank at 2 comma 2",
		"craft stick",
		"place stick"
	]
}

The response has content type application/json and contains, for each document, a list of numbers representing a vector of norm 1, which can be used with dot product, cosine similarity, or Euclidean distance:

{
	"embeddings": [
		[...],
		[...],
		[...]
	]
}
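Because the returned vectors have norm 1, the three measures carry the same information: the dot product equals the cosine similarity, and the squared Euclidean distance equals 2 minus twice the dot product. The sketch below checks this on made-up 3-dimensional vectors that stand in for real embeddings (actual embeddings have many more dimensions, and the helper names are illustrative).

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to norm 1."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Made-up vectors standing in for two embeddings returned by /embedding.
u = normalize([1.0, 2.0, 2.0])
v = normalize([2.0, 1.0, 2.0])

# For unit vectors, dot product and cosine similarity coincide.
assert math.isclose(dot(u, v), cosine_similarity(u, v))

# For unit vectors, squared distance is 2 - 2 * dot product.
assert math.isclose(euclidean_distance(u, v) ** 2, 2 - 2 * dot(u, v))
```

In practice this means a plain dot product is enough to rank embeddings from this endpoint, and ranking by ascending Euclidean distance gives the same order.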