[Backport 2.17] Add doc on disk-based vector search #8345

Merged
merged 1 commit into from
Sep 20, 2024
193 changes: 193 additions & 0 deletions _search-plugins/knn/disk-based-vector-search.md
@@ -0,0 +1,193 @@
---
layout: default
title: Disk-based vector search
nav_order: 16
parent: k-NN search
has_children: false
---

# Disk-based vector search
**Introduced 2.17**
{: .label .label-purple}

For low-memory environments, OpenSearch provides _disk-based vector search_, which significantly reduces the operational costs for vector workloads. Disk-based vector search uses [binary quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#binary-quantization) to compress vectors, which reduces their memory requirements. This optimization trades a slight increase in search latency for large memory savings while maintaining strong recall.

To use disk-based vector search, set the [`mode`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#vector-workload-modes) parameter to `on_disk` for your vector field type. This setting configures your index to use secondary storage.

## Creating an index for disk-based vector search

To create an index for disk-based vector search, send the following request:

```json
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "innerproduct",
        "data_type": "float",
        "mode": "on_disk"
      }
    }
  }
}
```
{% include copy-curl.html %}

By default, the `on_disk` mode configures the index to use the `faiss` engine and `hnsw` method. The default [`compression_level`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#compression-levels) of `32x` reduces the amount of memory the vectors require by a factor of 32. To preserve the search recall, rescoring is enabled by default. A search on a disk-optimized index runs in two phases: The compressed index is searched first, and then the results are rescored using full-precision vectors loaded from disk.
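
The memory reduction described above can be estimated with simple arithmetic. The following Python sketch is illustrative only: the `vector_memory_bytes` helper is hypothetical, it assumes that each uncompressed `float` dimension occupies 32 bits, and it counts only the vector data, ignoring the HNSW graph and any other overhead:

```python
def vector_memory_bytes(num_vectors, dimension, compression_level="1x"):
    """Estimate the bytes needed to hold the vectors alone.

    Assumption: an uncompressed `float` dimension takes 4 bytes (32 bits).
    Graph structures and other per-index overhead are not included.
    """
    factor = int(compression_level.rstrip("x"))  # "32x" -> 32
    full_precision_bytes = num_vectors * dimension * 4
    return full_precision_bytes / factor


# Hypothetical workload: 1 million vectors with 768 dimensions.
for level in ("1x", "16x", "32x"):
    megabytes = vector_memory_bytes(1_000_000, 768, level) / 1_000_000
    print(f"{level}: about {megabytes:.0f} MB")
```

Under these assumptions, the hypothetical workload drops from about 3,072 MB of vector data at full precision to about 96 MB at `32x`, which corresponds to 1 bit per dimension.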

To reduce the compression level, provide the `compression_level` parameter when creating the index mapping:

```json
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "innerproduct",
        "data_type": "float",
        "mode": "on_disk",
        "compression_level": "16x"
      }
    }
  }
}
```
{% include copy-curl.html %}

For more information about the `compression_level` parameter, see [Compression levels]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#compression-levels). Note that for `4x` compression, the `lucene` engine will be used.
{: .note}

If you need finer-grained tuning, you can override additional k-NN parameters in the method definition. For example, to improve recall, increase the `ef_construction` parameter value:

```json
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "innerproduct",
        "data_type": "float",
        "mode": "on_disk",
        "method": {
          "params": {
            "ef_construction": 512
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

The `on_disk` mode works only with the `float` data type.
{: .note}

## Ingestion

You can ingest documents into a disk-optimized vector index in the same way as into a regular vector index. To index several documents in bulk, send the following request:

```json
POST _bulk
{ "index": { "_index": "my-vector-index", "_id": "1" } }
{ "my_vector_field": [1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5], "price": 12.2 }
{ "index": { "_index": "my-vector-index", "_id": "2" } }
{ "my_vector_field": [2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5], "price": 7.1 }
{ "index": { "_index": "my-vector-index", "_id": "3" } }
{ "my_vector_field": [3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5], "price": 12.9 }
{ "index": { "_index": "my-vector-index", "_id": "4" } }
{ "my_vector_field": [4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5], "price": 1.2 }
{ "index": { "_index": "my-vector-index", "_id": "5" } }
{ "my_vector_field": [5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5], "price": 3.7 }
{ "index": { "_index": "my-vector-index", "_id": "6" } }
{ "my_vector_field": [6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5], "price": 10.3 }
{ "index": { "_index": "my-vector-index", "_id": "7" } }
{ "my_vector_field": [7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5], "price": 5.5 }
{ "index": { "_index": "my-vector-index", "_id": "8" } }
{ "my_vector_field": [8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5], "price": 4.4 }
{ "index": { "_index": "my-vector-index", "_id": "9" } }
{ "my_vector_field": [9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5], "price": 8.9 }
```
{% include copy-curl.html %}
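
If you generate bulk requests programmatically, note that the request body is newline-delimited JSON: an action line followed by a document line, with a newline after the last line. The following Python sketch builds such a body using only the standard library. The `build_bulk_body` helper is hypothetical, and sending the body to the cluster is left to whichever HTTP client you use:

```python
import json


def build_bulk_body(index_name, documents):
    """Return a newline-delimited JSON body for the `_bulk` API.

    `documents` maps a document ID to the document source.
    """
    lines = []
    for doc_id, source in documents.items():
        # Action line: which index and ID the next line is written to.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself, including the vector field.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


# Vectors mirror the request above; other fields are omitted for brevity.
documents = {str(i): {"my_vector_field": [i + 0.5] * 8} for i in range(1, 10)}
body = build_bulk_body("my-vector-index", documents)
print(body.splitlines()[0])
```

Each vector must contain exactly as many values as the `dimension` specified in the mapping (8 in this example).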

## Search

Search is also performed in the same way as in other index configurations. The key difference is that, by default, the `oversample_factor` of the rescore parameter is set to `3.0` (unless you override the `compression_level`). For more information, see [Rescoring quantized results using full precision]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#rescoring-quantized-results-using-full-precision). To perform vector search on a disk-optimized index, provide the search vector:

```json
GET my-vector-index/_search
{
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5],
        "k": 5
      }
    }
  }
}
```
{% include copy-curl.html %}

Similarly to other index configurations, you can override k-NN parameters in the search request:

```json
GET my-vector-index/_search
{
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5],
        "k": 5,
        "method_params": {
          "ef_search": 512
        },
        "rescore": {
          "oversample_factor": 10.0
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
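
To build intuition for how two-phase search and `oversample_factor` interact, consider the following toy Python sketch. It is not the OpenSearch implementation: it assumes a simplified 1-bit quantizer that thresholds each dimension at the mean of the stored vectors, compares compressed codes by Hamming distance, and assumes that the first phase keeps `k * oversample_factor` candidates. All function names are hypothetical:

```python
import math


def binarize(vector, thresholds):
    """Keep 1 bit per dimension: 1 if the value exceeds its threshold."""
    return [1 if value > t else 0 for value, t in zip(vector, thresholds)]


def hamming_distance(a, b):
    """Count the positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def two_phase_search(query, docs, k, oversample_factor):
    dimension = len(query)
    # Assumption: the per-dimension mean of the stored vectors is the threshold.
    thresholds = [
        sum(vector[i] for vector in docs.values()) / len(docs)
        for i in range(dimension)
    ]
    codes = {doc_id: binarize(vector, thresholds) for doc_id, vector in docs.items()}
    query_code = binarize(query, thresholds)

    # Phase 1: rank by distance between compressed codes and keep
    # k * oversample_factor candidates (capped at the number of documents).
    num_candidates = min(len(docs), math.ceil(k * oversample_factor))
    candidates = sorted(
        docs, key=lambda doc_id: hamming_distance(query_code, codes[doc_id])
    )[:num_candidates]

    # Phase 2: rescore only the candidates using full-precision vectors.
    rescored = sorted(
        candidates,
        key=lambda doc_id: inner_product(query, docs[doc_id]),
        reverse=True,
    )
    return rescored[:k]


# Same vectors and query as in the requests above.
docs = {str(i): [i + 0.5] * 8 for i in range(1, 10)}
query = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]

print(two_phase_search(query, docs, k=2, oversample_factor=3.0))  # ['6', '5']
print(two_phase_search(query, docs, k=2, oversample_factor=4.5))  # ['9', '8']
```

In this sketch, an `oversample_factor` of `3.0` passes six candidates to the rescoring phase, which misses the documents with the largest inner products, while a factor of `4.5` passes all nine and recovers them. A larger factor gives the rescoring phase more candidates with which to correct first-phase errors, at the cost of more full-precision reads.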

[Radial search]({{site.url}}{{site.baseurl}}/search-plugins/knn/radial-search-knn/) does not support disk-based vector search.
{: .note}

## Model-based indexes

For [model-based indexes]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model), you can set the `mode` parameter to `on_disk` in the training request in the same way that you would set it during index creation. By default, the `on_disk` mode uses the [Faiss IVF method]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#supported-faiss-methods) and a compression level of `32x`. To run the training API, send the following request:

```json
POST /_plugins/_knn/models/_train/test-model
{
  "training_index": "train-index-name",
  "training_field": "train-field-name",
  "dimension": 8,
  "max_training_vector_count": 1200,
  "search_size": 100,
  "description": "My model",
  "space_type": "innerproduct",
  "mode": "on_disk"
}
```
{% include copy-curl.html %}

This command assumes that training data has been ingested into the `train-index-name` index. For more information, see [Building a k-NN index from a model]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model).
{: .note}

You can override the `compression_level` for disk-optimized indexes in the same way as for regular k-NN indexes.


## Next steps

- For more information about binary quantization, see [Binary quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#binary-quantization).
- For more information about k-NN vector workload modes, see [Vector workload modes]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#vector-workload-modes).