Make some edits for the comments.
Signed-off-by: conggguan <[email protected]>
conggguan committed Jun 11, 2024
1 parent b188880 commit 7a98201
Showing 2 changed files with 34 additions and 5 deletions.
33 changes: 31 additions & 2 deletions _search-plugins/neural-sparse-search.md
Original file line number Diff line number Diff line change
Expand Up @@ -263,8 +263,37 @@ GET my-nlp-index/_search
}
```
## Step 5: Create and enable the two-phase processor (optional)

This step is optional but strongly recommended because it significantly improves the performance of neural sparse queries with almost no side effects.

The `neural_sparse_two_phase_processor` was introduced in OpenSearch 2.15. It reduces the latency of neural sparse queries with a negligible loss of accuracy.

You can quickly create a pipeline using the following API example. For more information about the parameter settings and basic principles of this pipeline, see [NeuralSparse query two-phase processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-sparse-query-two-phase-processor/).

```json
PUT /_search/pipeline/two_phase_search_pipeline
{
  "request_processors": [
    {
      "neural_sparse_two_phase_processor": {
        "tag": "neural-sparse",
        "description": "Creates a two-phase processor for neural sparse search."
      }
    }
  ]
}
```
{% include copy-curl.html %}

Then choose the index you want to configure and set its `index.search.default_pipeline` setting to the pipeline name. Replace `index-name` in the URL with your index name.
```json
PUT /index-name/_settings
{
  "index.search.default_pipeline": "two_phase_search_pipeline"
}
```
{% include copy-curl.html %}
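
The two requests in this step can also be prepared from a script. The following is a minimal Python sketch, not an official client: the helper names `build_requests` and `send` are hypothetical, the description string is illustrative, and `send` assumes a cluster reachable at a base URL such as `http://localhost:9200` without authentication or TLS. Only the endpoints and setting name shown in this step are used.

```python
import json
import urllib.request

PIPELINE_NAME = "two_phase_search_pipeline"


def build_requests(index_name, pipeline_name=PIPELINE_NAME):
    """Return (method, path, body) tuples for the two calls in this step."""
    create_pipeline = (
        "PUT",
        f"/_search/pipeline/{pipeline_name}",
        {
            "request_processors": [
                {
                    "neural_sparse_two_phase_processor": {
                        "tag": "neural-sparse",
                        "description": "Creates a two-phase processor.",
                    }
                }
            ]
        },
    )
    set_default_pipeline = (
        "PUT",
        f"/{index_name}/_settings",
        {"index.search.default_pipeline": pipeline_name},
    )
    return [create_pipeline, set_default_pipeline]


def send(base_url, method, path, body):
    """Send one request. Assumes an unsecured cluster at base_url (an assumption)."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Print the requests instead of sending them, so the sketch runs offline.
    for method, path, body in build_requests("my-nlp-index"):
        print(method, path, json.dumps(body))
```

To apply the requests to a running cluster, pass each tuple to `send` together with your cluster's base URL. Create the pipeline before setting it as the index default.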



## Setting a default model on an index or field
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ grand_parent: Search pipelines

# NeuralSparse query two-phase processor

The `neural_sparse_two_phase_processor` search request processor is designed to speed up [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). It accelerates the neural sparse query by splitting the original method of scoring all documents with all tokens into two steps. In the first step, it uses the high-weight tokens to score the documents and selects the top documents. In the second step, it uses the low-weight tokens to rescore the top documents.
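
The two steps can be illustrated with a toy example. The following Python sketch is a simplified model of the idea, not the processor's implementation: the names `ratio`, `window`, and `top_k`, the rule that a token is high-weight when its weight is at least `ratio` times the largest query weight, and the sample data are all assumptions made for illustration.

```python
def dot(query, doc):
    """Sparse dot product between a query and a document (token -> weight)."""
    return sum(weight * doc.get(token, 0.0) for token, weight in query.items())


def split_tokens(query, ratio):
    """Split query tokens into high- and low-weight groups.

    `ratio` is a hypothetical threshold relative to the largest query weight.
    """
    if not query:
        return {}, {}
    threshold = ratio * max(query.values())
    high = {t: w for t, w in query.items() if w >= threshold}
    low = {t: w for t, w in query.items() if w < threshold}
    return high, low


def two_phase_search(query, docs, top_k=2, window=3, ratio=0.4):
    high, low = split_tokens(query, ratio)

    # Phase 1: score every document with the high-weight tokens only
    # and keep the best `window` candidates.
    first_pass = {doc_id: dot(high, vec) for doc_id, vec in docs.items()}
    candidates = sorted(first_pass, key=lambda d: (-first_pass[d], d))[:window]

    # Phase 2: add the low-weight token contribution for the candidates only.
    rescored = {d: first_pass[d] + dot(low, docs[d]) for d in candidates}
    ranked = sorted(rescored, key=lambda d: (-rescored[d], d))[:top_k]
    return [(d, rescored[d]) for d in ranked]


def full_search(query, docs, top_k=2):
    """Baseline: score all documents with all tokens."""
    scores = {doc_id: dot(query, vec) for doc_id, vec in docs.items()}
    ranked = sorted(scores, key=lambda d: (-scores[d], d))[:top_k]
    return [(d, scores[d]) for d in ranked]


query = {"search": 2.0, "neural": 1.5, "the": 0.2}
docs = {
    "a": {"search": 1.0, "the": 3.0},
    "b": {"search": 1.0, "neural": 1.0},
    "c": {"the": 5.0},
    "d": {"neural": 0.5},
}

if __name__ == "__main__":
    print(two_phase_search(query, docs))
    print(full_search(query, docs))
```

In this toy data set, the low-weight token `the` is applied to only three candidate documents in the second step rather than to all four, and the top two results match those of full scoring. With larger indexes, the second step touches a correspondingly smaller fraction of the documents.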

## Request fields

Expand Up @@ -120,5 +120,5 @@ GET /my-nlp-index/_search

## Metrics

In doc-only mode, the two-phase processor will reduce the query latency by 20% to 50%, depending on the index configuration and two-phase parameters.
In bi-encoder mode, the two-phase processor can decrease the query latency by up to 90%, also depending on the index configuration and two-phase parameters.
In doc-only mode, the two-phase processor reduces query latency by 20% to 50%, depending on the data distribution and dataset size in the index.
In bi-encoder mode, the two-phase processor can reduce query latency by up to 90%, also depending on the data distribution and dataset size in the index.
