# Conflicts: # docs/reference/inference/put-inference.asciidoc Co-authored-by: Joan Fontanals <[email protected]>
Showing 3 changed files with 257 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,255 @@ | ||
[[infer-service-jinaai]] | ||
=== JinaAI {infer} service | ||
|
||
Creates an {infer} endpoint to perform an {infer} task with the `jinaai` service. | ||
|
||
|
||
[discrete] | ||
[[infer-service-jinaai-api-request]] | ||
==== {api-request-title} | ||
|
||
`PUT /_inference/<task_type>/<inference_id>` | ||
|
||
[discrete] | ||
[[infer-service-jinaai-api-path-params]] | ||
==== {api-path-parms-title} | ||
|
||
`<inference_id>`:: | ||
(Required, string) | ||
include::inference-shared.asciidoc[tag=inference-id] | ||
|
||
`<task_type>`:: | ||
(Required, string) | ||
include::inference-shared.asciidoc[tag=task-type] | ||
+ | ||
-- | ||
Available task types: | ||
|
||
* `text_embedding`, | ||
* `rerank`. | ||
-- | ||
|
||
[discrete] | ||
[[infer-service-jinaai-api-request-body]] | ||
==== {api-request-body-title} | ||
|
||
`chunking_settings`:: | ||
(Optional, object) | ||
include::inference-shared.asciidoc[tag=chunking-settings] | ||
|
||
`max_chunking_size`::: | ||
(Optional, integer) | ||
include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size] | ||
|
||
`overlap`::: | ||
(Optional, integer) | ||
include::inference-shared.asciidoc[tag=chunking-settings-overlap] | ||
|
||
`sentence_overlap`::: | ||
(Optional, integer) | ||
include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap] | ||
|
||
`strategy`::: | ||
(Optional, string) | ||
include::inference-shared.asciidoc[tag=chunking-settings-strategy] | ||
|
||
`service`:: | ||
(Required, string) | ||
The type of service supported for the specified task type. In this case, | ||
`jinaai`. | ||
|
||
`service_settings`:: | ||
(Required, object) | ||
include::inference-shared.asciidoc[tag=service-settings] | ||
+ | ||
-- | ||
These settings are specific to the `jinaai` service. | ||
-- | ||
|
||
`api_key`::: | ||
(Required, string) | ||
A valid API key for your JinaAI account. | ||
You can find it at https://jina.ai/embeddings/. | ||
+ | ||
-- | ||
include::inference-shared.asciidoc[tag=api-key-admonition] | ||
-- | ||
|
||
`rate_limit`::: | ||
(Optional, object) | ||
The default rate limit for the `jinaai` service is 2000 requests per minute for all task types. | ||
You can modify this using the `requests_per_minute` setting in your service settings: | ||
+ | ||
-- | ||
include::inference-shared.asciidoc[tag=request-per-minute-example] | ||
|
||
For more information about JinaAI's rate limits, see https://jina.ai/contact-sales/#rate-limit. | ||
-- | ||
+ | ||
.`service_settings` for the `rerank` task type | ||
[%collapsible%closed] | ||
===== | ||
`model_id`:: | ||
(Required, string) | ||
The name of the model to use for the {infer} task. | ||
To review the available `rerank` compatible models, refer to https://jina.ai/reranker. | ||
===== | ||
+ | ||
.`service_settings` for the `text_embedding` task type | ||
[%collapsible%closed] | ||
===== | ||
`model_id`::: | ||
(Optional, string) | ||
The name of the model to use for the {infer} task. | ||
To review the available `text_embedding` models, refer to | ||
https://jina.ai/embeddings/. | ||
`similarity`::: | ||
(Optional, string) | ||
Similarity measure. One of `cosine`, `dot_product`, `l2_norm`. | ||
The default depends on the `embedding_type`: `dot_product` for `float`, `cosine` for `int8/byte`. | ||
===== | ||
|
||
|
||
|
||
`task_settings`:: | ||
(Optional, object) | ||
include::inference-shared.asciidoc[tag=task-settings] | ||
+ | ||
.`task_settings` for the `rerank` task type | ||
[%collapsible%closed] | ||
===== | ||
`return_documents`:: | ||
(Optional, boolean) | ||
Specifies whether to return the document text in the results. | ||
`top_n`:: | ||
(Optional, integer) | ||
The number of most relevant documents to return. Defaults to the number of documents. | ||
If this {infer} endpoint is used in a `text_similarity_reranker` retriever query and `top_n` is set, it must be greater than or equal to `rank_window_size` in the query. | ||
===== | ||
+ | ||
.`task_settings` for the `text_embedding` task type | ||
[%collapsible%closed] | ||
===== | ||
`task`::: | ||
(Optional, string) | ||
Specifies the task passed to the model. | ||
Valid values are: | ||
* `classification`: use it for embeddings passed through a text classifier. | ||
* `clustering`: use it for embeddings run through a clustering algorithm. | ||
* `ingest`: use it for storing document embeddings in a vector database. | ||
* `search`: use it for storing embeddings of search queries run against a vector database to find relevant documents. | ||
===== | ||
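As a minimal sketch of how the `text_embedding` settings above fit together, the following Python snippet assembles a request body for `PUT _inference/text_embedding/<inference_id>` and rejects a `task` value that is not in the list above. The helper name `build_embedding_endpoint_body` is hypothetical and not part of any Elasticsearch client. The snippet only builds the JSON and does not send it.

```python
import json

# Task values listed above for the `text_embedding` task type.
VALID_TASKS = {"classification", "clustering", "ingest", "search"}


def build_embedding_endpoint_body(api_key, model_id, task=None):
    """Hypothetical helper: build the body for a jinaai text_embedding endpoint."""
    body = {
        "service": "jinaai",
        "service_settings": {"api_key": api_key, "model_id": model_id},
    }
    if task is not None:
        if task not in VALID_TASKS:
            raise ValueError(f"unsupported task: {task!r}")
        # `task_settings` is optional, so it is only added when a task is given.
        body["task_settings"] = {"task": task}
    return body


body = build_embedding_endpoint_body("<api_key>", "jina-embeddings-v3", task="ingest")
print(json.dumps(body, indent=2))
```

Validating the value on the client side is a design choice for this sketch. The service itself remains the authority on which values it accepts.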
|
||
|
||
[discrete] | ||
[[inference-example-jinaai]] | ||
==== JinaAI service examples | ||
|
||
The following examples demonstrate how to create {infer} endpoints for the `text_embedding` and `rerank` task types using the JinaAI service and how to use them in search requests. | ||
|
||
First, we create the `embeddings` service: | ||
|
||
[source,console] | ||
------------------------------------------------------------ | ||
PUT _inference/text_embedding/jinaai-embeddings | ||
{ | ||
"service": "jinaai", | ||
"service_settings": { | ||
"model_id": "jina-embeddings-v3", | ||
"api_key": "<api_key>" | ||
} | ||
} | ||
------------------------------------------------------------ | ||
// TEST[skip:uses ML] | ||
|
||
Then, we create the `rerank` service: | ||
[source,console] | ||
------------------------------------------------------------ | ||
PUT _inference/rerank/jinaai-rerank | ||
{ | ||
"service": "jinaai", | ||
"service_settings": { | ||
"api_key": "<api_key>", | ||
"model_id": "jina-reranker-v2-base-multilingual" | ||
}, | ||
"task_settings": { | ||
"top_n": 10, | ||
"return_documents": true | ||
} | ||
} | ||
------------------------------------------------------------ | ||
// TEST[skip:uses ML] | ||
|
||
Now we can create an index that uses the `jinaai-embeddings` service to index the documents. | ||
|
||
[source,console] | ||
------------------------------------------------------------ | ||
PUT jinaai-index | ||
{ | ||
"mappings": { | ||
"properties": { | ||
"content": { | ||
"type": "semantic_text", | ||
"inference_id": "jinaai-embeddings" | ||
} | ||
} | ||
} | ||
} | ||
------------------------------------------------------------ | ||
// TEST[skip:uses ML] | ||
|
||
[source,console] | ||
------------------------------------------------------------ | ||
PUT jinaai-index/_bulk | ||
{ "index" : { "_index" : "jinaai-index", "_id" : "1" } } | ||
{"content": "Sarah Johnson is a talented marine biologist working at the Oceanographic Institute. Her groundbreaking research on coral reef ecosystems has garnered international attention and numerous accolades."} | ||
{ "index" : { "_index" : "jinaai-index", "_id" : "2" } } | ||
{"content": "She spends months at a time diving in remote locations, meticulously documenting the intricate relationships between various marine species. "} | ||
{ "index" : { "_index" : "jinaai-index", "_id" : "3" } } | ||
{"content": "Her dedication to preserving these delicate underwater environments has inspired a new generation of conservationists."} | ||
------------------------------------------------------------ | ||
// TEST[skip:uses ML] | ||
|
||
Now, with the index created, we can search with and without the reranker service. | ||
|
||
[source,console] | ||
------------------------------------------------------------ | ||
GET jinaai-index/_search | ||
{ | ||
"query": { | ||
"semantic": { | ||
"field": "content", | ||
"query": "who inspired taking care of the sea?" | ||
} | ||
} | ||
} | ||
------------------------------------------------------------ | ||
// TEST[skip:uses ML] | ||
|
||
[source,console] | ||
------------------------------------------------------------ | ||
POST jinaai-index/_search | ||
{ | ||
"retriever": { | ||
"text_similarity_reranker": { | ||
"retriever": { | ||
"standard": { | ||
"query": { | ||
"semantic": { | ||
"field": "content", | ||
"query": "who inspired taking care of the sea?" | ||
} | ||
} | ||
} | ||
}, | ||
"field": "content", | ||
"rank_window_size": 10, | ||
"inference_id": "jinaai-rerank", | ||
"inference_text": "who inspired taking care of the sea?" | ||
} | ||
} | ||
} | ||
------------------------------------------------------------ | ||
// TEST[skip:uses ML] |
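The `rerank` task settings state that, when `top_n` is set on an endpoint used by a `text_similarity_reranker` retriever, it must be greater than or equal to the query's `rank_window_size`. The check below is a minimal sketch of that rule in Python. The function name `top_n_is_compatible` is hypothetical, and the snippet assumes that an unset `top_n` is represented as `None`.

```python
def top_n_is_compatible(top_n, rank_window_size):
    """Return True if the endpoint's top_n can serve the given rank_window_size.

    An unset top_n (None) places no limit, so any window size is accepted.
    """
    return top_n is None or top_n >= rank_window_size


# Illustrative values only.
print(top_n_is_compatible(10, 100))    # top_n smaller than the window
print(top_n_is_compatible(100, 100))   # top_n equal to the window
print(top_n_is_compatible(None, 100))  # top_n not set on the endpoint
```

Running a check like this before issuing the search can surface a mismatch between the endpoint configuration and the query early.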