From fde4655b96995dc4f6776d857185ee2e1945c869 Mon Sep 17 00:00:00 2001 From: gui machiavelli Date: Wed, 18 Dec 2024 18:19:23 +0100 Subject: [PATCH] v1.12: New `/batches` route (#3072) --------- Co-authored-by: Tamo --- .code-samples.meilisearch.yaml | 6 + config/sidebar-reference.json | 5 + learn/async/asynchronous_operations.mdx | 4 + learn/async/filtering_tasks.mdx | 4 + learn/async/paginating_tasks.mdx | 4 + reference/api/batches.mdx | 250 ++++++++++++++++++++++++ reference/api/tasks.mdx | 17 +- 7 files changed, 287 insertions(+), 3 deletions(-) create mode 100644 reference/api/batches.mdx diff --git a/.code-samples.meilisearch.yaml b/.code-samples.meilisearch.yaml index 4220f2ce53..e73686f8c8 100644 --- a/.code-samples.meilisearch.yaml +++ b/.code-samples.meilisearch.yaml @@ -1370,6 +1370,12 @@ update_localized_attribute_settings_1: |- reset_localized_attribute_settings_1: |- curl \ -X DELETE 'http://localhost:7700/indexes/INDEX_NAME/settings/localized-attributes' +get_all_batches_1: |- + curl \ + -X GET 'http://localhost:7700/batches' +get_batch_1: |- + curl \ + -X GET 'http://localhost:7700/batches/BATCH_UID' ### Code samples for experimental features get_embedders_1: |- diff --git a/config/sidebar-reference.json b/config/sidebar-reference.json index 34fcff1780..26c07e1180 100644 --- a/config/sidebar-reference.json +++ b/config/sidebar-reference.json @@ -43,6 +43,11 @@ "label": "Tasks", "slug": "tasks" }, + { + "source": "reference/api/batches.mdx", + "label": "Batches", + "slug": "batches" + }, { "source": "reference/api/keys.mdx", "label": "Keys", diff --git a/learn/async/asynchronous_operations.mdx b/learn/async/asynchronous_operations.mdx index 4da4560449..9f3d685689 100644 --- a/learn/async/asynchronous_operations.mdx +++ b/learn/async/asynchronous_operations.mdx @@ -137,6 +137,10 @@ When you make a [request for an asynchronous operation](#which-operations-are-as **Terminating a Meilisearch instance in the middle of an asynchronous operation is completely safe** and will never adversely affect the database. +### Task batches + +Meilisearch processes tasks in batches, grouping tasks for the best possible performance. In most cases, batching should be transparent and have no impact on the overall task workflow. Use [the `/batches` route](/reference/api/batches) to obtain more information on batches and how they are processing your tasks. + ### Canceling tasks You can cancel a task while it is `enqueued` or `processing` by using [the cancel tasks endpoint](/reference/api/tasks#cancel-tasks). Doing so changes a task's `status` to `canceled`. diff --git a/learn/async/filtering_tasks.mdx b/learn/async/filtering_tasks.mdx index 911b896ca7..1985ac5735 100644 --- a/learn/async/filtering_tasks.mdx +++ b/learn/async/filtering_tasks.mdx @@ -9,6 +9,10 @@ Querying the [get tasks endpoint](/reference/api/tasks#get-tasks) returns all ta This guide shows you how to use query parameters to filter tasks and obtain a more readable list of asynchronous operations. + +Filtering batches with [the `/batches` route](/reference/api/batches) follows the same rules as filtering tasks. Keep in mind that many `/batches` parameters such as `uids` target the tasks included in batches, instead of the batches themselves. 
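+
+For instance, a request like the one below (an illustrative sketch assuming a local Meilisearch instance on the default port that has at least one failed document addition) returns the batches that contain failed `documentAdditionOrUpdate` tasks, rather than filtering on any property of the batches themselves:
+
+```bash
+curl \
+  -X GET 'http://localhost:7700/batches?types=documentAdditionOrUpdate&statuses=failed'
+```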
+
+
 ## Requirements
 
 - a command-line terminal
diff --git a/learn/async/paginating_tasks.mdx b/learn/async/paginating_tasks.mdx
index 6806976cf4..b1c326c360 100644
--- a/learn/async/paginating_tasks.mdx
+++ b/learn/async/paginating_tasks.mdx
@@ -7,6 +7,10 @@ description: Meilisearch uses a task queue to handle asynchronous operations. Th
 
 By default, Meilisearch returns a list of 20 tasks for each request when you query the [get tasks endpoint](/reference/api/tasks#get-tasks). This guide shows you how to navigate the task list using query parameters.
 
+
+Paginating batches with [the `/batches` route](/reference/api/batches) follows the same rules as paginating tasks.
+
+
 ## Configuring the number of returned tasks
 
 Use the `limit` parameter to change the number of returned tasks:
diff --git a/reference/api/batches.mdx b/reference/api/batches.mdx
new file mode 100644
index 0000000000..a1fd1ff894
--- /dev/null
+++ b/reference/api/batches.mdx
@@ -0,0 +1,250 @@
+---
+title: Batches — Meilisearch API reference
+description: The /batches route allows you to monitor how Meilisearch is grouping and processing asynchronous operations.
+---
+
+# Batches
+
+The `/batches` route gives information about the progress of batches of [asynchronous operations](/learn/async/asynchronous_operations).
+
+## Batch object
+
+```json
+{
+  "uid": 0,
+  "details": {
+    "receivedDocuments": 6,
+    "indexedDocuments": 6
+  },
+  "stats": {
+    "totalNbTasks": 1,
+    "status": {
+      "succeeded": 1
+    },
+    "types": {
+      "documentAdditionOrUpdate": 1
+    },
+    "indexUids": {
+      "INDEX_NAME": 1
+    }
+  },
+  "duration": "PT0.250518S",
+  "startedAt": "2024-12-10T15:20:30.18182Z",
+  "finishedAt": "2024-12-10T15:20:30.432338Z",
+  "progress": {
+    "steps": [
+      {
+        "currentStep": "extracting words",
+        "finished": 2,
+        "total": 9
+      },
+      {
+        "currentStep": "document",
+        "finished": 30546,
+        "total": 31944
+      }
+    ],
+    "percentage": 32.8471
+  }
+}
+```
+
+### `uid`
+
+**Type**: Integer<br />
+**Description**: Unique sequential identifier of the batch. Starts at `0` and increases by one for every new batch.
+
+### `details`
+
+**Type**: Object<br />
+**Description**: Basic information on the types of tasks in a batch. Consult the [task object reference](/reference/api/tasks#details) for an exhaustive list of possible values.
+
+### `progress`
+
+**Type**: Object<br />
+**Description**: Object containing two fields: `steps` and `percentage`. Once Meilisearch has fully processed a batch, its `progress` is set to `null`.
+
+#### `steps`
+
+Information about the current operations Meilisearch is performing in this batch. A step may consist of multiple substeps.
+
+| Name              | Description                                        |
+| :-----------------| :------------------------------------------------- |
+| **`currentStep`** | A string describing the operation                  |
+| **`total`**       | The total number of operations in the step         |
+| **`finished`**    | The number of operations Meilisearch has completed |
+
+
+If Meilisearch is taking longer than expected to process a batch, monitor the `steps` array. If the `finished` field of the last item in the `steps` array does not update, Meilisearch may be stuck.
+
+
+#### `percentage`
+
+The percentage of completed operations, calculated from all current steps and substeps. This value is a rough estimate and may not always reflect the current state of the batch, as some steps are processed more quickly than others.
+
+### `stats`
+
+**Type**: Object<br />
+**Description**: Detailed information on the payload of all tasks in a batch.
+
+#### `totalNbTasks`
+
+Number of tasks in the batch.
+
+#### `status`
+
+Object listing the [status of each task](/reference/api/tasks#status) in the batch. Contains five keys whose values correspond to the number of tasks with that status.
+
+#### `types`
+
+Object listing the `types` of tasks in the batch and the number of tasks of each type.
+
+#### `indexUids`
+
+Object listing the number of tasks in the batch, grouped by the indexes they affect.
+
+### `duration`
+
+**Type**: String<br />
+**Description**: The total elapsed time the batch spent in the `processing` state, in [ISO 8601](https://www.ionos.com/digitalguide/websites/web-development/iso-8601/) format. Set to `null` while the batch is processing tasks.
+
+### `startedAt`
+
+**Type**: String<br />
+**Description**: The date and time when the batch began `processing`, in [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format.
+
+### `finishedAt`
+
+**Type**: String<br />
+**Description**: The date and time when the batch finished processing its tasks, whether they `failed`, `succeeded`, or were `canceled`, in [RFC 3339](https://www.ietf.org/rfc/rfc3339.txt) format.
+
+## Get batches
+
+
+
+List all batches, regardless of index. The batch objects are contained in the `results` array.
+
+Batches are always returned in descending order of `uid`. This means that by default, **the most recently created batch objects appear first**.
+
+Batch results are [paginated](/learn/async/paginating_tasks) and can be [filtered](/learn/async/filtering_tasks) with query parameters.
+
+
+Some query parameters for `/batches`, such as `uids` and `statuses`, target tasks instead of batches.
+
+For example, `?uids=0` returns a batch containing the task with a `taskUid` equal to `0`, instead of a batch with a `batchUid` equal to `0`.
+
+
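+To illustrate, the following request (an illustrative example assuming Meilisearch is running locally on the default port and has processed at least one task) returns the batch that includes task `0`, rather than the batch whose own `uid` is `0`:
+
+```bash
+curl \
+  -X GET 'http://localhost:7700/batches?uids=0'
+```
+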
+### Query parameters
+
+| Query Parameter        | Default Value                   | Description                                                                                                          |
+| ---------------------- | ------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
+| **`uids`**             | `*` (all task uids)             | Select batches containing the tasks with the specified `uid`s. Separate multiple task `uids` with a comma (`,`)       |
+| **`batchUids`**        | `*` (all batch uids)            | Filter batches by their `uid`. Separate multiple batch `uids` with a comma (`,`)                                      |
+| **`indexUids`**        | `*` (all indexes)               | Select batches containing tasks affecting the specified indexes. Separate multiple `indexUids` with a comma (`,`)     |
+| **`statuses`**         | `*` (all statuses)              | Select batches containing tasks with the specified `status`. Separate multiple task `statuses` with a comma (`,`)     |
+| **`types`**            | `*` (all types)                 | Select batches containing tasks with the specified `type`. Separate multiple task `types` with a comma (`,`)          |
+| **`limit`**            | `20`                            | Number of batches to return                                                                                           |
+| **`from`**             | `uid` of the last created batch | `uid` of the first batch returned                                                                                     |
+| **`reverse`**          | `false`                         | If `true`, returns results in the reverse order, from oldest to most recent                                           |
+| **`beforeEnqueuedAt`** | `*` (all tasks)                 | Select batches containing tasks with an `enqueuedAt` field before the specified date                                  |
+| **`beforeStartedAt`**  | `*` (all tasks)                 | Select batches containing tasks with a `startedAt` field before the specified date                                    |
+| **`beforeFinishedAt`** | `*` (all tasks)                 | Select batches containing tasks with a `finishedAt` field before the specified date                                   |
+| **`afterEnqueuedAt`**  | `*` (all tasks)                 | Select batches containing tasks with an `enqueuedAt` field after the specified date                                   |
+| **`afterStartedAt`**   | `*` (all tasks)                 | Select batches containing tasks with a `startedAt` field after the specified date                                     |
+| **`afterFinishedAt`**  | `*` (all tasks)                 | Select batches containing tasks with a `finishedAt` field after the specified date                                    |
+
+### Response
+
+| Name          | Type    | Description                                                                                                                      |
+| :------------ | :------ | :------------------------------------------------------------------------------------------------------------------------------ |
+| **`results`** | Array   | An array of [batch objects](#batch-object)                                                                                        |
+| **`total`**   | Integer | Total number of batches matching the filter or query                                                                              |
+| **`limit`**   | Integer | Number of batches returned                                                                                                        |
+| **`from`**    | Integer | `uid` of the first batch returned                                                                                                 |
+| **`next`**    | Integer | Value passed to `from` to view the next "page" of results. When the value of `next` is `null`, there are no more batches to view  |
+
+### Example
+
+
+
+#### Response: `200 Ok`
+
+```json
+{
+  "results": [
+    {
+      "uid": 2,
+      "details": {
+        "stopWords": [
+          "of",
+          "the"
+        ]
+      },
+      "progress": null,
+      "stats": {
+        "totalNbTasks": 1,
+        "status": {
+          "succeeded": 1
+        },
+        "types": {
+          "settingsUpdate": 1
+        },
+        "indexUids": {
+          "INDEX_NAME": 1
+        }
+      },
+      "duration": "PT0.110083S",
+      "startedAt": "2024-12-10T15:49:04.995321Z",
+      "finishedAt": "2024-12-10T15:49:05.105404Z"
+    }
+  ],
+  "total": 3,
+  "limit": 1,
+  "from": 2,
+  "next": 1
+}
+```
+
+## Get one batch
+
+
+
+Get a single batch.
+
+### Path parameters
+
+| Name              | Type   | Description                          |
+| :---------------- | :----- | :----------------------------------- |
+| **`batch_uid`** * | String | [`uid`](#uid) of the requested batch |
+
+### Example
+
+
+
+#### Response: `200 Ok`
+
+```json
+{
+  "uid": 1,
+  "details": {
+    "receivedDocuments": 1,
+    "indexedDocuments": 1
+  },
+  "progress": null,
+  "stats": {
+    "totalNbTasks": 1,
+    "status": {
+      "succeeded": 1
+    },
+    "types": {
+      "documentAdditionOrUpdate": 1
+    },
+    "indexUids": {
+      "INDEX_NAME": 1
+    }
+  },
+  "duration": "PT0.364788S",
+  "startedAt": "2024-12-10T15:48:49.672141Z",
+  "finishedAt": "2024-12-10T15:48:50.036929Z"
+}
+```
diff --git a/reference/api/tasks.mdx b/reference/api/tasks.mdx
index c3d290517f..f86cd9baa1 100644
--- a/reference/api/tasks.mdx
+++ b/reference/api/tasks.mdx
@@ -12,7 +12,8 @@ The `/tasks` route gives information about the progress of [asynchronous operati
 ```json
 {
   "uid": 4,
-  "indexUid" :"movie",
+  "batchUid": 0,
+  "indexUid": "movie",
   "status": "failed",
   "type": "indexDeletion",
   "canceledBy": null,
@@ -35,10 +36,19 @@ The `/tasks` route gives information about the progress of [asynchronous operati
 ### `uid`
 
 **Type**: Integer<br />
-**Description**: Unique sequential identifier of the task +**Description**: Unique sequential identifier of the task. -The task `uid` is incremented **globally.** +The task `uid` is incremented across all indexes in an instance. + + +### `batchUid` + +**Type**: Integer
+**Description**: Unique sequential identifier of the batch this task belongs to. + + +The batch `uid` is incremented across all indexes in an instance. ### `indexUid` @@ -220,6 +230,7 @@ Task results are [paginated](/learn/async/paginating_tasks) and can be [filtered | Query Parameter | Default Value | Description | | :--------------------- | :----------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **`uids`** | `*` (all uids) | [Filter tasks](/learn/async/filtering_tasks) by their `uid`. Separate multiple task `uids` with a comma (`,`) | +| **`batchUids`** | `*` (all batch uids) | [Filter tasks](/learn/async/filtering_tasks) by their `batchUid`. Separate multiple `batchUids` with a comma (`,`) | | **`statuses`** | `*` (all statuses) | [Filter tasks](/learn/async/filtering_tasks) by their `status`. Separate multiple task `statuses` with a comma (`,`) | | **`types`** | `*` (all types) | [Filter tasks](/learn/async/filtering_tasks) by their `type`. Separate multiple task `types` with a comma (`,`) | | **`indexUids`** | `*` (all indexes) | [Filter tasks](/learn/async/filtering_tasks) by their `indexUid`. Separate multiple task `indexUids` with a comma (`,`). Case-sensitive |