Commit

Merge branch 'main' into centos7-deprecation

Naarcha-AWS authored Jul 23, 2024
2 parents 64e089a + 50eed6b commit 1689548

Showing 31 changed files with 918 additions and 60 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@ Gemfile.lock
.idea
*.iml
.jekyll-cache
+.project
195 changes: 195 additions & 0 deletions _api-reference/index-apis/rollover.md
@@ -0,0 +1,195 @@
---
layout: default
title: Rollover Index
parent: Index APIs
nav_order: 63
---

# Rollover Index
Introduced 1.0
{: .label .label-purple }

The Rollover Index API creates a new index for a data stream or index alias and makes the new index the write target for subsequent indexing requests.

## Path and HTTP methods

```json
POST /<rollover-target>/_rollover/
POST /<rollover-target>/_rollover/<target-index>
```

## Rollover types

You can roll over a data stream, an index alias with one index, or an index alias with a write index.

### Data stream

When you perform a rollover operation on a data stream, the API creates a new write index for the stream, and the previous write index becomes a regular backing index. The rollover also increments the data stream's generation count. Data stream rollovers do not support specifying index settings in the request body.

### Index alias with one index

When you initiate a rollover on an index alias associated with a single index, the API creates a new index, points the alias to it, and removes the original index from the alias.

### Index alias with a write index

When an index alias references multiple indexes, one of them must be designated as the write index. During a rollover, the API creates a new write index, setting its `is_write_index` property to `true`, and updates the previous write index by setting its `is_write_index` property to `false`.

## Incrementing index names for an alias

During the index alias rollover process, if you don't specify a custom name and the current index's name ends with a hyphen followed by a number (for example, `my-index-000001` or `my-index-3`), then the rollover operation will automatically increment that number for the new index's name. For instance, rolling over `my-index-000001` will generate `my-index-000002`. The numeric portion is always padded with leading zeros to ensure a consistent length of six characters.
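
The following sketch shows this behavior (the index and alias names are illustrative): the first request creates `my-index-000001` behind `my-alias`, and the rollover then creates `my-index-000002`:

```json
PUT my-index-000001
{
  "aliases": {
    "my-alias": {
      "is_write_index": true
    }
  }
}

POST my-alias/_rollover
```
{% include copy-curl.html %}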

## Using date math with index rollovers

When using an index alias for time-series data, you can use [date math](https://opensearch.org/docs/latest/field-types/supported-field-types/date/) in the index name to track the rollover date. For example, you can create an alias pointing to `my-index-{now/d}-000001`. If you create the alias on June 11, 2029, then the index name is `my-index-2029.06.11-000001`. For a rollover on June 12, 2029, the new index is named `my-index-2029.06.12-000002`. See [Rolling over an index alias with a write index](#rolling-over-an-index-alias-with-a-write-index) for a practical example.

## Path parameters

The Rollover Index API supports the parameters listed in the following table.

Parameter | Type | Description
:--- | :--- | :---
`<rollover-target>` | String | The name of the data stream or index alias to roll over. Required.
`<target-index>` | String | The name of the index to create. Supports date math. Data streams do not support this parameter. If the name of the alias's current write index does not end with `-` and a number, such as `my-index-000001` or `my-index-2`, then the parameter is required.

## Query parameters

The following table lists the supported query parameters.

Parameter | Type | Description
:--- | :--- | :---
`cluster_manager_timeout` | Time | The amount of time to wait for a connection to the cluster manager node. Default is `30s`.
`timeout` | Time | The amount of time to wait for a response. Default is `30s`.
`wait_for_active_shards` | String | The number of active shards that must be available before OpenSearch processes the request. Default is `1` (only the primary shard). You can also set to `all` or a positive integer. Values greater than `1` require replicas. For example, if you specify a value of `3`, then the index must have two replicas distributed across two additional nodes in order for the operation to succeed.
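
For example, the following request (the alias name is illustrative) does not proceed until each shard of the new index has two active copies, that is, the primary and one replica:

```json
POST my-alias/_rollover?wait_for_active_shards=2
```
{% include copy-curl.html %}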

## Request body

The following request body parameters are supported.

### `alias`

The `alias` parameter specifies the alias name as the key. It is required when the `template` option exists in the request body. The object body contains the following optional parameters.

Parameter | Type | Description
:--- | :--- | :---
`filter` | Query DSL object | The query that limits the number of documents that the alias can access.
`index_routing` | String | The value that routes indexing operations to a specific shard. When specified, overwrites the `routing` value for indexing operations.
`is_hidden` | Boolean | Hides or unhides the alias. When `true`, the alias is hidden. Default is `false`. Indexes for the alias must have matching values for this setting.
`is_write_index` | Boolean | Specifies the write index. When `true`, the index is the write index for the alias. Default is `false`.
`routing` | String | The value used to route index and search operations to a specific shard.
`search_routing` | String | Routes search operations to a specific shard. When specified, it overwrites `routing` for search operations.
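
As an illustrative sketch, assuming the rollover request body accepts the same `aliases` object used when creating an index (the alias name, filter field, and routing value here are hypothetical), these options can be supplied as follows:

```json
POST my-alias/_rollover
{
  "aliases": {
    "my-filtered-alias": {
      "filter": {
        "term": {
          "status": "active"
        }
      },
      "search_routing": "shard-1"
    }
  }
}
```
{% include copy-curl.html %}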

### `mappings`

The `mappings` parameter specifies the index field mappings. It is optional. See [Mappings and field types](https://opensearch.org/docs/latest/field-types/) for more information.
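
For example, a rollover request might supply mappings for the new index as follows (a minimal sketch; the field names are illustrative):

```json
POST my-alias/_rollover
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "message": { "type": "keyword" }
    }
  }
}
```
{% include copy-curl.html %}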

### `conditions`

The `conditions` parameter is an optional object that defines criteria for triggering the rollover. When provided, OpenSearch performs the rollover only if the current index satisfies at least one of the specified conditions. If omitted, the rollover occurs unconditionally.

The object body supports the following parameters.

Parameter | Type | Description
:--- | :--- | :---
`max_age` | Time units | Triggers a rollover after the maximum elapsed time since index creation is reached. The elapsed time is always measured from the index creation time, even if the index origination date is set to a custom date using the `index.lifecycle.parse_origination_date` or `index.lifecycle.origination_date` settings. Optional.
`max_docs` | Integer | Triggers a rollover after the specified maximum number of documents, excluding documents added since the last refresh and documents in replica shards. Optional.
`max_size` | Byte units | Triggers a rollover when the index reaches a specified size, calculated as the total size of all primary shards. Replicas are not counted. To check the current index size, use the `_cat/indices` API and check the `pri.store.size` value. Optional.
`max_primary_shard_size` | Byte units | Triggers a rollover when the largest primary shard in the index reaches a specified size. As with `max_size`, replicas are ignored. To see current shard sizes, use the `_cat/shards` API: the `store` column shows the size of each shard, and `prirep` indicates whether a shard is a primary (`p`) or a replica (`r`). Optional.

### `settings`

The `settings` parameter specifies the index configuration options. See [Index settings](https://opensearch.org/docs/latest/install-and-configure/configuring-opensearch/index-settings/) for more information.

## Example requests

The following examples illustrate using the Rollover Index API. A rollover occurs when one or more of the specified conditions are met:

- The index was created 5 or more days ago.
- The index contains 500 or more documents.
- The index's largest primary shard is 100 GB or larger.

### Rolling over a data stream

The following request rolls over the data stream if the current write index meets any of the specified conditions:

```json
POST my-data-stream/_rollover
{
  "conditions": {
    "max_age": "5d",
    "max_docs": 500,
    "max_primary_shard_size": "100gb"
  }
}
```
{% include copy-curl.html %}

### Rolling over an index alias with a write index

The following request creates an index whose name includes the current date and sets it as the write index for `my-alias`:

```json
PUT <my-index-{now/d}-000001>
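
# The same request with the date math characters URI encoded: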
PUT %3Cmy-index-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "my-alias": {
      "is_write_index": true
    }
  }
}
```
{% include copy-curl.html %}

The next request performs a rollover using the alias:

```json
POST my-alias/_rollover
{
  "conditions": {
    "max_age": "5d",
    "max_docs": 500,
    "max_primary_shard_size": "100gb"
  }
}
```
{% include copy-curl.html %}

### Specifying settings during a rollover

In most cases, you can use an index template to automatically configure the indexes created during a rollover operation. However, when rolling over an index alias, you can use the Rollover Index API to introduce additional index settings or override the settings defined in the template by sending the following request:

```json
POST my-alias/_rollover
{
  "settings": {
    "index.number_of_shards": 4
  }
}
```
{% include copy-curl.html %}


## Example response

OpenSearch returns the following response confirming that all conditions except `max_primary_shard_size` were met:

```json
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "old_index": ".ds-my-data-stream-2029.06.11-000001",
  "new_index": ".ds-my-data-stream-2029.06.12-000002",
  "rolled_over": true,
  "dry_run": false,
  "conditions": {
    "[max_age: 5d]": true,
    "[max_docs: 500]": true,
    "[max_primary_shard_size: 100gb]": false
  }
}
```

8 changes: 4 additions & 4 deletions _api-reference/render-template.md
@@ -44,7 +44,7 @@ Both of the following request examples use the search template with the template
"source": {
"query": {
"match": {
"play_name": "{{play_name}}"
"play_name": "{% raw %}{{play_name}}{% endraw %}"
}
}
},
@@ -76,11 +76,11 @@ If you don't want to use a saved template, or want to test a template before sav
```
{
  "source": {
-    "from": "{{from}}{{^from}}10{{/from}}",
-    "size": "{{size}}{{^size}}10{{/size}}",
+    "from": "{% raw %}{{from}}{{^from}}0{{/from}}{% endraw %}",
+    "size": "{% raw %}{{size}}{{^size}}10{{/size}}{% endraw %}",
    "query": {
      "match": {
-        "play_name": "{{play_name}}"
+        "play_name": "{% raw %}{{play_name}}{% endraw %}"
      }
    }
  },
13 changes: 12 additions & 1 deletion _automating-configurations/api/deprovision-workflow.md
@@ -9,7 +9,9 @@ nav_order: 70

When you no longer need a workflow, you can deprovision its resources. Most workflow steps that create a resource have corresponding workflow steps to reverse that action. To retrieve all resources currently created for a workflow, call the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/). When you call the Deprovision Workflow API, resources included in the `resources_created` field of the Get Workflow Status API response will be removed using a workflow step corresponding to the one that provisioned them.

-The workflow executes the provisioning workflow steps in reverse order. If failures occur because of resource dependencies, such as preventing deletion of a registered model if it is still deployed, the workflow attempts retries.
+The workflow executes the provisioning steps in reverse order. If a failure occurs because of a resource dependency, such as trying to delete a registered model that is still deployed, then the workflow retries the failing step as long as at least one resource was deleted.
+
+To prevent data loss, resources created using the `create_index`, `create_search_pipeline`, and `create_ingest_pipeline` steps require the resource ID to be included in the `allow_delete` parameter.

## Path and HTTP methods

@@ -24,6 +26,7 @@ The following table lists the available path parameters.
| Parameter | Data type | Description |
| :--- | :--- | :--- |
| `workflow_id` | String | The ID of the workflow to be deprovisioned. Required. |
+| `allow_delete` | String | A comma-separated list of resource IDs to be deprovisioned. Required if deleting resources of type `index_name` or `pipeline_id`. |

### Example request

@@ -53,6 +56,14 @@ If deprovisioning did not completely remove all resources, OpenSearch responds w
In some cases, the failure happens because of another dependent resource that took some time to be removed. In this case, you can attempt to send the same request again.
{: .tip}

+If deprovisioning requires the `allow_delete` parameter, then OpenSearch responds with a `403 (FORBIDDEN)` status and identifies the resources that were not deprovisioned:
+
+```json
+{
+  "error": "These resources require the allow_delete parameter to deprovision: [index_name my-index]."
+}
+```

To obtain a more detailed deprovisioning status than is provided by the summary in the error response, query the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/).

On success, the workflow returns to a `NOT_STARTED` state. If some resources have not yet been removed, they are provided in the response.
@@ -14,10 +14,22 @@ The `convert_entry_type` processor converts a value type associated with the spe

You can configure the `convert_entry_type` processor with the following options.

+<!--
+This table is autogenerated. Do not edit it.
+- name: convert_entry_type
+- pluginType: processor
+- source: https://github.com/opensearch-project/data-prepper/blob/7d15115c281687aab50e5c471fd210cb1ef90fc5/data-prepper-plugins/mutate-event-processors/src/main/java/org/opensearch/dataprepper/plugins/processor/mutateevent/ConvertEntryTypeProcessorConfig.java
+-->

| Option | Required | Description |
| :--- | :--- | :--- |
-| `key`| Yes | Keys whose value needs to be converted to a different type. |
-| `type` | No | Target type for the key-value pair. Possible values are `integer`, `double`, `string`, and `Boolean`. Default value is `integer`. |
+| `key`| Yes | Key whose value needs to be converted to a different type. |
+| `keys`| Yes | Keys whose value needs to be converted to a different type. |
+| `type` | No | Target type for the key-value pair. Possible values are `integer`, `long`, `double`, `big_decimal`, `string`, and `boolean`. Default value is `integer`. |
+| `null_values` | No | String representation of what constitutes a `null` value. If the field value equals one of these strings, then the value is considered `null` and is converted to `null`. |
+| `scale` | No | Modifies the scale of the `big_decimal` when converting to a `big_decimal`. The default value is `0`. |
+| `tags_on_failure` | No | A list of tags to be added to the event metadata when the event fails to convert. |
+| `convert_when` | No | Specifies a condition using a [Data Prepper expression]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/expression-syntax/) for performing the `convert_entry_type` operation. If specified, the `convert_entry_type` operation runs only when the expression evaluates to `true`. |

## Usage
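
The following is a minimal pipeline sketch (the pipeline name, file path, and key are illustrative assumptions) that converts a field to an integer:

```yaml
type-conv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - convert_entry_type:
        key: "response_status"
        type: "integer"
  sink:
    - stdout:
```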

27 changes: 27 additions & 0 deletions _data-prepper/pipelines/configuration/processors/delay.md
@@ -0,0 +1,27 @@
---
layout: default
title: delay
parent: Processors
grand_parent: Pipelines
nav_order: 41
---

# delay

The `delay` processor adds a delay into the processor chain. Use it only for testing, experimenting, and debugging.

## Configuration

Option | Required | Type | Description
:--- | :--- | :--- | :---
`for` | No | Duration | The duration of time to delay. Defaults to `1s`.

## Usage

The following example uses the `delay` processor to add a 2-second delay:

```yaml
processor:
  - delay:
      for: 2s
```
@@ -3,7 +3,7 @@ layout: default
title: delete_entries
parent: Processors
grand_parent: Pipelines
-nav_order: 41
+nav_order: 43
---

# delete_entries
@@ -13,7 +13,7 @@ Mutate event processors allow you to modify events in Data Prepper. The followin
* [add_entries]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/add-entries/) allows you to add entries to an event.
* [convert_entry_type]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/convert_entry_type/) allows you to convert value types in an event.
* [copy_values]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/copy-values/) allows you to copy values within an event.
-* [delete_entries]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/delete-entries/) allows you to delete entries from an event.
+* [delete_entries]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/delete_entries/) allows you to delete entries from an event.
* [list_to_map]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/list-to-map) allows you to convert list of objects from an event where each object contains a `key` field into a map of target keys.
* `map_to_list` allows you to convert a map of objects from an event, where each object contains a `key` field, into a list of target keys.
* [rename_keys]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/rename-keys/) allows you to rename keys in an event.
@@ -14,12 +14,22 @@ The `parse_ion` processor parses [Amazon Ion](https://amazon-ion.github.io/ion-d

You can configure the `parse_ion` processor with the following options.

+<!--
+This table is autogenerated. Do not edit it.
+- name: parse_ion
+- pluginType: processor
+- source: https://github.com/opensearch-project/data-prepper/blob/253e59245fd9c39c959c1c8caaeff1b226a5a0ab/data-prepper-plugins/parse-json-processor/src/main/java/org/opensearch/dataprepper/plugins/processor/parse/ion/ParseIonProcessorConfig.java
+-->

| Option | Required | Type | Description |
| :--- | :--- | :--- | :--- |
| `source` | No | String | The field in the `event` that is parsed. Default value is `message`. |
| `destination` | No | String | The destination field of the parsed JSON. Defaults to the root of the `event`. Cannot be `""`, `/`, or any white-space-only `string` because these are not valid `event` fields. |
| `pointer` | No | String | A JSON pointer to the field to be parsed. There is no `pointer` by default, meaning that the entire `source` is parsed. The `pointer` can access JSON array indexes as well. If the JSON pointer is invalid, then the entire `source` data is parsed into the outgoing `event`. If the key that is pointed to already exists in the `event` and the `destination` is the root, then the pointer uses the entire path of the key. |
-| `tags_on_failure` | No | String | A list of strings that specify the tags to be set in the event that the processors fails or an unknown exception occurs while parsing.
| `parse_when` | No | String | Specifies under which conditions the processor should perform parsing. Default is no condition. Accepts a Data Prepper expression string following the [Expression syntax]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/expression-syntax/). |
+| `overwrite_if_destination_exists` | No | Boolean | Overwrites the destination if set to `true`. Set to `false` to prevent changing a destination value that exists. Default is `true`. |
+| `delete_source` | No | Boolean | If set to `true`, then the source field is deleted. Default is `false`. |
+| `tags_on_failure` | No | String | A list of strings specifying the tags to be set in the event that the processor fails or an unknown exception occurs during parsing. |

## Usage
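
The following is a minimal pipeline sketch (the pipeline name, file path, and source field are illustrative assumptions) showing the processor parsing an Ion value from the default `message` field:

```yaml
parse-ion-pipeline:
  source:
    file:
      path: "/full/path/to/ion_logs.log"
      record_type: "event"
  processor:
    - parse_ion:
        source: "message"
  sink:
    - stdout:
```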

