Merge branch 'main' into improve-shard-allocation-awareness-docs
patelsmit32123 authored Sep 17, 2024
2 parents c90d980 + d15c7bf commit f664e76
Showing 10 changed files with 273 additions and 92 deletions.
2 changes: 1 addition & 1 deletion _api-reference/index-apis/create-index-template.md
@@ -45,7 +45,7 @@ Parameter | Type | Description
`priority` | Integer | A number that determines which index templates take precedence during the creation of a new index or data stream. OpenSearch chooses the template with the highest priority. When no priority is given, the template is assigned a `0`, signifying the lowest priority. Optional.
`template` | Object | The template that includes the `aliases`, `mappings`, or `settings` for the index. For more information, see [#template]. Optional.
`version` | Integer | The version number used to manage index templates. Version numbers are not automatically set by OpenSearch. Optional.

`context` | Object | (Experimental) The `context` parameter provides use-case-specific predefined templates that can be applied to an index. Among all settings and mappings declared for a template, context templates hold the highest priority. For more information, see [index-context]({{site.url}}{{site.baseurl}}/im-plugin/index-context/).

### Template

2 changes: 1 addition & 1 deletion _api-reference/index-apis/create-index.md
@@ -50,7 +50,7 @@ timeout | Time | How long to wait for the request to return. Default is `30s`.

## Request body

As part of your request, you can optionally specify [index settings]({{site.url}}{{site.baseurl}}/im-plugin/index-settings/), [mappings]({{site.url}}{{site.baseurl}}/field-types/index/), and [aliases]({{site.url}}{{site.baseurl}}/opensearch/index-alias/) for your newly created index.
As part of your request, you can optionally specify [index settings]({{site.url}}{{site.baseurl}}/im-plugin/index-settings/), [mappings]({{site.url}}{{site.baseurl}}/field-types/index/), [aliases]({{site.url}}{{site.baseurl}}/opensearch/index-alias/), and [index context]({{site.url}}{{site.baseurl}}/opensearch/index-context/).

## Example request

26 changes: 17 additions & 9 deletions _api-reference/snapshots/get-snapshot-status.md
@@ -22,25 +22,33 @@ Path parameters are optional.

| Parameter | Data type | Description |
:--- | :--- | :---
| repository | String | Repository containing the snapshot. |
| snapshot | String | Snapshot to return. |
| repository | String | The repository containing the snapshot. |
| snapshot | List | The snapshot(s) to return. |
| index | List | The indexes to include in the response. |

Three request variants provide flexibility:

* `GET _snapshot/_status` returns the status of all currently running snapshots in all repositories.

* `GET _snapshot/<repository>/_status` returns all currently running snapshots in the specified repository. This is the preferred variant.

* `GET _snapshot/<repository>/<snapshot>/_status` returns detailed status information for a specific snapshot in the specified repository, regardless of whether it's currently running or not.
* `GET _snapshot/<repository>/<snapshot>/_status` returns detailed status information for the specified snapshot(s) in the specified repository, regardless of whether they are currently running.

Using the API to return state for other than currently running snapshots can be very costly for (1) machine machine resources and (2) processing time if running in the cloud. For each snapshot, each request causes file reads from all a snapshot's shards.
* `GET /_snapshot/<repository>/<snapshot>/<index>/_status` returns detailed status information only for the specified indexes in a specific snapshot in the specified repository. Note that this endpoint works only for indexes belonging to a specific snapshot.
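
As an illustration of the index-level variant, the following request sketches how the status of a single index within a snapshot might be retrieved. The repository and snapshot names are reused from the example on this page, and `my-index` is a hypothetical index name:

```json
GET _snapshot/my-opensearch-repo/my-first-snapshot/my-index/_status
```
{% include copy-curl.html %}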

Snapshot API calls only work if the total number of shards across the requested resources, such as snapshots and indexes created from snapshots, is smaller than the limit specified by the following cluster setting:

- `snapshot.max_shards_allowed_in_status_api` (Dynamic, integer): The maximum number of shards that can be included in the Snapshot Status API response. Default value is `200000`. Not applicable for [shallow snapshots v2]({{site.url}}{{site.baseurl}}/tuning-your-cluster/availability-and-recovery/remote-store/snapshot-interoperability#shallow-snapshot-v2), where the total number and sizes of files are returned as 0.
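
Because the setting is dynamic, it can presumably be changed at runtime through the cluster settings API. The following is a minimal sketch; the value `100000` is purely illustrative and not a recommendation:

```json
PUT _cluster/settings
{
  "persistent": {
    "snapshot.max_shards_allowed_in_status_api": 100000
  }
}
```
{% include copy-curl.html %}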


Using the API to return the state of snapshots that are not currently running can be very costly in terms of both machine resources and processing time when querying data in the cloud. For each snapshot, each request causes a file read of all of the snapshot's shards.
{: .warning}

## Request fields

| Field | Data type | Description |
:--- | :--- | :---
| ignore_unavailable | Boolean | How to handles requests for unavailable snapshots. If `false`, the request returns an error for unavailable snapshots. If `true`, the request ignores unavailable snapshots, such as those that are corrupted or temporarily cannot be returned. Defaults to `false`.|
| ignore_unavailable | Boolean | How to handle requests for unavailable snapshots and indexes. If `false`, the request returns an error for unavailable snapshots and indexes. If `true`, the request ignores unavailable snapshots and indexes, such as those that are corrupted or temporarily cannot be returned. Default is `false`.|

## Example request

@@ -375,18 +383,18 @@ The `GET _snapshot/my-opensearch-repo/my-first-snapshot/_status` request returns
:--- | :--- | :---
| repository | String | Name of repository that contains the snapshot. |
| snapshot | String | Snapshot name. |
| uuid | String | Snapshot Universally unique identifier (UUID). |
| uuid | String | A snapshot's universally unique identifier (UUID). |
| state | String | Snapshot's current status. See [Snapshot states](#snapshot-states). |
| include_global_state | Boolean | Whether the current cluster state is included in the snapshot. |
| shards_stats | Object | Snapshot's shard counts. See [Shard stats](#shard-stats). |
| stats | Object | Details of files included in the snapshot. `file_count`: number of files. `size_in_bytes`: total of all fie sizes. See [Snapshot file stats](#snapshot-file-stats). |
| stats | Object | Information about files included in the snapshot. `file_count`: number of files. `size_in_bytes`: total size of all files. See [Snapshot file stats](#snapshot-file-stats). |
| index | list of Objects | List of objects that contain information about the indices in the snapshot. See [Index objects](#index-objects).|

##### Snapshot states

| State | Description |
:--- | :--- |
| FAILED | The snapshot terminated in an error and no data was stored. |
| FAILED | The snapshot terminated in an error and no data was stored. |
| IN_PROGRESS | The snapshot is currently running. |
| PARTIAL | The global cluster state was stored, but data from at least one shard was not stored. The `failures` property of the [Create snapshot]({{site.url}}{{site.baseurl}}/api-reference/snapshots/create-snapshot) response contains additional details. |
| SUCCESS | The snapshot finished and all shards were stored successfully. |
@@ -420,4 +428,4 @@ All property values are Integers.
:--- | :--- | :--- |
| shards_stats | Object | See [Shard stats](#shard-stats). |
| stats | Object | See [Snapshot file stats](#snapshot-file-stats). |
| shards | list of Objects | List of objects containing information about the shards that include the snapshot. OpenSearch returns the following properties about the shards. <br /><br /> **stage**: Current state of shards in the snapshot. Shard states are: <br /><br /> * DONE: Number of shards in the snapshot that were successfully stored in the repository. <br /><br /> * FAILURE: Number of shards in the snapshot that were not successfully stored in the repository. <br /><br /> * FINALIZE: Number of shards in the snapshot that are in the finalizing stage of being stored in the repository. <br /><br />* INIT: Number of shards in the snapshot that are in the initializing stage of being stored in the repository.<br /><br />* STARTED: Number of shards in the snapshot that are in the started stage of being stored in the repository.<br /><br /> **stats**: See [Snapshot file stats](#snapshot-file-stats). <br /><br /> **total**: Total number and size of files referenced by the snapshot. <br /><br /> **start_time_in_millis**: Time (in milliseconds) when snapshot creation began. <br /><br /> **time_in_millis**: Total time (in milliseconds) that the snapshot took to complete. |
| shards | List of objects | Contains information about the shards included in the snapshot. OpenSearch returns the following properties about the shard: <br /><br /> **stage**: The current state of shards in the snapshot. Shard states are: <br /><br /> * DONE: The number of shards in the snapshot that were successfully stored in the repository. <br /><br /> * FAILURE: The number of shards in the snapshot that were not successfully stored in the repository. <br /><br /> * FINALIZE: The number of shards in the snapshot that are in the finalizing stage of being stored in the repository. <br /><br />* INIT: The number of shards in the snapshot that are in the initializing stage of being stored in the repository.<br /><br />* STARTED: The number of shards in the snapshot that are in the started stage of being stored in the repository.<br /><br /> **stats**: See [Snapshot file stats](#snapshot-file-stats). <br /><br /> **total**: The total number and sizes of files referenced by the snapshot. <br /><br /> **start_time_in_millis**: The time (in milliseconds) when snapshot creation began. <br /><br /> **time_in_millis**: The total amount of time (in milliseconds) that the snapshot took to complete. |
175 changes: 175 additions & 0 deletions _im-plugin/index-context.md
@@ -0,0 +1,175 @@
---
layout: default
title: Index context
nav_order: 14
redirect_from:
- /opensearch/index-context/
---

# Index context

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, join the discussion on the [OpenSearch forum](https://forum.opensearch.org/).
{: .warning}

Index context declares the use case for an index. Using the context information, OpenSearch applies a predetermined set of settings and mappings, which provides the following benefits:

- Optimized performance
- Settings tuned to your specific use case
- Accurate mappings and aliases based on [OpenSearch Integrations]({{site.url}}{{site.baseurl}}/integrations/)

The settings and metadata configuration applied using component templates are automatically loaded when your cluster starts. To prevent configuration issues, component templates whose names start with `@abc_template@`, also called Application-Based Configuration (ABC) templates, can only be used through a `context` object declaration.
{: .warning}


## Installation

To install the index context feature:

1. Install the `opensearch-system-templates` plugin on all nodes in your cluster using one of the [installation methods]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/#install).

2. Set the feature flag `opensearch.experimental.feature.application_templates.enabled` to `true`. For more information about enabling and disabling feature flags, see [Enabling experimental features]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).

3. Set the `cluster.application_templates.enabled` setting to `true`. For instructions on how to configure OpenSearch, see [configuring settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#static-settings).

## Using the `context` setting

Use the `context` setting with the Index API to add use-case-specific context.

### Considerations

Consider the following when using the `context` parameter during index creation:

1. If you use the `context` parameter to create an index, you cannot include any settings declared in the index context during index creation or dynamic settings updates.
2. The index context becomes permanent when set on an index or index template.

When you adhere to these limitations, the suggested configurations and mappings are applied uniformly to data indexed within the specified context.
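
As a sketch of the first consideration, the following hypothetical request combines the `metrics` context with an explicit `index.refresh_interval` setting. Assuming that the `metrics` context declares `refresh_interval` (the response in the example that follows suggests it does), OpenSearch would be expected to reject a request like this. The index name and the `30s` value are illustrative only:

```json
PUT /my-conflicting-metrics-index
{
  "context": {
    "name": "metrics"
  },
  "settings": {
    "index.refresh_interval": "30s"
  }
}
```
{% include copy-curl.html %}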

### Examples

The following examples show how to use index context.


#### Create an index

The following example request creates an index in which to store metric data by declaring a `metrics` mapping as the context:

```json
PUT /my-metrics-index
{
  "context": {
    "name": "metrics"
  }
}
```
{% include copy-curl.html %}

After creation, the context is added to the index and the corresponding settings are applied:


**GET request**

```json
GET /my-metrics-index
```
{% include copy-curl.html %}


**Response**

```json
{
  "my-metrics-index": {
    "aliases": {},
    "mappings": {},
    "settings": {
      "index": {
        "codec": "zstd_no_dict",
        "refresh_interval": "60s",
        "number_of_shards": "1",
        "provided_name": "my-metrics-index",
        "merge": {
          "policy": "log_byte_size"
        },
        "context": {
          "created_version": "1",
          "current_version": "1"
        },
        ...
      }
    },
    "context": {
      "name": "metrics",
      "version": "_latest"
    }
  }
}
```


#### Create an index template

You can also use the `context` parameter when creating an index template. The following example request creates an index template that uses `logs` as its context:

```json
PUT _index_template/my-logs
{
  "context": {
    "name": "logs",
    "version": "1"
  },
  "index_patterns": [
    "my-logs-*"
  ]
}
```
{% include copy-curl.html %}

All indexes created using this index template will get the metadata provided by the associated component template. The following request and response show how `context` is added to the template:

**Get index template**

```json
GET _index_template/my-logs
```
{% include copy-curl.html %}

**Response**

```json
{
  "index_templates": [
    {
      "name": "my-logs",
      "index_template": {
        "index_patterns": [
          "my-logs-*"
        ],
        "context": {
          "name": "logs",
          "version": "1"
        }
      }
    }
  ]
}
```

If any settings, mappings, or aliases declared directly in your template conflict with those in the backing component template for the context, the component template takes precedence during index creation.


## Available context templates

The following templates are available to be used through the `context` parameter as of OpenSearch 2.17:

- `logs`
- `metrics`
- `nginx-logs`
- `amazon-cloudtrail-logs`
- `amazon-elb-logs`
- `amazon-s3-logs`
- `apache-web-logs`
- `k8s-logs`

For more information about these templates, see the [OpenSearch system templates repository](https://github.com/opensearch-project/opensearch-system-templates/tree/main/src/main/resources/org/opensearch/system/applicationtemplates/v1).

To view the current version of these templates on your cluster, use `GET /_component_template`.
Original file line number Diff line number Diff line change
@@ -16,7 +16,6 @@ Availability and recovery settings include settings for the following:
- [Shard indexing backpressure](#shard-indexing-backpressure-settings)
- [Segment replication](#segment-replication-settings)
- [Cross-cluster replication](#cross-cluster-replication-settings)
- [Workload management](#workload-management-settings)

To learn more about static and dynamic settings, see [Configuring OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/).

@@ -71,7 +70,3 @@ For information about segment replication backpressure settings, see [Segment re
## Cross-cluster replication settings

For information about cross-cluster replication settings, see [Replication settings]({{site.url}}{{site.baseurl}}/tuning-your-cluster/replication-plugin/settings/).

## Workload management settings

Workload management is a mechanism that allows administrators to organize queries into distinct groups. For more information, see [Workload management settings]({{site.url}}{{site.baseurl}}/tuning-your-cluster/availability-and-recovery/workload-management/#workload-management-settings).