Add documentation for Indices Request Cache Overview and its settings #7288

Merged
merged 18 commits into from
Jun 14, 2024
Changes from 16 commits
22 changes: 18 additions & 4 deletions _install-and-configure/configuring-opensearch/index-settings.md
Expand Up @@ -15,7 +15,22 @@ To learn more about static and dynamic settings, see [Configuring OpenSearch]({{

## Cluster-level index settings

OpenSearch supports the following cluster-level index settings. All settings in this list are dynamic:
There are two types of cluster settings:

- [Static cluster-level index settings](#static-cluster-level-index-settings) are settings that you cannot update while the cluster is up. To update a static setting, you must stop the cluster, update the setting, and then restart the cluster.
- [Dynamic cluster-level index settings](#dynamic-cluster-level-index-settings) are settings that you can update at any time.

### Static cluster-level index settings

OpenSearch supports the following static cluster-level index settings:

- `indices.cache.cleanup_interval` (Time unit): Schedules a recurring background task that cleans up expired entries from the cache at the specified interval. Default is `1m` (1 minute). For more information, see [Index request cache]({{site.url}}{{site.baseurl}}/search-plugins/caching/request-cache/).

- `indices.requests.cache.size` (String): The cache size as a percentage of the heap size (for example, to use 1% of the heap, specify `1%`). Default is `1%`. For more information, see [Index request cache]({{site.url}}{{site.baseurl}}/search-plugins/caching/request-cache/).

### Dynamic cluster-level index settings

OpenSearch supports the following dynamic cluster-level index settings:

- `action.auto_create_index` (Boolean): Automatically creates an index if the index doesn't already exist. Also applies any index templates that are configured. Default is `true`.

Expand Down Expand Up @@ -105,9 +120,6 @@ For `zstd`, `zstd_no_dict`, `qat_lz4`, and `qat_deflate`, you can specify the co

- `index.codec.qatmode` (String): The hardware acceleration mode used for the `qat_lz4` and `qat_deflate` compression codecs. Valid values are `auto` and `hardware`. For more information, see [Index codec settings]({{site.url}}{{site.baseurl}}/im-plugin/index-codecs/). Optional. Default is `auto`.




- `index.routing_partition_size` (Integer): The number of shards a custom routing value can go to. Routing helps an imbalanced cluster by relocating values to a subset of shards rather than a single shard. To enable routing, set this value to greater than 1 but less than `index.number_of_shards`. Default is 1.

- `index.soft_deletes.retention_lease.period` (Time unit): The maximum amount of time to retain a shard's history of operations. Default is `12h`.
Expand Down Expand Up @@ -203,6 +215,8 @@ OpenSearch supports the following dynamic index-level index settings:

- `index.query.max_nested_depth` (Integer): The maximum number of nesting levels for `nested` queries. Default is `Integer.MAX_VALUE`. Minimum is 1 (single `nested` query).

- `index.requests.cache.enable` (Boolean): Enables or disables the index request cache. Default is `true`. For more information, see [Index request cache]({{site.url}}{{site.baseurl}}/search-plugins/caching/request-cache/).

- `index.routing.allocation.enable` (String): Specifies options for the index’s shard allocation. Available options are `all` (allow allocation for all shards), `primaries` (allow allocation only for primary shards), `new_primaries` (allow allocation only for new primary shards), and `none` (do not allow allocation). Default is `all`.

- `index.routing.rebalance.enable` (String): Enables shard rebalancing for the index. Available options are `all` (allow rebalancing for all shards), `primaries` (allow rebalancing only for primary shards), `replicas` (allow rebalancing only for replicas), and `none` (do not allow rebalancing). Default is `all`.
2 changes: 1 addition & 1 deletion _search-plugins/caching/index.md
Expand Up @@ -16,7 +16,7 @@ Understanding how your data uses the cache can help improve your cluster's perfo

OpenSearch supports the following on-heap cache types:

- **Request cache**: Caches the local results on each shard. This allows frequently used and potentially resource-heavy search requests to return results almost instantaneously.
- [**Index request cache**]({{site.url}}{{site.baseurl}}/search-plugins/caching/request-cache/): Caches the local results on each shard. This allows frequently used and potentially resource-heavy search requests to return results almost instantaneously.
- **Query cache**: Caches common data from similar queries at the shard level. The query cache is more granular than the request cache and can cache data to be reused in different queries.
- **Field data cache**: Caches field data and global ordinals, which are both used to support aggregations on certain field types.

158 changes: 158 additions & 0 deletions _search-plugins/caching/request-cache.md
@@ -0,0 +1,158 @@
---
layout: default
title: Index request cache
parent: Caching
grand_parent: Improving search performance
nav_order: 5
---

# Index request cache

The index request cache in OpenSearch is a specialized caching mechanism designed to enhance search performance by storing the results of frequently executed search queries at the shard level. This reduces the load on the cluster and improves response times for repeated searches. This cache is enabled by default and is particularly useful for read-heavy workloads where certain queries are executed frequently.

The cache is automatically invalidated at the configured refresh interval. Invalidation is triggered by document updates (including document deletions) and by changes to index settings, so stale results are never returned from the cache. When the cache size exceeds its configured limit, the least recently used entries are evicted to make room for new entries.

Search requests with `size=0` are cached in the request cache by default. Search requests with non-deterministic characteristics (such as `Math.random()`) or relative times (such as `now` or `new Date()`) are ineligible for caching.
{: .note}
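
The eviction behavior described above can be illustrated with a small Python sketch. This is not the OpenSearch implementation: the `LruCache` class and its `max_entries` limit are hypothetical, and the real cache is bounded by memory size rather than by entry count. The sketch only shows how a least recently used policy chooses which entry to drop.

```python
from collections import OrderedDict


class LruCache:
    """Toy LRU cache bounded by entry count (hypothetical, for illustration only)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # ordered from least to most recently used
        self.evictions = 0

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1


cache = LruCache(max_entries=2)
cache.put("query-a", "result-a")
cache.put("query-b", "result-b")
cache.get("query-a")              # "query-a" becomes the most recently used entry
cache.put("query-c", "result-c")  # the cache is full, so "query-b" is evicted
remaining = list(cache.entries)
```

After the last call, `remaining` holds `query-a` and `query-c`: reading `query-a` protected it from eviction, so the untouched `query-b` was removed instead.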

## Configuring request caching

You can configure the index request cache by setting the parameters in the `opensearch.yml` configuration file or by using the REST API. For more information, see [Index settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index-settings/).

### Settings

The following table lists the index request cache settings. For more information about dynamic settings, see [Index settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index-settings/).

Setting | Data type | Default | Level | Static/Dynamic | Description
:--- |:-----------|:--------| :--- | :--- | :---
`indices.cache.cleanup_interval` | Time unit | `1m` (1 minute) | Cluster | Static | Schedules a recurring background task that cleans up expired entries from the cache at the specified interval.
`indices.requests.cache.size` | Percentage | `1%` | Cluster | Static | The cache size as a percentage of the heap size (for example, to use 1% of the heap, specify `1%`).
`index.requests.cache.enable` | Boolean | `true` | Index | Dynamic | Enables or disables the request cache.
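
Because `indices.requests.cache.size` is expressed as a percentage of the heap, the resulting byte budget depends on the heap size. The following Python sketch works through the arithmetic. The `request_cache_bytes` helper and the 4 GiB heap are assumptions made for illustration and are not part of OpenSearch.

```python
def request_cache_bytes(heap_bytes, size_setting="1%"):
    """Convert a percentage string such as "1%" into a byte budget."""
    percent = float(size_setting.rstrip("%"))
    return int(heap_bytes * percent / 100)


heap_bytes = 4 * 1024**3  # assumption: a 4 GiB JVM heap
default_budget = request_cache_bytes(heap_bytes)       # default setting of 1%
larger_budget = request_cache_bytes(heap_bytes, "2%")  # doubled cache size
```

With these assumed values, the default setting gives the request cache roughly 41 MiB, and `2%` gives it roughly 82 MiB.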

### Example

To disable the request cache for an index, send the following request:

```json
PUT /my_index/_settings
{
  "index.requests.cache.enable": false
}
```
{% include copy-curl.html %}

## Caching specific requests

In addition to configuring index-level or cluster-level settings for the request cache, you can cache specific search requests selectively by setting the `request_cache` query parameter to `true`:

```json
GET /students/_search?request_cache=true
{
  "query": {
    "match": {
      "name": "doe john"
    }
  }
}
```
{% include copy-curl.html %}
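
If you send search requests from a script, the same query parameter can be added to the request URL. The following Python sketch only builds the request object and does not send it. The `BASE_URL` value and the `build_cached_search` helper are assumptions for illustration, so adjust the host, port, and authentication for your cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local cluster without security


def build_cached_search(index, query):
    """Build (but do not send) a search request with request_cache=true."""
    params = urlencode({"request_cache": "true"})
    return Request(
        url=f"{BASE_URL}/{index}/_search?{params}",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",
    )


request = build_cached_search("students", {"match": {"name": "doe john"}})
```

To send the request, you could pass `request` to `urllib.request.urlopen`, which requires a reachable cluster.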

## Monitoring the request cache

Monitoring cache usage and performance is crucial for maintaining an efficient caching strategy. OpenSearch provides several APIs to help monitor the cache.

### Retrieving cache statistics for all nodes

The [Nodes Stats API]({{site.url}}{{site.baseurl}}/api-reference/nodes-apis/nodes-stats/) returns cache statistics for all nodes in the cluster:

```json
GET /_nodes/stats/indices/request_cache
```
{% include copy-curl.html %}

The response contains the request cache statistics:

```json
{
  "nodes": {
    "T7aqO6zaQX-lt8XBWBYLsA": {
      "indices": {
        "request_cache": {
          "memory_size_in_bytes": 10240,
          "evictions": 0,
          "hit_count": 50,
          "miss_count": 10
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
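
One way to interpret these counters is to compute a hit ratio for each node, that is, hits divided by total lookups. The following Python sketch does this for the example response above. The `hit_ratios` helper is hypothetical, and the response is embedded as a string rather than fetched from a cluster.

```python
import json

# The example Nodes Stats response, embedded as text for illustration.
response_text = """
{
  "nodes": {
    "T7aqO6zaQX-lt8XBWBYLsA": {
      "indices": {
        "request_cache": {
          "memory_size_in_bytes": 10240,
          "evictions": 0,
          "hit_count": 50,
          "miss_count": 10
        }
      }
    }
  }
}
"""


def hit_ratios(stats):
    """Return hit_count / (hit_count + miss_count) for each node ID."""
    ratios = {}
    for node_id, node in stats["nodes"].items():
        cache = node["indices"]["request_cache"]
        lookups = cache["hit_count"] + cache["miss_count"]
        # Avoid dividing by zero on a node that has not served any lookups.
        ratios[node_id] = cache["hit_count"] / lookups if lookups else 0.0
    return ratios


ratios = hit_ratios(json.loads(response_text))
```

For the example node, 50 hits out of 60 lookups gives a hit ratio of about 0.83.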

### Retrieving cache statistics for a specific index

The [Index Stats API]({{site.url}}{{site.baseurl}}/api-reference/index-apis/stats/) returns cache statistics for a specific index:

```json
GET /my_index/_stats/request_cache
```
{% include copy-curl.html %}

The response contains the request cache statistics:

```json
{
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "request_cache": {
        "memory_size_in_bytes": 2048,
        "evictions": 1,
        "hit_count": 30,
        "miss_count": 5
      }
    },
    "total": {
      "request_cache": {
        "memory_size_in_bytes": 4096,
        "evictions": 2,
        "hit_count": 60,
        "miss_count": 10
      }
    }
  },
  "indices": {
    "my_index": {
      "primaries": {
        "request_cache": {
          "memory_size_in_bytes": 2048,
          "evictions": 1,
          "hit_count": 30,
          "miss_count": 5
        }
      },
      "total": {
        "request_cache": {
          "memory_size_in_bytes": 4096,
          "evictions": 2,
          "hit_count": 60,
          "miss_count": 10
        }
      }
    }
  }
}
```
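
In this response, `primaries` covers primary shards only, while `total` covers primary and replica shards together. Assuming that relationship, subtracting one from the other gives the portion attributable to replica shards. The following Python sketch uses a hypothetical `replica_share` helper and copies the `my_index` values from the example response.

```python
def replica_share(index_stats):
    """Subtract primary-shard counters from the totals to get replica counters."""
    primaries = index_stats["primaries"]["request_cache"]
    total = index_stats["total"]["request_cache"]
    return {key: total[key] - primaries[key] for key in total}


# Values copied from the "my_index" entry of the example response.
my_index = {
    "primaries": {
        "request_cache": {
            "memory_size_in_bytes": 2048,
            "evictions": 1,
            "hit_count": 30,
            "miss_count": 5,
        }
    },
    "total": {
        "request_cache": {
            "memory_size_in_bytes": 4096,
            "evictions": 2,
            "hit_count": 60,
            "miss_count": 10,
        }
    },
}

replicas = replica_share(my_index)
```

For the example values, the replica shards account for half of each counter, which suggests that searches are spread evenly across primary and replica shards.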

## Best practices

When using the index request cache, consider the following best practices:

- **Appropriate cache size**: Configure the cache size based on your query patterns. A larger cache can store more results but takes a larger share of the heap.
- **Query optimization**: Ensure that frequently executed queries are optimized so they can benefit from caching. For example, avoid relative times such as `now`, which make a request ineligible for caching.
- **Monitoring**: Regularly monitor cache hit and cache miss rates to understand cache efficiency and make necessary adjustments.