
Add documentation for rule-based anomaly detection and imputation #8202

Merged: 39 commits, Sep 13, 2024

Commits (39):
- 942c7d7: Add documentation for rule-based anomaly detection and imputation (kaituo, Sep 9, 2024)
- b2af679: Doc review (vagimeli, Sep 10, 2024)
- e2c656e: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- fe79e71: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- dcbce5a: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 23bcea3: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 2754b3b: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 1a5120d: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 6c3326d: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 614b660: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 3ab815f: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 596adfa: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- bee1f4c: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- f8ee3d9: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 28a6b77: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- c318ece: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 4189083: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 199fbc3: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 595b45a: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 66c48c4: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- d6913fb: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- 45443b9: Update _observing-your-data/ad/index.md (vagimeli, Sep 11, 2024)
- cfc3709: Update _observing-your-data/ad/result-mapping.md (vagimeli, Sep 11, 2024)
- 8a3b25d: Update _observing-your-data/ad/index.md (vagimeli, Sep 12, 2024)
- bc9488a: Update _observing-your-data/ad/index.md (vagimeli, Sep 12, 2024)
- 50eff8b: Update _observing-your-data/ad/index.md (vagimeli, Sep 12, 2024)
- 2c2e06c: Update _observing-your-data/ad/index.md (vagimeli, Sep 12, 2024)
- 894efee: Update index.md (vagimeli, Sep 13, 2024)
- 5738739: Update result-mapping.md (vagimeli, Sep 13, 2024)
- 14dc454: Update _observing-your-data/ad/index.md (vagimeli, Sep 13, 2024)
- 4ad9e02: Update _observing-your-data/ad/index.md (vagimeli, Sep 13, 2024)
- 4d7f738: Merge branch 'main' into 2.17 (vagimeli, Sep 13, 2024)
- 9afca30: Fix links (vagimeli, Sep 13, 2024)
- 0067b5d: Fix links (vagimeli, Sep 13, 2024)
- a99969b: Address editorial feedback (vagimeli, Sep 13, 2024)
- 7ea3d63: Address editorial feedback (vagimeli, Sep 13, 2024)
- 4b42bc2: Merge branch 'main' into 2.17 (vagimeli, Sep 13, 2024)
- f9434ec: Merge branch 'main' into 2.17 (vagimeli, Sep 13, 2024)
- ca49c0c: Update _observing-your-data/ad/index.md (vagimeli, Sep 13, 2024)
41 changes: 40 additions & 1 deletion _observing-your-data/ad/index.md
@@ -76,6 +76,8 @@
- (Optional) To add extra processing time for data collection, specify a **Window delay** value.
- This value tells the detector that the data is not ingested into OpenSearch in real time but with a certain delay. Set the window delay to shift the detector interval to account for this delay.
- For example, say the detector interval is 10 minutes and data is ingested into your cluster with a general delay of 1 minute. Assume the detector runs at 2:00. The detector attempts to get the last 10 minutes of data from 1:50 to 2:00, but because of the 1-minute delay, it only gets 9 minutes of data and misses the data from 1:59 to 2:00. Setting the window delay to 1 minute shifts the interval window to 1:49--1:59, so the detector accounts for all 10 minutes of the detector interval time.
- To avoid missing any data, set the **Window delay** to the upper bound of the expected ingestion delay. This ensures the detector accounts for all data during its interval, reducing the chances of missing relevant information. While setting a longer window delay helps capture all data, setting it too high can hinder real-time anomaly detection, as the detector will always be looking further back in time. Strike a balance to maintain both data accuracy and timely detection.

1. Specify custom results index.
- The Anomaly Detection plugin allows you to store anomaly detection results in a custom index of your choice. To enable this, select **Enable custom results index** and provide a name for your index, for example, `abc`. The plugin then creates an alias prefixed with `opensearch-ad-plugin-result-` followed by your chosen name, for example, `opensearch-ad-plugin-result-abc`. This alias points to an actual index with a name containing the date and a sequence number, like `opensearch-ad-plugin-result-abc-history-2024.06.12-000002`, where your results are stored.
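
These settings map to fields on the detector definition in the Create Detector API. The following request is a minimal sketch; the detector name, index, feature definition, and interval values are illustrative, and whether the results index is passed as the short name or the fully prefixed alias should be verified against the API reference for your version:

```json
POST _plugins/_anomaly_detection/detectors
{
  "name": "test-detector",
  "description": "Detector with a 1-minute window delay and a custom results index",
  "time_field": "timestamp",
  "indices": ["server_log"],
  "feature_attributes": [
    {
      "feature_name": "logVolume",
      "feature_enabled": true,
      "aggregation_query": {
        "logVolume": { "value_count": { "field": "request_id" } }
      }
    }
  ],
  "detection_interval": { "period": { "interval": 10, "unit": "Minutes" } },
  "window_delay": { "period": { "interval": 1, "unit": "Minutes" } },
  "result_index": "opensearch-ad-plugin-result-abc"
}
```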

@@ -164,7 +166,44 @@

Set the number of aggregation intervals from your data stream to consider in a detection window. It’s best to choose this value based on your actual data to see which one leads to the best results for your use case.

- The anomaly detector expects the shingle size to be in the range of 1 and 60. The default shingle size is 8. We recommend that you don't choose 1 unless you have two or more features. Smaller values might increase [recall](https://en.wikipedia.org/wiki/Precision_and_recall) but also false positives. Larger values might be useful for ignoring noise in a signal.
+ The anomaly detector expects the shingle size to be between 1 and 128. The default shingle size is 8. We recommend that you don't choose 1 unless you have two or more features. Smaller values might increase [recall](https://en.wikipedia.org/wiki/Precision_and_recall) but also the number of false positives. Larger values might be useful for ignoring noise in a signal.

#### (Advanced settings) Set an imputation option

The imputation option allows you to address missing data in your streams. You can choose from the following methods to handle gaps:

- **Ignore missing data (default)**: The detector continues without factoring in missing data points, maintaining the existing data flow.
- **Fill with custom values**: Specify a custom value for each feature to replace missing data points, allowing for targeted imputation tailored to your data.
- **Fill with zeros**: Replace missing values with zeros. This is ideal when the absence of data itself indicates a significant event, such as a drop to zero in event counts.
- **Use previous values**: Fill gaps with the last observed value to maintain continuity in your time-series data. This method treats missing data as non-anomalous and carries the previous trend forward.
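
When configuring a detector through the API rather than the UI, these choices correspond to an imputation option on the detector definition. The following fragment is a sketch only; the field names shown (`imputation_option`, `method`, `default_fill`) are assumptions modeled on the plugin's configuration style, so confirm them against the Anomaly Detection API reference for your version:

```json
"imputation_option": {
  "method": "FIXED_VALUES",
  "default_fill": [
    { "feature_name": "logVolume", "data": 0 }
  ]
}
```

For the zero and previous-value behaviors, this sketch assumes `"method": "ZERO"` or `"method": "PREVIOUS"` with no fill values.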


Using these options can improve recall in anomaly detection. For instance, if you're monitoring for drops in event counts, including both partial and complete drops, filling missing values with zeros helps detect significant data absences, improving detection recall.

Note: Be cautious when imputing extensively missing data because excessive gaps can compromise model accuracy. Quality input is critical: poor data quality leads to poor model performance. To determine whether a feature value has been imputed, check the `feature_imputed` field in the anomaly result index. For more information, see [Anomaly result mapping]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/result-mapping/).


#### (Advanced settings) Suppress anomalies with threshold-based rules


You can suppress anomalies by setting rules that define acceptable differences between the expected and actual values, either as an absolute value or a relative percentage. This helps reduce false anomalies caused by minor fluctuations, allowing you to focus on significant deviations.

Suppose you want to detect substantial changes in log volume while ignoring small variations that aren't meaningful. Without customized settings, the system might generate false alerts for minor changes, making it difficult to identify true anomalies. By setting suppression rules, you can filter out minor deviations and home in on genuinely anomalous patterns.

If you want to suppress anomalies for deviations smaller than 30% from the expected value, you can set the following rules:

```
Ignore anomalies for feature logVolume when the actual value is no more than 30% above the expected value.
Ignore anomalies for feature logVolume when the actual value is no more than 30% below the expected value.
```

Note: Make sure that the feature (for example, `logVolume`) is properly defined in your model because suppression rules are tied to specific features.


If you expect that the log volume should differ by at least 10,000 from the expected value before being considered an anomaly, you can set absolute thresholds:

```
Ignore anomalies for feature logVolume when the actual value is no more than 10000 above the expected value.
Ignore anomalies for feature logVolume when the actual value is no more than 10000 below the expected value.
```

If no custom suppression rules are set, the system defaults to a filter that ignores anomalies with deviations of less than 20% from the expected value for each enabled feature.
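
In the detector definition, suppression rules of this kind can be expressed as rule objects attached to the detector. The following fragment sketches the relative-threshold example above; the field names and enum values (`rules`, `action`, `threshold_type`, `operator`) are assumptions modeled on the plugin's rule configuration, so verify them against the API reference for your version:

```json
"rules": [
  {
    "action": "IGNORE_ANOMALY",
    "conditions": [
      {
        "feature_name": "logVolume",
        "threshold_type": "ACTUAL_OVER_EXPECTED_RATIO",
        "operator": "LTE",
        "value": 0.3
      },
      {
        "feature_name": "logVolume",
        "threshold_type": "EXPECTED_OVER_ACTUAL_RATIO",
        "operator": "LTE",
        "value": 0.3
      }
    ]
  }
]
```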

#### Preview sample anomalies

75 changes: 75 additions & 0 deletions _observing-your-data/ad/result-mapping.md
@@ -80,6 +80,81 @@ Field | Description
`model_id` | A unique ID that identifies a model. If a detector is a single-stream detector (with no category field), it has only one model. If a detector is a high-cardinality detector (with one or more category fields), it might have multiple models, one for each entity.
`threshold` | One of the criteria for a detector to classify a data point as an anomaly is that its `anomaly_score` must surpass a dynamic threshold. This field records the current threshold.

When the imputation option is enabled, the anomaly result output includes a `feature_imputed` array indicating whether each feature was imputed. This information helps you understand which features were modified during the anomaly detection process because of missing data. If no features were imputed, the `feature_imputed` array is omitted from the results.

> Reviewer comment (Collaborator): Above: "result includes" => "results include"?

In the following example, the feature `processing_bytes_max` was imputed, as indicated by its `imputed: true` status:

```json
{
"detector_id": "kzcZ43wBgEQAbjDnhzGF",
"schema_version": 5,
"data_start_time": 1635898161367,
"data_end_time": 1635898221367,
"feature_data": [
{
"feature_id": "processing_bytes_max",
"feature_name": "processing bytes max",
"data": 2322
},
{
"feature_id": "processing_bytes_avg",
"feature_name": "processing bytes avg",
"data": 1718.6666666666667
},
{
"feature_id": "processing_bytes_min",
"feature_name": "processing bytes min",
"data": 1375
},
{
"feature_id": "processing_bytes_sum",
"feature_name": "processing bytes sum",
"data": 5156
},
{
"feature_id": "processing_time_max",
"feature_name": "processing time max",
"data": 31198
}
],
"execution_start_time": 1635898231577,
"execution_end_time": 1635898231622,
"anomaly_score": 1.8124904404395776,
"anomaly_grade": 0,
"confidence": 0.9802940756605277,
"entity": [
{
"name": "process_name",
"value": "process_3"
}
],
"model_id": "kzcZ43wBgEQAbjDnhzGF_entity_process_3",
"threshold": 1.2368549346675202,
"feature_imputed": [
{
"feature_id": "processing_bytes_max",
"imputed": true
},
{
"feature_id": "processing_bytes_avg",
"imputed": false
},
{
"feature_id": "processing_bytes_min",
"imputed": false
},
{
"feature_id": "processing_bytes_sum",
"imputed": false
},
{
"feature_id": "processing_time_max",
"imputed": false
}
]
}
```

If an anomaly detector detects an anomaly, the result has the following format:

```json