Haystack: Anomaly detection and alerting

kpeswani edited this page Feb 21, 2019 · 8 revisions

High Level Architecture

The diagram below shows a high-level overview of how Haystack integrates with Expedia's open-source adaptive-alerting and alert-management systems.

Ideology

The ideology behind the above design is to decouple Haystack from the alerting system, so that any alerting system can be integrated with Haystack as needed. Only the integration sub-system within Haystack has to be replaced.

To follow this ideology, the sub-system should be based on the following principles:

  • The sub-system should have a mapper that converts haystack-trends output into a format understood by the alerting system.
  • The sub-system should implement an interface through which Haystack UI can query anomalies.
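
As a minimal sketch of these two principles, the integration sub-system could be expressed as two small interfaces. Everything here is an assumption made for illustration: the names `Trend`, `TrendMapper`, `AnomalyReader`, `PassThroughMapper` and `FlatLineMapper` do not come from the Haystack codebase, and the "flat line" format belongs to an imaginary alerting system.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Trend:
    """Hypothetical shape of one data point emitted by haystack-trends."""
    name: str
    value: float
    timestamp: int  # epoch seconds
    tags: Dict[str, str] = field(default_factory=dict)


class TrendMapper(ABC):
    """Principle 1: convert a trend into whatever the alerting system understands."""

    @abstractmethod
    def to_alerting_format(self, trend: Trend) -> Any:
        ...


class AnomalyReader(ABC):
    """Principle 2: the interface the UI would use to query anomalies."""

    @abstractmethod
    def find_anomalies(self, service: str, start: int, end: int) -> List[dict]:
        ...


class PassThroughMapper(TrendMapper):
    """Used when the alerting system already consumes the trend format as-is."""

    def to_alerting_format(self, trend: Trend) -> Trend:
        return trend


class FlatLineMapper(TrendMapper):
    """Mapper for an imaginary system that expects 'name value timestamp' lines."""

    def to_alerting_format(self, trend: Trend) -> str:
        return f"{trend.name} {trend.value} {trend.timestamp}"
```

Under these assumptions, replacing the alerting system means supplying a different `TrendMapper` and `AnomalyReader`, while the code that produces trends stays unchanged.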

Adaptive Alerting Integration

There will be a sub-system called Haystack-Alerting. Its responsibilities are to:

  • Map the trends produced by haystack-trends into a format understood by the adaptive-alerting system. Haystack-trends produces data in Metrics 2.0 format, and the adaptive-alerting system consumes data in the same format. Hence the mapper is not strictly required, and adaptive-alerting can consume trends directly from the metrics topic. However, keeping this bridge allows Haystack to replace adaptive-alerting with another system later.
  • Implement an interface for Haystack UI to interact with the subscription management feature of the alert-management system. This enables Haystack consumers to subscribe to alerts.
  • Store the anomalies produced by adaptive-alerting in a persistent store such as Elasticsearch, and implement an interface for Haystack UI to query anomalies so that they can be shown in the UI.
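
The third responsibility (store anomalies, then let the UI query them) can be sketched as follows. This is only an illustration: an in-memory dictionary stands in for the persistent store, and the `Anomaly` fields and the `InMemoryAnomalyStore` class are hypothetical, not the actual schema or API used by Haystack or adaptive-alerting.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Anomaly:
    """Hypothetical anomaly record; the real schema may differ."""
    service: str
    metric: str
    timestamp: int   # epoch seconds
    observed: float
    expected: float


class InMemoryAnomalyStore:
    """Stand-in for a persistent store such as Elasticsearch."""

    def __init__(self) -> None:
        self._by_service: Dict[str, List[Anomaly]] = {}

    def save(self, anomaly: Anomaly) -> None:
        # Group anomalies by service so that per-service queries stay simple.
        self._by_service.setdefault(anomaly.service, []).append(anomaly)

    def query(self, service: str, start: int, end: int) -> List[Anomaly]:
        """Return anomalies for `service` with start <= timestamp < end, oldest first."""
        hits = [
            a for a in self._by_service.get(service, [])
            if start <= a.timestamp < end
        ]
        return sorted(hits, key=lambda a: a.timestamp)
```

A UI-facing endpoint would then call something like `store.query("checkout", start, end)` and render the returned records.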

Flow of data

Haystack-trends publishes trends to the metrics topic. The adaptive-alerting system consumes these trends and produces anomalies. The anomalies are then consumed by the anomaly-store sub-system within the haystack-alerting system and stored in Elasticsearch. An API service called alert-api acts as the interface through which Haystack UI queries anomalies from Elasticsearch and performs subscription management.
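
The stages of this flow can be traced with a toy pipeline. This sketch makes several assumptions: in-memory deques stand in for the message topics, a plain list stands in for Elasticsearch, and a fixed threshold stands in for adaptive-alerting, whose real detection models are not described here. The function names and the dictionary keys are hypothetical.

```python
from collections import deque
from typing import Any, Deque, Dict, List

# Hypothetical point shape: {"service", "metric", "value", "timestamp"}
MetricPoint = Dict[str, Any]


def run_pipeline(metrics_topic: Deque[MetricPoint], threshold: float) -> List[MetricPoint]:
    """Drain the metrics topic and return the contents of the anomaly store."""
    anomalies_topic: Deque[MetricPoint] = deque()

    # Stage 1: the detector (stand-in for adaptive-alerting) consumes trends
    # from the metrics topic and publishes anomalies.
    while metrics_topic:
        point = metrics_topic.popleft()
        if point["value"] > threshold:
            anomalies_topic.append(point)

    # Stage 2: the anomaly-store sub-system consumes anomalies and persists them.
    store: List[MetricPoint] = []
    while anomalies_topic:
        store.append(anomalies_topic.popleft())
    return store


def alert_api_query(store: List[MetricPoint], service: str) -> List[MetricPoint]:
    """Stage 3: the query side of alert-api, as the UI would call it."""
    return [p for p in store if p["service"] == service]
```

Each stage only knows about the topic or store next to it, which is what lets the detector in the middle be swapped for a different alerting system.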

Anomaly vs Alert

An anomaly is a deviation from normal or expected behaviour. However, not all anomalies are alerts. An alert is an anomaly on which some action needs to be taken. For example, we might notify the owners of the data about anomalous behaviour in a service only in the case of an alert. An alert can also be defined as the root cause of anomalous behaviour across several services.
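
To make the distinction concrete, here is one possible way to promote anomalies to alerts. The policy is purely an assumption chosen for illustration (raise an alert only when a metric has been anomalous for several consecutive points); the text above does not prescribe any particular rule, and `alerts_from_anomalies` and `min_consecutive` are hypothetical names.

```python
from typing import Iterable, List


def alerts_from_anomalies(flags: Iterable[bool], min_consecutive: int = 3) -> List[int]:
    """Return the indices at which a run of anomalous points first reaches
    `min_consecutive`. Each such index represents one alert.

    `flags` holds one boolean per data point: True if the point was anomalous.
    """
    alerts: List[int] = []
    run = 0
    for i, anomalous in enumerate(flags):
        # Extend the current run of anomalies, or reset it on a normal point.
        run = run + 1 if anomalous else 0
        if run == min_consecutive:
            alerts.append(i)  # fire once per run, not on every later point
    return alerts
```

With this rule, an isolated anomalous point is recorded but never notifies anyone, while a sustained run produces exactly one alert.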