Siembol alert is a detection engine used to filter matching events from an incoming data stream based on a configurable rule set. The correlation alert allows you to group several detections together before raising an alert.
The following fields are common to both alerts and correlation alerts:
rule_name
- Rule name that uniquely identifies the rule
rule_author
- The author of the rule, i.e., the user who last modified the rule
rule_version
- The version of the rule
rule_description
- This field contains a single text input that allows you to set a description for the alert. This should be a short, helpful comment that allows anyone to identify the purpose of the alert.
tags
- Tags are optional but recommended, as they allow you to add tags to the event after the rule matches. Each tag is a key-value pair. Both the key and the value inputs are completely free form, allowing you to tag your rules in the way that works best for your organisation. You can use substitution in the value input to set the tag value equal to the value of a field from the event. The syntax for this is ${field_name}
tag_name
- The name of the tag
tag_value
- The value of the tag.
Note: if you want an alert to be correlated in the correlation engine, use a tag with the name "correlation_key". The alert will be silent unless you also set a tag with the name "correlation_alert_visible".
rule_protection
- Rule Protection allows you to prevent a noisy alert from flooding the components downstream. You can set the maximum number of times an alert can fire per hour and per day. If either limit is exceeded, any matching event is sent to the error topic instead of the output topic until the threshold resets. Rule Protection is optional; if it is not configured for a rule, the global defaults are applied.
max_per_hour
- Maximum alerts allowed per hour
max_per_day
- Maximum alerts allowed per day
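For illustration, a minimal sketch of how these common fields might appear in a rule configuration. The field names are the ones described above; the JSON nesting and all values are illustrative assumptions rather than a verified schema.

```json
{
  "rule_name": "suspicious_login_rule",
  "rule_author": "jane.doe",
  "rule_version": 3,
  "rule_description": "Detects logins from unusual countries",
  "tags": [
    { "tag_name": "correlation_key", "tag_value": "${source_ip}" },
    { "tag_name": "team", "tag_value": "detection-engineering" }
  ],
  "rule_protection": {
    "max_per_hour": 100,
    "max_per_day": 500
  }
}
```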
source_type
- This field allows you to determine the type of data you want to match on. It is essentially a matcher for the "source_type" field. This field does not support regex; however, using * as an input matches all source types. The source_type
field is set during parsing and is equal to the name of the last parser which was used to parse the log.
Tip: if you want to match on multiple data sources, set the source type to * and add a regex matcher (in the matcher section) to filter down to your desired source types; see the sketch after the matcher fields below.
Matchers allow you to select the events you want the rule to alert on.
is_enabled
- The matcher is enabled
description
- The description of the matcher
matcher_type
- The type of matcher, one of REGEX_MATCH, IS_IN_SET, CONTAINS, NUMERIC_COMPARE, COMPOSITE_AND or COMPOSITE_OR
is_negated
- The matcher result is negated (default: false)
field
- The name of the field on which the matcher will be evaluated
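A hedged sketch combining the tip above (source_type set to *) with the common matcher attributes. The data input is specific to the REGEX_MATCH type described below; the JSON layout and all values are illustrative assumptions only.

```json
{
  "source_type": "*",
  "matchers": [
    {
      "matcher_type": "REGEX_MATCH",
      "is_enabled": true,
      "is_negated": false,
      "description": "Only match firewall-style source types",
      "field": "source_type",
      "data": "firewall_(?<site>[a-z]+)"
    }
  ]
}
```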
There are six types of matchers:
REGEX_MATCH
- A regex_match allows you to use a regex statement to match a specified field. There are two string inputs:
data
- The regex statement, using Java regex syntax (see https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html), except that underscores are allowed in the names of captured groups. Named capture groups in the regex are added as fields in the event; they are available from the next matcher onwards and are included in the output event.
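An illustrative REGEX_MATCH sketch with a named capture group; assuming the layout above, the group host_name would be added to the event as a new field. The field name and values are hypothetical.

```json
{
  "matcher_type": "REGEX_MATCH",
  "is_enabled": true,
  "is_negated": false,
  "description": "Extract the host name from the url field",
  "field": "url",
  "data": "^https?://(?<host_name>[^/]+)/.*$"
}
```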
IS_IN_SET
- An "is_in_set" matcher compares the value of a field to a set of strings defined indata
. If the value is in the set then the matcher returns true.data
- A list of strings to compare the value to. New line delimited. Does not support regex - each line must be a literal match however, field substitution is supported in this field.case_insensitive
- Use case-insensitive string compare
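An illustrative IS_IN_SET sketch; the new-line delimited set is expressed with \n escapes in JSON. The field name and values are hypothetical.

```json
{
  "matcher_type": "IS_IN_SET",
  "is_enabled": true,
  "is_negated": false,
  "description": "Match well-known privileged accounts",
  "field": "user_name",
  "case_insensitive": true,
  "data": "admin\nroot\nguest"
}
```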
CONTAINS
- A "contains" matcher searches for substring defined indata
. If the pattern is found the matcher returns true.data
- A pattern to search. Field substitution is supported in this field.case_insensitive
- Use case-insensitive string comparestarts_with
- The field value starts with the patternends_with
- The field value ends with the pattern
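An illustrative CONTAINS sketch; with starts_with and ends_with left false, the pattern can appear anywhere in the field value. The field name and values are hypothetical.

```json
{
  "matcher_type": "CONTAINS",
  "is_enabled": true,
  "is_negated": false,
  "description": "Command line mentions powershell anywhere",
  "field": "command_line",
  "case_insensitive": true,
  "starts_with": false,
  "ends_with": false,
  "data": "powershell"
}
```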
NUMERIC_COMPARE
- A matcher that compares the numeric value of a field with an expression using one of several comparison types
compare_type
- The type of numeric comparison, one of equal, lesser_equal, lesser, greater or greater_equal
expression
- The numeric value of the field is compared with the expression. The expression can be a numeric constant or a string that contains a variable.
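An illustrative NUMERIC_COMPARE sketch that matches when the field value is at least 5; the expression could equally be a string containing a variable such as ${some_field}. The field name and values are hypothetical.

```json
{
  "matcher_type": "NUMERIC_COMPARE",
  "is_enabled": true,
  "is_negated": false,
  "description": "At least five failed logins",
  "field": "failed_login_count",
  "compare_type": "greater_equal",
  "expression": "5"
}
```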
COMPOSITE_AND
- Used to combine the matchers from the list matchers with an AND logic operation
COMPOSITE_OR
- Used to combine the matchers from the list matchers with an OR logic operation
Note: A composite matcher is recursive in the alerting engine; however, the level of recursion is limited to 3 in the Siembol UI for simplicity.
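An illustrative composite sketch, assuming a nested matchers list as shown; the matcher returns true when either inner matcher matches. All field names and values are hypothetical.

```json
{
  "matcher_type": "COMPOSITE_OR",
  "is_enabled": true,
  "is_negated": false,
  "description": "Either a temporary path or a well-known guest account",
  "matchers": [
    {
      "matcher_type": "CONTAINS",
      "is_enabled": true,
      "is_negated": false,
      "field": "process_path",
      "data": "\\temp\\"
    },
    {
      "matcher_type": "IS_IN_SET",
      "is_enabled": true,
      "is_negated": false,
      "field": "user_name",
      "data": "guest\nanonymous"
    }
  ]
}
```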
Global tags and global rule protection are defined in the deployment of the rules. These are added to the alert after matching unless they are overridden by individual rule settings. The global tag with the name detection_source
is used to identify the detection engine that triggers the alert.
The correlation alert allows you to group several detections together before raising an alert. The primary use case is a group of detections that individually shouldn't be alerted on (e.g., high-volume detections or detections with a high false positive rate); grouping several of them together gives more reliable alerts.
correlation_attributes
- This field allows you to configure which detections to correlate together
time_unit
- A field that allows you to configure the time unit to use; this is a fixed option with the choices hours, minutes or seconds
time_window
- A field to set the time window, in the selected time unit, for the correlation
time_computation_type
- Configures how the time window is calculated
event_time
- The time window is calculated using the timestamp field in the events; the timestamp field is usually computed during parsing from the log
processing_time
- The time window is calculated using the current time (when an alert is evaluated); the events need to be processed by the correlation alert component within the time window
max_time_lag_in_sec
- Events with a timestamp older than the current time minus the lag (in seconds) will be discarded
alerts_threshold
- Allows you to configure how many detections (you can specify which detections later) need to trigger in the time window for the alert to trigger. This field accepts an integer value; if it is left empty then all detections need to trigger before an alert is created
alerts
- The list of alerts for correlation
alert
- The alert name used for correlation
threshold
- The number of times the alert has to trigger in the time window
mandatory
- The alert must pass the threshold for the rule to match
fields_to_send
- The list of fields of correlated alerts that will be included in the triggered alert after matching
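A hedged sketch of correlation_attributes using the fields above; with these illustrative values, an alert fires when at least two of the listed detections (including the mandatory one) trigger within a 30-minute window. The nesting and values are assumptions, not a verified schema.

```json
{
  "correlation_attributes": {
    "time_unit": "minutes",
    "time_window": 30,
    "time_computation_type": "event_time",
    "max_time_lag_in_sec": 60,
    "alerts_threshold": 2,
    "alerts": [
      { "alert": "suspicious_login_alert", "threshold": 1, "mandatory": true },
      { "alert": "unusual_process_alert", "threshold": 2, "mandatory": false }
    ],
    "fields_to_send": ["user_name", "source_ip"]
  }
}
```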
alerts.topology.name
- The name of the storm topology
alerts.input.topics
- The list of kafka input topics for reading messages
kafka.error.topic
- The kafka error topic for error messages
alerts.output.topic
- The kafka output topic for publishing alerts
alerts.correlation.output.topic
- The kafka topic for alerts used for correlation by correlation rules
kafka.producer.properties
- Defines kafka producer properties, see https://kafka.apache.org/0102/documentation.html#producerconfigs
zookeeper.attributes
- The zookeeper attributes for updating the rules
zk.url
- Zookeeper servers url. Multiple servers are separated by a comma
zk.path
- Path to a zookeeper node or multiple nodes delimited by new line. Alerting rules from multiple zookeeper nodes can be loaded in order to save storm resources
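A hedged sketch of these application attributes; the attribute names come from this page, while the grouping under kafka.producer.properties and zookeeper.attributes and all hostnames, topics and paths are illustrative assumptions.

```json
{
  "alerts.topology.name": "siembol-alerts",
  "alerts.input.topics": ["siembol.parsed"],
  "kafka.error.topic": "siembol.error",
  "alerts.output.topic": "siembol.alerts",
  "alerts.correlation.output.topic": "siembol.alerts.correlation",
  "kafka.producer.properties": {
    "bootstrap.servers": "kafka-1:9092,kafka-2:9092"
  },
  "zookeeper.attributes": {
    "zk.url": "zookeeper-1:2181,zookeeper-2:2181",
    "zk.path": "/siembol/alerting/rules"
  }
}
```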
storm.attributes
- Storm attributes for the alerting topology
bootstrap.servers
- Kafka brokers servers url. Multiple servers are separated by a comma
first.pool.offset.strategy
- Defines how the kafka spout seeks the offset to be used in the first poll to kafka
kafka.spout.properties
- Defines kafka consumer attributes for the kafka spout, such as group.id and protocol, see https://kafka.apache.org/0102/documentation.html#consumerconfigs
poll.timeout.ms
- Kafka consumer parameter poll.timeout.ms used in the kafka spout
offset.commit.period.ms
- Specifies the period of time (in milliseconds) after which the spout commits to Kafka, see https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/storm-moving-data/content/tuning_kafkaspout_performance.html
max.uncommitted.offsets
- Defines the maximum number of polled offsets (records) that can be pending commit before another poll can take place
storm.config
- Defines storm attributes for a topology, see https://storm.apache.org/releases/current/Configuration.html
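A hedged sketch of storm.attributes; the keys are those listed above, while the nesting and all values (offset strategy, consumer properties, storm config) are illustrative assumptions only.

```json
{
  "storm.attributes": {
    "bootstrap.servers": "kafka-1:9092,kafka-2:9092",
    "first.pool.offset.strategy": "UNCOMMITTED_LATEST",
    "kafka.spout.properties": {
      "group.id": "siembol.alerts.reader",
      "security.protocol": "PLAINTEXT"
    },
    "poll.timeout.ms": 200,
    "offset.commit.period.ms": 30000,
    "max.uncommitted.offsets": 10000000,
    "storm.config": {
      "topology.workers": 2
    }
  }
}
```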
kafka.spout.num.executors
- The number of executors for reading from the kafka input topic
alerts.engine.bolt.num.executors
- The number of executors for evaluating alerting rules
kafka.writer.bolt.num.executors
- The number of executors for producing alerts to output topic
alerts.engine
- This field should be set to siembol_alerts
alerts.engine
- This field should be set to siembol_correlation_alerts
alerts.engine.clean.interval.sec
- The period in seconds for regular cleaning of rule correlation data that is no longer needed for further rule evaluation
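A hedged sketch of the engine-related and executor settings for a correlation alerting deployment; the keys come from this page, while the values are illustrative assumptions.

```json
{
  "alerts.engine": "siembol_correlation_alerts",
  "alerts.engine.clean.interval.sec": 3600,
  "kafka.spout.num.executors": 2,
  "alerts.engine.bolt.num.executors": 2,
  "kafka.writer.bolt.num.executors": 2
}
```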