Strict yml fields #3202

Closed
wants to merge 25 commits

Commits
3b5c309
Fix up lookups and stories
pyth0n1c Nov 7, 2024
ad42cb1
fix baselines and investigations
pyth0n1c Nov 7, 2024
486aafb
remove risk_score field from all
pyth0n1c Nov 7, 2024
18cc9f5
remove update_timestamp from any
pyth0n1c Nov 7, 2024
114fe7b
Merge branch 'develop' into strict_yml_fields
pyth0n1c Nov 12, 2024
5e5511b
Cleanup, mostly removed datamodel fields and risk_score
pyth0n1c Nov 12, 2024
2639559
Branch was auto-updated.
patel-bhavin Nov 14, 2024
d1a6bbb
Branch was auto-updated.
patel-bhavin Nov 14, 2024
62cbaa2
Branch was auto-updated.
patel-bhavin Nov 14, 2024
c5ac6ba
Branch was auto-updated.
patel-bhavin Nov 14, 2024
410548d
Removed required_fields from, well,
pyth0n1c Nov 15, 2024
904df00
Remove context from detections.tags for a number of detections
pyth0n1c Nov 18, 2024
689828a
Merge branch 'develop' into strict_yml_fields
patel-bhavin Nov 19, 2024
494748b
Branch was auto-updated.
patel-bhavin Nov 19, 2024
5bce549
Branch was auto-updated.
patel-bhavin Nov 20, 2024
2cc89dd
Remove extra fields from recently merged detections
pyth0n1c Nov 20, 2024
c90742b
Merge branch 'develop' into strict_yml_fields
patel-bhavin Nov 20, 2024
be8ae99
Clean up new detections by removing required_fields, risk_score, and …
pyth0n1c Nov 27, 2024
04b1cfd
Branch was auto-updated.
patel-bhavin Dec 2, 2024
5651f42
Branch was auto-updated.
patel-bhavin Dec 2, 2024
e60028a
Merge branch 'develop' into strict_yml_fields
pyth0n1c Dec 12, 2024
d307206
remove extra fields
pyth0n1c Dec 12, 2024
d45212a
Branch was auto-updated.
patel-bhavin Dec 16, 2024
f7829fa
Branch was auto-updated.
patel-bhavin Dec 16, 2024
8cadd2c
Branch was auto-updated.
patel-bhavin Dec 16, 2024
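
Taken together, the commits above remove YAML keys that a strict schema no longer accepts: datamodel, risk_score, update_timestamp, required_fields, and the context tag. Below is a minimal sketch of what strict field enforcement can look like, assuming a contentctl-style Pydantic model; the class name and declared fields are illustrative assumptions, not the project's actual schema.

# Hypothetical sketch: strict YAML field validation with Pydantic v2.
from pydantic import BaseModel, ConfigDict


class Baseline(BaseModel):
    # extra="forbid" rejects any YAML key not declared on the model,
    # e.g. the datamodel, risk_score, and required_fields keys removed in this PR.
    model_config = ConfigDict(extra="forbid")

    name: str          # illustrative field set, not the real schema
    version: int
    type: str
    description: str
    search: str


doc = {
    "name": "Baseline Of Blocked Outbound Traffic From AWS",
    "version": 1,
    "type": "Baseline",
    "description": "...",
    "search": "...",
    "datamodel": [],  # extra key -> raises pydantic.ValidationError
}

try:
    Baseline(**doc)
except Exception as err:  # pydantic.ValidationError
    print(err)

If validation is configured this way, any lingering key fails the build, which would explain why the diffs below delete the fields outright instead of leaving them inert.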
6 changes: 0 additions & 6 deletions baselines/baseline_of_blocked_outbound_traffic_from_aws.yml
@@ -4,7 +4,6 @@ version: 1
 date: '2018-05-07'
 author: Bhavin Patel, Splunk
 type: Baseline
-datamodel: []
 description: This search establishes, on a per-hour basis, the average and the standard
   deviation of the number of outbound connections blocked in your VPC flow logs by
   each source IP address (IP address of your EC2 instances). Also recorded is the
@@ -34,9 +33,4 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - _time
-  - action
-  - src_ip
-  - dest_ip
   security_domain: network
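
For context, the deleted required_fields block was descriptive metadata only; the baseline's actual logic is the per-source-IP hourly average and standard deviation described above. A plain-Python sketch of that arithmetic, with made-up rows standing in for the hourly blocked-connection counts the SPL would produce:

# Illustrative only: per-source-IP average and standard deviation of
# hourly blocked outbound connection counts, mirroring the description above.
from collections import defaultdict
from statistics import mean, stdev

# (src_ip, hour_of_day, blocked outbound connections in that hour) -- fake data
rows = [
    ("10.0.1.5", 13, 4), ("10.0.1.5", 14, 6), ("10.0.1.5", 15, 5),
    ("10.0.2.9", 13, 40), ("10.0.2.9", 14, 2), ("10.0.2.9", 15, 3),
]

per_ip = defaultdict(list)
for src_ip, _hour, blocked in rows:
    per_ip[src_ip].append(blocked)

for ip, counts in per_ip.items():
    sd = stdev(counts) if len(counts) > 1 else 0.0
    print(f"{ip}: avg={mean(counts):.1f} stdev={sd:.1f}")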
@@ -4,8 +4,6 @@ version: 1
 date: '2020-09-07'
 author: David Dorsey, Splunk
 type: Baseline
-datamodel:
-- Change
 description: This search is used to build a Machine Learning Toolkit (MLTK) model
   for how many API calls are performed by each user. By default, the search uses the
   last 90 days of data to build the model and the model is rebuilt weekly. The model
@@ -40,14 +38,10 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - _time
-  - All_Changes.user
-  - All_Changes.status
   security_domain: network
 deployment:
   scheduling:
     cron_schedule: 0 2 * * 0
     earliest_time: -90d@d
     latest_time: -1d@d
-    schedule_window: auto
\ No newline at end of file
+    schedule_window: auto
16 changes: 4 additions & 12 deletions baselines/baseline_of_cloud_instances_destroyed.yml
@@ -4,8 +4,6 @@ version: 1
 date: '2020-08-25'
 author: David Dorsey, Splunk
 type: Baseline
-datamodel:
-- Change
 description: This search is used to build a Machine Learning Toolkit (MLTK) model
   for how many instances are destroyed in the environment. By default, the search
   uses the last 90 days of data to build the model and the model is rebuilt weekly.
@@ -20,17 +18,16 @@ search: '| tstats count as instances_destroyed from datamodel=Change where All_C
   <= 5, 0, 1) | table _time instances_destroyed, HourOfDay, isWeekend | fit DensityFunction
   instances_destroyed by "HourOfDay,isWeekend" into cloud_excessive_instances_destroyed_v1
   dist=expon show_density=true'
-how_to_implement: 'You must have Enterprise Security 6.0 or later, if not you will
+how_to_implement: "You must have Enterprise Security 6.0 or later, if not you will
   need to verify that the Machine Learning Toolkit (MLTK) version 4.2 or later is
   installed, along with any required dependencies. Depending on the number of users
   in your environment, you may also need to adjust the value for max_inputs in the
   MLTK settings for the DensityFunction algorithm, then ensure that the search completes
   in a reasonable timeframe. By default, the search builds the model using the past
   30 days of data. You can modify the search window to build the model over a longer
   period of time, which may give you better results. You may also want to periodically
-  re-run this search to rebuild the model with the latest data.
-
-  More information on the algorithm used in the search can be found at `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`.'
+  re-run this search to rebuild the model with the latest data.\nMore information
+  on the algorithm used in the search can be found at `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`."
 known_false_positives: none
 references: []
 tags:
@@ -43,15 +40,10 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - _time
-  - All_Changes.action
-  - All_Changes.status
-  - All_Changes.object_category
   security_domain: network
 deployment:
   scheduling:
     cron_schedule: 0 2 * * 0
     earliest_time: -90d@d
     latest_time: -1d@d
-    schedule_window: auto
\ No newline at end of file
+    schedule_window: auto
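
The how_to_implement rewrite in this diff is behavior-preserving: the old single-quoted scalar separated its paragraphs with a blank line, which YAML folding turns into a single newline, while the new double-quoted scalar spells that newline as an explicit \n escape. A quick equivalence check, assuming PyYAML and using abbreviated text:

# Sanity check: both quoting styles load to the same string.
import yaml

single = "how_to_implement: 'first paragraph.\n\n  second paragraph.'\n"
double = 'how_to_implement: "first paragraph.\\nsecond paragraph."\n'

a = yaml.safe_load(single)["how_to_implement"]
b = yaml.safe_load(double)["how_to_implement"]
print(a == b)  # True: both encode "first paragraph.\nsecond paragraph."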
16 changes: 4 additions & 12 deletions baselines/baseline_of_cloud_instances_launched.yml
@@ -4,8 +4,6 @@ version: 1
 date: '2020-08-14'
 author: David Dorsey, Splunk
 type: Baseline
-datamodel:
-- Change
 description: This search is used to build a Machine Learning Toolkit (MLTK) model
   for how many instances are created in the environment. By default, the search uses
   the last 90 days of data to build the model and the model is rebuilt weekly. The
@@ -20,17 +18,16 @@ search: '| tstats count as instances_launched from datamodel=Change where (All_C
   <= 5, 0, 1) | table _time instances_launched, HourOfDay, isWeekend | fit DensityFunction
   instances_launched by "HourOfDay,isWeekend" into cloud_excessive_instances_created_v1
   dist=expon show_density=true'
-how_to_implement: 'You must have Enterprise Security 6.0 or later, if not you will
+how_to_implement: "You must have Enterprise Security 6.0 or later, if not you will
   need to verify that the Machine Learning Toolkit (MLTK) version 4.2 or later is
   installed, along with any required dependencies. Depending on the number of users
   in your environment, you may also need to adjust the value for max_inputs in the
   MLTK settings for the DensityFunction algorithm, then ensure that the search completes
   in a reasonable timeframe. By default, the search builds the model using the past
   90 days of data. You can modify the search window to build the model over a longer
   period of time, which may give you better results. You may also want to periodically
-  re-run this search to rebuild the model with the latest data.
-
-  More information on the algorithm used in the search can be found at `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`.'
+  re-run this search to rebuild the model with the latest data.\nMore information
+  on the algorithm used in the search can be found at `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`."
 known_false_positives: none
 references: []
 tags:
@@ -43,15 +40,10 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - _time
-  - All_Changes.action
-  - All_Changes.status
-  - All_Changes.object_category
   security_domain: network
 deployment:
   scheduling:
     cron_schedule: 0 2 * * 0
     earliest_time: -90d@d
     latest_time: -1d@d
-    schedule_window: auto
\ No newline at end of file
+    schedule_window: auto
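
For readers unfamiliar with MLTK, the `fit DensityFunction ... dist=expon` step in these searches fits an exponential density to the hourly counts, and a downstream detection flags values that land in the extreme tail. A conceptual sketch with SciPy rather than MLTK itself; the training values and the 1% tail cutoff are illustrative assumptions:

# Conceptual stand-in for MLTK's DensityFunction with dist=expon.
from scipy import stats

# hourly counts of instances launched/destroyed (fake training window)
train = [1, 0, 2, 1, 3, 0, 1, 2, 1, 0, 4, 2, 1, 1, 0, 2]

loc, scale = stats.expon.fit(train)         # fit an exponential density
cutoff = stats.expon.ppf(0.99, loc, scale)  # boundary of the top 1% tail

new_value = 15
print(f"cutoff={cutoff:.2f}, outlier={new_value > cutoff}")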
@@ -4,8 +4,6 @@ version: 1
 date: '2020-09-07'
 author: David Dorsey, Splunk
 type: Baseline
-datamodel:
-- Change
 description: This search is used to build a Machine Learning Toolkit (MLTK) model
   for how many API calls for security groups are performed by each user. By default,
   the search uses the last 90 days of data to build the model and the model is rebuilt
@@ -39,15 +37,10 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - _time
-  - All_Changes.user
-  - All_Changes.status
-  - All_Changes.object_category
   security_domain: network
 deployment:
   scheduling:
     cron_schedule: 0 2 * * 0
     earliest_time: -90d@d
     latest_time: -1d@d
-    schedule_window: auto
\ No newline at end of file
+    schedule_window: auto
10 changes: 2 additions & 8 deletions baselines/baseline_of_command_line_length___mltk.yml
@@ -4,7 +4,6 @@ version: 1
 date: '2019-05-08'
 author: Rico Valdez, Splunk
 type: Baseline
-datamodel: []
 description: This search is used to build a Machine Learning Toolkit (MLTK) model
   to characterize the length of the command lines observed for each user in the environment.
   By default, the search uses the last 30 days of data to build the model. The model
@@ -24,7 +23,8 @@ how_to_implement: You must be ingesting endpoint data and populating the Endpoin
   the past 30 days of data. You can modify the search window to build the model over
   a longer period of time, which may give you better results. You may also want to
   periodically re-run this search to rebuild the model with the latest data. More
-  information on the algorithm used in the search can be found at `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`.
+  information on the algorithm used in the search can be found at
+  `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`.
 known_false_positives: none
 references: []
 tags:
@@ -41,12 +41,6 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - _time
-  - Processes.user
-  - Processes.dest
-  - Processes.process_name
-  - Processes.process
   security_domain: endpoint
 deployment:
   scheduling:
9 changes: 2 additions & 7 deletions baselines/baseline_of_dns_query_length___mltk.yml
@@ -4,8 +4,6 @@ version: 1
 date: '2019-05-08'
 author: Rico Valdez, Splunk
 type: Baseline
-datamodel:
-- Network_Resolution
 description: This search is used to build a Machine Learning Toolkit (MLTK) model
   to characterize the length of the DNS queries for each DNS record type observed
   in the environment. By default, the search uses the last 30 days of data to build
@@ -22,7 +20,8 @@ how_to_implement: To successfully implement this search, you will need to ensure
   days of data. You can modify the search window to build the model over a longer
   period of time, which may give you better results. You may also want to periodically
   re-run this search to rebuild the model with the latest data. More information on
-  the algorithm used in the search can be found at `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`.
+  the algorithm used in the search can be found at
+  `https://docs.splunk.com/Documentation/MLApp/4.2.0/User/Algorithms#DensityFunction`.
 known_false_positives: none
 references: []
 tags:
@@ -36,10 +35,6 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - _time
-  - DNS.query
-  - DNS.record_type
   security_domain: network
 deployment:
   scheduling:
61 changes: 32 additions & 29 deletions baselines/baseline_of_kubernetes_container_network_io.yml
@@ -4,29 +4,37 @@ version: 4
 date: '2024-09-24'
 author: Matthew Moore, Splunk
 type: Baseline
-datamodel: []
-description: This baseline rule calculates the average and standard deviation of inbound and outbound network IO for each Kubernetes container.
-  It uses metrics from the Kubernetes API and the Splunk Infrastructure Monitoring Add-on. The rule generates a lookup table with the average and
-  standard deviation of the network IO for each container. This baseline can be used to detect anomalies in network communication behavior,
-  which may indicate security threats such as data exfiltration, command and control communication, or compromised container behavior.
-search: '| mstats avg(k8s.pod.network.io) as io where `kubernetes_metrics` by k8s.cluster.name k8s.pod.name k8s.node.name direction span=10s
-  | eval service = replace(''k8s.pod.name'', "-\w{5}$|-[abcdef0-9]{8,10}-\w{5}$", "")
-  | eval key = ''k8s.cluster.name'' + ":" + ''service''
-  | stats avg(eval(if(direction="transmit", io,null()))) as avg_outbound_network_io avg(eval(if(direction="receive", io,null()))) as avg_inbound_network_io
-  stdev(eval(if(direction="transmit", io,null()))) as stdev_outbound_network_io stdev(eval(if(direction="receive", io,null()))) as stdev_inbound_network_io
-  count latest(_time) as last_seen by key
-  | outputlookup k8s_container_network_io_baseline'
-how_to_implement: 'To implement this detection, follow these steps:
-  1. Deploy the OpenTelemetry Collector (OTEL) to your Kubernetes cluster.
-  2. Enable the hostmetrics/process receiver in the OTEL configuration.
-  3. Ensure that the process metrics, specifically Process.cpu.utilization and process.memory.utilization, are enabled.
-  4. Install the Splunk Infrastructure Monitoring (SIM) add-on (ref: https://splunkbase.splunk.com/app/5247)
-  5. Configure the SIM add-on with your Observability Cloud Organization ID and Access Token.
-  6. Set up the SIM modular input to ingest Process Metrics. Name this input "sim_process_metrics_to_metrics_index".
-  7. In the SIM configuration, set the Organization ID to your Observability Cloud Organization ID.
-  8. Set the Signal Flow Program to the following: data(''process.threads'').publish(label=''A''); data(''process.cpu.utilization'').publish(label=''B''); data(''process.cpu.time'').publish(label=''C''); data(''process.disk.io'').publish(label=''D''); data(''process.memory.usage'').publish(label=''E''); data(''process.memory.virtual'').publish(label=''F''); data(''process.memory.utilization'').publish(label=''G''); data(''process.cpu.utilization'').publish(label=''H''); data(''process.disk.operations'').publish(label=''I''); data(''process.handles'').publish(label=''J''); data(''process.threads'').publish(label=''K'')
-  9. Set the Metric Resolution to 10000.
-  10. Leave all other settings at their default values.'
+description: This baseline rule calculates the average and standard deviation of inbound
+  and outbound network IO for each Kubernetes container. It uses metrics from the
+  Kubernetes API and the Splunk Infrastructure Monitoring Add-on. The rule generates
+  a lookup table with the average and standard deviation of the network IO for each
+  container. This baseline can be used to detect anomalies in network communication
+  behavior, which may indicate security threats such as data exfiltration, command
+  and control communication, or compromised container behavior.
+search: "| mstats avg(k8s.pod.network.io) as io where `kubernetes_metrics` by k8s.cluster.name
+  k8s.pod.name k8s.node.name direction span=10s | eval service = replace('k8s.pod.name',
+  \"-\\w{5}$|-[abcdef0-9]{8,10}-\\w{5}$\", \"\") | eval key = 'k8s.cluster.name' +
+  \":\" + 'service' | stats avg(eval(if(direction=\"transmit\", io,null()))) as avg_outbound_network_io
+  avg(eval(if(direction=\"receive\", io,null()))) as avg_inbound_network_io stdev(eval(if(direction=\"\
+  transmit\", io,null()))) as stdev_outbound_network_io stdev(eval(if(direction=\"\
+  receive\", io,null()))) as stdev_inbound_network_io count latest(_time) as last_seen
+  by key | outputlookup k8s_container_network_io_baseline"
+how_to_implement: "To implement this detection, follow these steps: 1. Deploy the
+  OpenTelemetry Collector (OTEL) to your Kubernetes cluster. 2. Enable the hostmetrics/process
+  receiver in the OTEL configuration. 3. Ensure that the process metrics, specifically
+  Process.cpu.utilization and process.memory.utilization, are enabled. 4. Install
+  the Splunk Infrastructure Monitoring (SIM) add-on (ref: https://splunkbase.splunk.com/app/5247)
+  5. Configure the SIM add-on with your Observability Cloud Organization ID and Access
+  Token. 6. Set up the SIM modular input to ingest Process Metrics. Name this input
+  \"sim_process_metrics_to_metrics_index\". 7. In the SIM configuration, set the Organization
+  ID to your Observability Cloud Organization ID. 8. Set the Signal Flow Program to
+  the following: data('process.threads').publish(label='A'); data('process.cpu.utilization').publish(label='B');
+  data('process.cpu.time').publish(label='C'); data('process.disk.io').publish(label='D');
+  data('process.memory.usage').publish(label='E'); data('process.memory.virtual').publish(label='F');
+  data('process.memory.utilization').publish(label='G'); data('process.cpu.utilization').publish(label='H');
+  data('process.disk.operations').publish(label='I'); data('process.handles').publish(label='J');
+  data('process.threads').publish(label='K') 9. Set the Metric Resolution to 10000.
+  10. Leave all other settings at their default values."
 known_false_positives: none
 references: []
 tags:
@@ -38,15 +46,10 @@ tags:
   - Splunk Enterprise
   - Splunk Enterprise Security
   - Splunk Cloud
-  required_fields:
-  - k8s.pod.network.io
-  - k8s.cluster.name
-  - k8s.node.name
-  - k8s.pod.name
   security_domain: network
 deployment:
   scheduling:
     cron_schedule: 0 2 * * 0
     earliest_time: -30d@d
     latest_time: -1d@d
-    schedule_window: auto
\ No newline at end of file
+    schedule_window: auto
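
The reformatted search still writes avg_*/stdev_* network-IO columns keyed by cluster:service into the k8s_container_network_io_baseline lookup. A sketch of how a downstream detection might consume that lookup; the three-sigma multiplier, keys, and numbers are assumptions, not the shipped detection logic:

# Hypothetical consumer of the k8s_container_network_io_baseline lookup.
baseline = {
    # key -> (avg_outbound_network_io, stdev_outbound_network_io)
    "prod-cluster:web": (1_200_000.0, 150_000.0),
    "prod-cluster:worker": (300_000.0, 40_000.0),
}

observed = {"prod-cluster:web": 1_400_000.0, "prod-cluster:worker": 900_000.0}

for key, io in observed.items():
    avg, sd = baseline[key]
    if io > avg + 3 * sd:  # simple three-sigma rule, illustrative only
        print(f"anomalous outbound IO for {key}: {io:,.0f} (baseline {avg:,.0f})")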