Troubleshoot missing extracted indicators #162

Open
dspruell-i01 opened this issue Oct 23, 2024 · 1 comment
Labels: blocking (Requires some external dependency to be met), bug, threat-intel (Issue originated from Threat Intel (TI) team)

Comments


dspruell-i01 commented Oct 23, 2024

This issue is marked as blocking because automated OSINT collection is partially broken, potentially to a significant degree. It has been broken for some time (the problem was first encountered nearly two years ago, possibly longer).

We have noticed that some OSINT-collected indicators are not showing up in our intelligence data stores, and we have observed the same thing historically: on numerous occasions, indicators that we know are present in the RSS or Sitemap feeds we collect did not show up in our collections. A previous Engineering resource, Trevor, started diagnosing this and reported seeing indicators being extracted but never reaching outputs. In this case, the indicators do not appear to be extracted at all (and therefore would not be seen in outputs, although they should be).

  1. Source RSS feeds are validated as functioning and added to ThreatIngestor's configuration.
  2. ThreatIngestor runs and extracts indicators from feed sources, as validated in ThreatIngestor logs.
  3. Expected indicators do not appear in threat intel data stores, as verified using lookups against our API services to query C2 Feed, IOCDB, REPDB, and TIDB.

Example

Source blog post:

2024-10-22 https://blog.talosintelligence.com/gophish-powerrat-dcrat/

Feed is configured in ThreatIngestor:

- name: rss-talos
  module: rss
  url: http://feeds.feedburner.com/feedburner/Talos
  feed_type: messy

Feed is verified to be functional, and the target post is found in the feed content:

[Screenshot: image.png]

This sample indicator is listed in the post:

94[.]103[.]85[.]47 (94.103.85.47)

This should be extracted, refanged, and sent to the configured output(s) by ThreatIngestor.
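
To make the expected transformation concrete, here is a minimal sketch of pulling a defanged IPv4 address out of post text and restoring its dots. It is an illustration only: the regular expression and the `extract_ipv4` helper are hypothetical and are not ThreatIngestor's own extraction code, which may handle many more defanging styles and indicator types.

```python
import ipaddress
import re

# Hypothetical pattern for illustration; not ThreatIngestor's own extractor.
# A dot may be written as ".", "[.]", or "(.)".
_DOT = r"(?:\[\.\]|\(\.\)|\.)"
_DEFANGED_IPV4 = re.compile(
    rf"(?<![\w.])\d{{1,3}}(?:{_DOT}\d{{1,3}}){{3}}(?!\w|\.\d)"
)


def extract_ipv4(text: str) -> list[str]:
    """Return refanged, validated IPv4 addresses in order of appearance, without duplicates."""
    found = []
    for match in _DEFANGED_IPV4.finditer(text):
        # Restore bracketed or parenthesised dots to plain dots.
        candidate = re.sub(r"\[\.\]|\(\.\)", ".", match.group(0))
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # not a valid address, e.g. an octet above 255
        if candidate not in found:
            found.append(candidate)
    return found


print(extract_ipv4("94[.]103[.]85[.]47 (94.103.85.47)"))
```

Both spellings on the sample line refer to the same address, so the helper reports it once.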

We have verified that extraction from the configured feed is historically functional:

$ grep "'rss-talos'" /opt/research/logs/threatingestor_rss.log* |grep Processing |grep -v '0 artifacts'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.145 | DEBUG    | threatingestor.sources:process_element:56 - Processing in source 'rss-talos'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.229 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'threatkb-yara'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.229 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'threatkb-c2'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.229 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'threatkb-task'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.229 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'url-processor'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.229 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'csv'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.230 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'sqlite'
/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.266 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'aurora'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.294 | DEBUG    | threatingestor.sources:process_element:56 - Processing in source 'rss-talos'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.315 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-yara'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.315 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-c2'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.315 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-task'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.315 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'url-processor'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.316 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'csv'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.316 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'sqlite'
/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.329 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'aurora'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.489 | DEBUG    | threatingestor.sources:process_element:56 - Processing in source 'rss-talos'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.509 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-yara'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.510 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-c2'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.510 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-task'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.510 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'url-processor'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.510 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'csv'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.510 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'sqlite'
/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.524 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'aurora'

However, the logs above show that the most recent extractions from this source were on 2024-10-09 and 2024-10-10. The post was published on 2024-10-22. No extraction has been logged for this source since publication, on either 2024-10-22 or 2024-10-23 (the day this issue was reported).
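
The same gap check can be scripted so it is repeatable across sources. The sketch below is a rough illustration, assuming the log line layout quoted in this issue; the `extraction_days` helper and the `_RUN_ONCE` pattern are hypothetical names, and the three sample lines are copied from the grep output above.

```python
import re
from collections import defaultdict
from datetime import date

# Pattern based on the log lines quoted in this issue; other deployments may
# use a different log format.
_RUN_ONCE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2}) \S+ \| \w+\s*\| threatingestor:run_once:\d+ - "
    r"Processing (\d+) artifacts from source '([^']+)' with operator '[^']+'"
)


def extraction_days(lines):
    """Map each source name to the sorted dates on which it yielded more than 0 artifacts."""
    days = defaultdict(set)
    for line in lines:
        m = _RUN_ONCE.search(line)
        if m and int(m.group(4)) > 0:
            year, month, day = (int(m.group(i)) for i in (1, 2, 3))
            days[m.group(5)].add(date(year, month, day))
    return {source: sorted(found) for source, found in days.items()}


sample = [
    "/opt/research/logs/threatingestor_rss.log.4:2024-10-10 19:39:15.229 | DEBUG    | threatingestor:run_once:139 - Processing 12 artifacts from source 'rss-talos' with operator 'threatkb-yara'",
    "/opt/research/logs/threatingestor_rss.log.5:2024-10-09 17:08:34.315 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-yara'",
    "/opt/research/logs/threatingestor_rss.log.5:2024-10-10 10:53:47.509 | DEBUG    | threatingestor:run_once:139 - Processing 1 artifacts from source 'rss-talos' with operator 'threatkb-yara'",
]

published = date(2024, 10, 22)
last_seen = extraction_days(sample)["rss-talos"][-1]
print(last_seen, last_seen < published)
```

Comparing the artifact count as an integer also avoids a pitfall of substring filters: `grep -v '0 artifacts'` would hide lines such as "Processing 10 artifacts" as well as the zero-artifact ones.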

Looking at the configured outputs, the rss-talos source is configured to output to ThreatKB (C2 Feed):

- name: threatkb-c2
  module: threatkb
  credentials: threatkb-auth
  artifact_types: [Domain, IPAddress]
  # yamllint disable-line
  allowed_sources: [twitter-list-inquest-ioc-feed, rss-paloaltonetworks, rss-talos, rss-securelist, rss-fireeye]
  use_https: true
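
As a sanity check on the routing, the sketch below tests whether an operator configured like this one would accept an IP address from the Talos source. It rests on an assumption that should be verified against the installed ThreatIngestor version: that each `allowed_sources` entry is treated as a regular expression matched from the start of the source name, and that an empty list means no filtering. The `accepts` helper is hypothetical.

```python
import re

# Operator settings copied from the configuration quoted in this issue.
THREATKB_C2 = {
    "name": "threatkb-c2",
    "artifact_types": ["Domain", "IPAddress"],
    "allowed_sources": [
        "twitter-list-inquest-ioc-feed",
        "rss-paloaltonetworks",
        "rss-talos",
        "rss-securelist",
        "rss-fireeye",
    ],
}


def accepts(operator: dict, source_name: str, artifact_type: str) -> bool:
    """Return True if the operator would take this artifact type from this source.

    Assumption: each allowed_sources entry is a regex matched from the start
    of the source name. An empty or missing list is treated as "allow all".
    """
    types = operator.get("artifact_types") or []
    sources = operator.get("allowed_sources") or []
    type_ok = not types or artifact_type in types
    source_ok = not sources or any(re.match(p, source_name) for p in sources)
    return type_ok and source_ok


print(accepts(THREATKB_C2, "rss-talos", "IPAddress"))
print(accepts(THREATKB_C2, "rss_talos", "IPAddress"))
```

The second call shows why the exact source name matters: an underscore in place of the hyphen would silently fail the filter under this assumption.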

However, when we checked ThreatKB for the target indicator (94.103.85.47), it was not found, so we added it manually.

We can also confirm that the indicator was not ingested from this source and routed to a different indicator store. When queried on 2024-10-23, the indicator had been ingested only from another source (Recorded Future) and routed to TIDB; it was not collected from the Talos feed.

dspruell-i01 added the blocking, bug, and threat-intel labels Oct 23, 2024

dspruell-i01 (Author) commented

Clarifying notes from external discussion:

The ThreatIngestor logs suggest that no extraction was performed for that source on or after the day the post was published. In other words, the logs point to extraction not running at all, rather than to indicators being extracted and then failing to be ingested.

@pedramamini asked this:

The IOCs are not IN the feed correct? threatingestor never went out to the site to fetch the content. this isn't a bug, but rather a feature need.

Response:

  • The IOCs are in the feed. At that URL, in the entry for the post shown in the screenshot I included in the issue, the full post content is contained in the <content:encoded> tag, and that includes the IOCs.
  • ThreatIngestor has pulled content from the feed URL on at least two days, as shown in the logs I included (see issue description), and it extracted indicators from the feed on both of those days. It's not clear to me why it isn't doing so consistently, many times per day, since the logs show that ThreatIngestor polls configured sources frequently (most of the time the logging shows 0 artifacts extracted, but it's definitely hitting or trying to hit the sites). The logs indicate that ThreatIngestor did not pull content from the feed URL on the day the post was published or on the day after. That seems likely to be the issue.
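
The claim that the IOCs sit inside `<content:encoded>` can be checked without a browser. The sketch below is a minimal illustration using only the standard library; `items_mentioning` is a hypothetical helper, and `SAMPLE_FEED` is a hand-trimmed stand-in for one feed item rather than the real feed, which would need to be fetched separately and passed in as a string.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module, in ElementTree's {uri}tag notation.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def items_mentioning(feed_xml: str, needle: str) -> list[str]:
    """Return the <link> of every RSS <item> whose <content:encoded> contains needle."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter("item"):
        body = item.findtext(f"{CONTENT_NS}encoded", default="")
        if needle in body:
            links.append(item.findtext("link", default=""))
    return links


# Hand-written stand-in for a single feed item (illustrative, not real feed output).
SAMPLE_FEED = (
    '<rss version="2.0" '
    'xmlns:content="http://purl.org/rss/1.0/modules/content/">'
    "<channel><item>"
    "<link>https://blog.talosintelligence.com/gophish-powerrat-dcrat/</link>"
    "<content:encoded><![CDATA[<p>94[.]103[.]85[.]47</p>]]></content:encoded>"
    "</item></channel></rss>"
)

print(items_mentioning(SAMPLE_FEED, "94[.]103[.]85[.]47"))
```

Searching for the defanged form matters here, because that is how the indicator is written in the post body.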

@pedramamini asked this:

How are you loading this site? https://feeds.feedburner.com/feedburner/Talos it has a bunk SSL cert.

Response:

  • feeds.feedburner.com:443 TLS seems fine for me in Firefox and Chromium on Linux, no cert warnings, TLS session establishes, content served cleanly.
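
Since browsers and scripts can use different trust stores, it may help to repeat the check from the host that runs ThreatIngestor. The sketch below is one way to do that with Python's standard library; `verifying_context` and `check_tls` are hypothetical helper names, and the result depends on the CA bundle available to Python on that machine. Note also that the configured feed `url` uses http://, while the URL quoted in the question uses https://.

```python
import socket
import ssl


def verifying_context() -> ssl.SSLContext:
    """Default client context: certificate chain and hostname are both verified."""
    return ssl.create_default_context()


def check_tls(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Complete a verified TLS handshake and return the peer certificate.

    Raises ssl.SSLCertVerificationError if the chain or hostname is invalid.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with verifying_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Example (requires network access):
# cert = check_tls("feeds.feedburner.com")
# print(cert.get("notAfter"))
```

If the handshake succeeds here but fails inside ThreatIngestor's environment, the difference is more likely in the local trust store or an intercepting proxy than in the server certificate itself.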
