Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add new namespace "rule.*" #903

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

trisch-me
Copy link
Contributor

Introducing new ECS namespace - Rule

Rule fields are used to capture the specifics of any observer or agent rules that generate alerts or other notable events.

Examples of data sources that would populate the rule fields include: network admission control platforms, network or host IDS/IPS, network firewalls, web application firewalls, url filters, endpoint detection and response (EDR) systems, etc.

Merge requirement checklist

@trisch-me trisch-me requested review from a team April 8, 2024 16:59
@cartersocha
Copy link

Is there any duplication between rule.reference and rule.description? If it's a custom rule I'm not sure what you would reference.

Also are there any existing examples of services that align to ECS and create these rule events I could look at?

@cartersocha
Copy link

Would the action of the rule be relevant to note as well?

@trisch-me
Copy link
Contributor Author

trisch-me commented Apr 17, 2024

@cartersocha rule.description is a description of rule and rule.reference is the link to the rule if it's available. For example if I make a custom rule I wouldn't have rule.reference available but description will provide information about what this rule does

@joaopgrassi
Copy link
Member

I think as we talked in the sig meeting, rule alone seems a very large/generic topic name to me. It can mean a lot of things. Do you plan to refine this more?

@trisch-me
Copy link
Contributor Author

@joaopgrassi I don't remember talking about changing the name. I was under impression we were talking that rule is so generic that we could use it also in another use cases, not only for security

@trisch-me
Copy link
Contributor Author

Would the action of the rule be relevant to note as well

@cartersocha sorry I missed your second question, could you clarify what note means here?

@trisch-me
Copy link
Contributor Author

trisch-me commented Apr 30, 2024

Also I'm not sure about open source usage of rule namespace but we use it internally in Elastic for example

@trisch-me
Copy link
Contributor Author

I have resolved conflicts for this one and it's ready for discussions/approvals

Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing the comments!
It looks good from the attributes perspective. I have a concern with word rule though. It's too broad.
The rule in this PR is focused on the security events(?) while there are many other kinds of rules:

  • Rules on filtering messages (e.g. here)
  • something random on Oracle cloud
  • something random on Azure
  • ...

So is there a more specific name we can give, like:

  • securty.rule
  • alerting.rule
  • ...

@alexvanboxel
Copy link

Looking at the progress of OTel Events, I'm wondering if Rules should not be put on hold. Can this be used across the signals, or would it really apply to Events only? If so, this could be a use case to drive requirements for events.

@trisch-me
Copy link
Contributor Author

thanks @lmolkova
I don't think it's related to the security space only, there is also nothing about it in the brief or name fields. Yes, it is used widely in security domain but it doesn't limit it to it.

@trisch-me
Copy link
Contributor Author

For visibility - we use it in 62 integrations alone not saying about internal usage

@trisch-me trisch-me removed the Stale label Jul 18, 2024
@trisch-me
Copy link
Contributor Author

We have discussed it multiple time during semconv meetings and came to conclusion for the time being add these fields under security sub-namespace. If we will encounter that some fields are generic enough they should be moved up to the generic rule.*.

@trisch-me
Copy link
Contributor Author

@lmolkova @joaopgrassi @open-telemetry/semconv-security-approvers please check it

.chloggen/rule_new.yaml Outdated Show resolved Hide resolved
@trisch-me
Copy link
Contributor Author

@lmolkova @joaopgrassi I have incorporated all discussed changes, ptal

.chloggen/rule_new.yaml Outdated Show resolved Hide resolved
brief: >
The description of the rule generating the event.
examples: ['Block requests to public DNS over HTTPS / TLS protocols']
- id: id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is necessary given that there is a uuid, name and other identifying properties?

if we don't know if it's necessary the best option is postpone adding until we know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those are different, we have discussed it here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand they are different, but what the use-case for having both of them? Would someone actually need all these different layers of ids? Or perhaps having uuid and name is enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in another comment, those are different types of IDs, uuid should be unique across deployments/different agents, but on specific system rule also gets an id, which is unique to that system only, for example saved object id

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned above, the question is whether both of them are necessary. Why do we need two identifiers (local and global)? What the use-case where it's important to have a local identifier in addition to a global one?

If there is no specific use-case where two identifiers are needed, let's not add both of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have talked about usage of these fields within Elastic and found out that previously we had technical behaviour of the system that would autogenerate and use rule.id for every rule, so we had to keep both. Now that limitation is lifted and I will add UUID as important field but leave id for now out

The URL can point to the vendor’s documentation about the rule.
If that’s not available, it can also be a link to a more general page describing this type of alert.
examples: ['https://en.wikipedia.org/wiki/DNS_over_TLS']
- id: ruleset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since ruleset can have multiple properties beyond name (e.g. id), it could make sense to call it security_rule.ruleset.name (or security_rule.set.name)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also is it common to have ruleset? Is it reasonable to assume that all rules in the set have the same version & license?
Would they be properties of the rule or ruleset the rule belongs to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we don't have a usecase for ruleset.id I don't see the reason to create it.

It is common to pack rules into sets, but this doesn't mean that rules are having the same version/license etc. Some rules might change at different points of time while still being a part of the same set.

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one major concern, which is I think we have a lack of domain-expert approvals (or domain experts) able to evaluate this PR, which I think is why it's been lingering. I outlined two paths forward, but let's talk about this in the next semconv meeting if no further progress is made before then.


# Security Rule

## Security Rule
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I'm a bit late here, I've been following this.

To some extent, it's unclear to me who will be generating this data, so if you can speak to how it's used in Elastic Common Schema and instances of logs that generate this data that could help.

Some major comments:

  • Rule vs Security Rule vs. Policy - it's hard to understand what the name/scope should be here, as it's very broad. E.g. the way this is phrased now, this could apply to OPA, which has a very clear definition of what is a policy and a rule (and I'd argue, a good foundation we can align with ECS as a way forward). I'm not in the details, but I think what you have aligns with rule there, so policy.rule may be a good namespace here, not that I want to litigate all that right now.
  • Namespacing and re-use within Semantic Conventions. I'm afraid we haven't resolved the discussion of embed sufficiently AND I feel like if we did, there'd be a LOT less contention on this PR.
  • The lack of an "Event", "span" or "Metric" defined for these makes it hard for general semconv maintainers (like myself) to understand what the eventual us case will be. However, given where Event stands in semconv, that would also block this PR so I know you want to make progress.
  • A couple nit-picky things, e.g. license being something that I don't understand yet (I'm not a domain expert here, so don't take this as a blocker, just something I think would need to be justified).

In the description, you call out several examples of who would use these attributes. Do you have example events that are produced by those things we could look at or point to? Until we unblock Events effectively and/or make progress on embed I think this could do a lot for "general" reviewers to be able to evaluate this.

TL;DR; I don't feel capable to approve this PR right now. I suggest one of two options:

  1. Help educate general semconv approvers on this topic, with more links, example instrumentation, etc.
  2. Bring in security experts (e.g. OPA folks, network policy folks, etc.) who can comment on this PR or approve it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try to address some of these, @trisch-me might know more or have more feedback

The usual security use case of these rule fields would be in alert event logs. A security product would run on endpoints monitoring/protecting systems by evaluating sets of rules loaded into the product. When a rule evaluates to true, the endpoint generates an alert event that includes information on what rule triggered the alert. SIEM products can also run detection rules on the data they have already ingested, which is an important use case, but less relevant to open telemetry, since the data is staying within the backend.

Examples of security rules:

The Falco documentation has some easy to understand documentation of what a security rule is: https://falco.org/docs/rules/basic-elements/

I think within security, the differences of rule vs. policy isn't always consistent or clear. Sometimes "rule" is only the detection condition, and sometimes the conditional and response action combined, so more like a OPA policy. Falco rules define the detection condition and response, by defining the priority and output content of the resulting alert. Elastic and Sigma rules only define the detection condition, and the response action is defined separately.

I'm unsure if it's common for OPA rules, or other non-security rules to be published in the same way security rules are, so I don't know if things like version, license, etc apply to them. If they do, I agree this namespace could be made more broad to cover other use cases too.

I could provide a raw alert event with ECS if you want, but it could be hard to understand and might not be too useful. Instead here's a screenshot of the alert UI, which is generated with the ECS event fields:
Screenshot 2024-08-20 at 2 57 13 PM
Screenshot 2024-08-20 at 2 59 51 PM

For the license, the rules are usually published as code, so the license is just the normal code copyright license that applies to it.

@lmolkova
Copy link
Contributor

#1335 and #1034 are proposing to define something similar in the scope of GenAI domain.

It would be awesome to come up with a common proposal and define some GenAI security events (not just attributes) that leverage security rules/policies for this domain. This would also serve as an example for security rule and help move it forward.

/cc @susan-shu-c and @open-telemetry/semconv-security-approvers

Copy link
Contributor

@mjwolf mjwolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM for the narrow "security-rule" namespace, it maybe does need more thought into how it can be used in other use cases

component: security-rule

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Introducing a new rule namespace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
note: Introducing a new rule namespace
note: Introducing a new security-rule namespace

@jsuereth
Copy link
Contributor

@trisch-me I think we need to talk about this one in the next semconv meeting. Also - can you update this to move the model file into the new directory structure? (now merged)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:security never stale PRs marked with this label will be never staled and automatically closed
Projects
Status: Needs more approvals
Status: Needs More Approval
Development

Successfully merging this pull request may close these issues.

9 participants