Support for field with multiple names / hard link aliases #109954
Comments
Another idea:
This has the advantage that the field types don't need to be changed (which isn't possible for an existing index). The aliases section could behave similar to the While we would expect that only one of the fields is present, there's the edge case where both
Pinging @elastic/es-search (Team:Search)
Pinging @elastic/es-search-foundations (Team:Search Foundations)
@felixbarny For the use case we discussed, I'm not sure about one semantic detail. By the existing otel aliasing logic, In your API example from above, what would this look like? I could imagine the following:
Then, when I search
Does this make sense?
Given some of the recent changes in OTel regarding body vs attributes for events (open-telemetry/semantic-conventions#1651), I think we should not map
@felixbarny would we still have the same problem with
Yeah, you're right.
With the move to ECS and semconv, which fields to use has become more and more standardised, and users are migrating to them. Part of the migration is using field aliases that point to the ECS field, so the data shipper does not have to be adjusted but queries against the data can be written in a standardised way. The challenge is that multiple shippers send data to the same data stream with different names for the same field.
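For reference, a field alias is declared in the index mapping with `type: alias` and a `path` that names its single target. The following is a minimal sketch in Python that only builds the mapping body as a dict and sends nothing to a cluster. The helper name `alias_mapping` and the `keyword` type for the target are assumptions for illustration; the field names are the ones used in the example in this issue.

```python
def alias_mapping(alias: str, target: str) -> dict:
    """Build a mapping body that declares `alias` as a field alias for `target`.

    Hypothetical helper. `target` is assumed to be a top-level field
    without dots, mapped as `keyword` purely for illustration.
    """
    properties: dict = {target: {"type": "keyword"}}
    node = properties
    *parents, leaf = alias.split(".")
    # A dotted alias name such as "host.name" becomes nested object properties.
    for part in parents:
        node = node.setdefault(part, {"properties": {}})["properties"]
    # An alias takes exactly one `path`; there is no list form.
    node[leaf] = {"type": "alias", "path": target}
    return {"mappings": {"properties": properties}}


mapping = alias_mapping("host.name", "host_name")
```

An alias accepts exactly one `path` and cannot be written to, which are the two limitations this issue is about.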
Let's go through an example where `host.name` exists in 3 fields:
- `host.name`: The recommended ECS field, used for querying and eventually shipping data
- `host_name`: The field that is used by some shippers
- `resource.attributes.host.name`: Field that comes from some otel data
The user wants to write queries against `host.name`. At first, an alias is created to point `host.name` to `host_name`.
Now the user can query on `host_name` and `host.name`. Unfortunately, `resource.host.name` is not included yet because an array for aliases is not supported.
But having an array for aliases is not enough. Eventually some of the shippers migrate to use `host.name`, which means data is sent to the alias itself.
This leads to the error `"reason": "[2:16] Cannot write to a field alias [host.name]."`. The ideal scenario would be that a single field could have multiple names or, to compare with the Linux file system, that hard links could be created. It would be possible to query and ingest into all field names, and the field would exist until the last reference is removed.
Having support for hard links would simplify the migration to ECS / semconv for users.
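To make the difference between today's aliases and the requested hard links concrete, here is a toy in-memory model in Python. It is not how Elasticsearch stores fields, and `FieldStore` with all of its methods is a hypothetical name invented for this sketch. It only illustrates the semantics described above: an alias is query-only, while a hard link is a second writable name for the same values that survives until the last name is removed.

```python
class FieldStore:
    """Toy model: several names may share one column of values."""

    def __init__(self):
        self._columns = {}     # column id -> {doc_id: value}
        self._hard_links = {}  # writable name -> column id
        self._aliases = {}     # query-only alias -> target name
        self._next_column = 0

    def add_field(self, name):
        self._hard_links[name] = self._next_column
        self._columns[self._next_column] = {}
        self._next_column += 1

    def add_alias(self, alias, target):
        # Models the current alias type: one target, reads only.
        self._aliases[alias] = target

    def add_hard_link(self, name, existing):
        # Models the proposal: another writable name for the same column.
        self._hard_links[name] = self._hard_links[existing]

    def remove_name(self, name):
        column = self._hard_links.pop(name)
        # The column is dropped only once no name refers to it any more.
        if column not in self._hard_links.values():
            del self._columns[column]

    def index(self, doc_id, doc):
        for name, value in doc.items():
            if name in self._aliases:
                raise ValueError(f"Cannot write to a field alias [{name}].")
            self._columns[self._hard_links[name]][doc_id] = value

    def search(self, name, value):
        name = self._aliases.get(name, name)
        column = self._columns[self._hard_links[name]]
        return sorted(d for d, v in column.items() if v == value)


# Alias: readable through both names, but writes to the alias fail.
aliased = FieldStore()
aliased.add_field("host_name")
aliased.add_alias("host.name", "host_name")
aliased.index(1, {"host_name": "web-1"})
# aliased.index(2, {"host.name": "web-1"}) would raise ValueError

# Hard links: every name can be written to and queried.
linked = FieldStore()
linked.add_field("host_name")
linked.add_hard_link("host.name", "host_name")
linked.add_hard_link("resource.attributes.host.name", "host_name")
linked.index(1, {"host_name": "web-1"})
linked.index(2, {"host.name": "web-1"})
linked.index(3, {"resource.attributes.host.name": "web-1"})
linked.remove_name("host_name")  # the data stays reachable via other names
```

In this model, searching `linked` for `web-1` under `host.name` still finds all three documents after `host_name` has been removed, which is the hard-link behaviour the issue asks for.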
Doing the standardisation in an ingest pipeline is not a solution, as the old fields still have to be queried. One workaround that is sometimes used is to duplicate the data into each field, but that is not simple and has a negative impact on storage.
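The duplication workaround can be sketched as follows. This is a minimal Python illustration, not an ingest pipeline definition: `duplicate_across_names` is a hypothetical helper, documents are treated as flat dicts with dotted keys for simplicity, and the list of names is taken from the example in this issue.

```python
NAMES = ("host.name", "host_name", "resource.attributes.host.name")


def duplicate_across_names(doc: dict, names=NAMES) -> dict:
    """Return a copy of `doc` with the host value present under every name.

    The first value found under any of `names` is copied to the names
    that are missing; values that are already present are left untouched.
    """
    present = [doc[n] for n in names if n in doc]
    out = dict(doc)
    if not present:
        return out
    for n in names:
        out.setdefault(n, present[0])
    return out


doc = {"host_name": "web-1", "message": "hi"}
expanded = duplicate_across_names(doc)
```

Every document now carries the same value three times, which is where the storage cost mentioned above comes from, and each new name added later means reprocessing or accepting gaps in older data.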
Implementation ideas
Below are two ideas on what this could look like, but I'm sure better solutions can be found.
Idea 1:
Idea 2:
Related links / discussions