[Lens] hide missing fields errors until the data view has some fields #151526

Open
drewdaemon opened this issue Feb 16, 2023 · 18 comments
Labels
enhancement New value added to drive a business result Feature:Lens impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. loe:large Large Level of Effort Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@drewdaemon
Contributor

drewdaemon commented Feb 16, 2023

Describe the feature:
When there is no data in a data view, we show a slew of missing field errors. Technically, this is correct. Since they don't have data, no fields are present, so they're all missing.

However, in this case we can provide more useful messaging that doesn't look as severe. We can be pretty sure that this is more of a no-data case than a broken visualization.

Describe a specific use case for the feature:
In #143673 we stopped blocking visualization render when there are missing fields. This improved the pre-ingest integration dashboard look by making the missing field errors less in-your-face.

However, we are still showing the user errors when they haven't done anything wrong. They're just waiting for data.

cc: @ruflin @MichaelMarcialis

@drewdaemon drewdaemon added loe:small Small Level of Effort Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens impact:medium Addressing this issue will have a medium level of impact on the quality/strength of our product. labels Feb 16, 2023
@elasticmachine
Contributor

Pinging @elastic/kibana-visualizations @elastic/kibana-visualizations-external (Team:Visualizations)

@drewdaemon
Contributor Author

Relates to #149836

@stratoula stratoula added the enhancement New value added to drive a business result label Feb 20, 2023
@ruflin
Member

ruflin commented Feb 20, 2023

I really like that we know which of the fields are missing. This is not only helpful to users of the visualisations but especially to the ones building the visualisations. It would be nice if we could show it much more subtly somehow.

@drewdaemon
Contributor Author

drewdaemon commented Feb 20, 2023

@ruflin thank you for the feedback.

This is not only helpful to users of the visualisations but especially to the ones building the visualisations.

Do the visualization authors ever build their visualizations without first ingesting some data? If not, this change shouldn't impact their experience.

This issue is about hiding the missing fields list before ingest (i.e. when a user has just installed integration assets but is waiting on data). Before ingest, all fields are counted missing through no fault of the user, so we're considering treating it as a no-data state.

As soon as the first document is indexed, we would show the missing field errors (if there are any) as we do today.

WDYT?

@ruflin
Member

ruflin commented Feb 20, 2023

Do the visualization authors ever build their visualizations without first ingesting some data? If not, this change shouldn't impact their experience.

I'm more thinking about the debugging use case. Vis built, exported, someone else continues to work on it and for some reason, it doesn't show results. Maybe ingestion field changed, ingest pipeline was modified and had unexpected side effects, ...

As soon as the first document is indexed, we would show the missing field errors (if there are any) as we do today.

There are quite a few datasets where certain fields only show up once the relevant event has happened for the first time. Let's take nginx logs as an example. As long as no request came in containing a source.address, the field will not be populated. Or in the Elasticsearch case, as long as there is no exception happening in the log file, I think a few fewer fields are populated. I guess the same is true for Kibana logs. And this can also happen in the metric case, as some metrics are not populated at first (or never, if something never happens).

I hope this answers the question indirectly.

@drewdaemon
Contributor Author

Vis built, exported, someone else continues to work on it and for some reason, it doesn't show results. Maybe ingestion field changed, ingest pipeline was modified and had unexpected side effects, ...

I think this might still be covered here because the suggested behavior is

  • if there are no fields, show a non-error message saying there is no data
  • if there are some fields, but some are missing, show errors for the missing fields like we do now

So in either case, the viewer would be warned that something is wrong/changed.
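
The two-branch behavior suggested above can be sketched roughly as follows. This is only an illustration, not Lens code: `DataViewLike`, `FieldCheckResult`, and `classifyMissingFields` are hypothetical names, and the sketch assumes the data view can be reduced to a plain list of field names.

```typescript
// Hypothetical, simplified stand-in for a data view: just its field names.
interface DataViewLike {
  fieldNames: string[];
}

type FieldCheckResult =
  | { kind: "ok" }
  | { kind: "no-data"; message: string }
  | { kind: "missing-fields"; missing: string[] };

// Decide what to show for a visualization that depends on `requiredFields`.
function classifyMissingFields(
  dataView: DataViewLike,
  requiredFields: string[]
): FieldCheckResult {
  // No fields at all: most likely nothing has been ingested yet,
  // so report a non-error "no data" state.
  if (dataView.fieldNames.length === 0) {
    return { kind: "no-data", message: "No data has been ingested yet." };
  }
  const missing = requiredFields.filter(
    (name) => dataView.fieldNames.indexOf(name) === -1
  );
  // Some fields exist but required ones are absent: keep the current errors.
  if (missing.length > 0) {
    return { kind: "missing-fields", missing };
  }
  return { kind: "ok" };
}
```

With an empty `fieldNames` list the result is the `no-data` variant regardless of which fields the visualization needs; otherwise only the genuinely absent fields are reported.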

There are quite a few datasets where certain fields only show up once the relevant event has happened for the first time.

Hmmm, this is an important point. You're saying that not all fields are necessarily included in the first document that is indexed, right?

I think the key question is: does the field caps API report these fields even though none of the documents contain values for them?

If yes — the data view will still contain these fields so Lens will not detect them as missing ✅.
If no — the data view will not, and missing field errors would start showing up as soon as the first document is indexed, even though having those fields be missing isn't necessarily a problem which, as you're saying, calls into question the "error" classification.

Since the indexes are created from index templates which contain all the fields in their mappings, I'm thinking the answer is "yes." But, I should validate this.
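
To make the question concrete: the field caps API (`GET <target>/_field_caps?fields=*`) returns an `indices` array and a `fields` object keyed by field name and then by type. A rough sketch of deriving "missing" fields from such a response might look like this. `FieldCapsResponseLike` is a trimmed-down, hypothetical type (the real response carries more properties), `findMissingFields` is an illustrative helper rather than an existing Kibana function, and the sample index and field names are made up.

```typescript
// Trimmed-down, hypothetical view of a field caps response.
interface FieldCapsResponseLike {
  indices: string[];
  fields: Record<
    string,
    Record<string, { type: string; searchable: boolean; aggregatable: boolean }>
  >;
}

// Return the required fields that the response does not report at all.
function findMissingFields(
  response: FieldCapsResponseLike,
  required: string[]
): string[] {
  return required.filter(
    (name) => !Object.prototype.hasOwnProperty.call(response.fields, name)
  );
}

// Made-up sample: `source.address` is mapped via the template, so it is
// reported even if no document contains a value for it yet.
const sampleResponse: FieldCapsResponseLike = {
  indices: ["logs-nginx.access-default"],
  fields: {
    "@timestamp": {
      date: { type: "date", searchable: true, aggregatable: true },
    },
    "source.address": {
      keyword: { type: "keyword", searchable: true, aggregatable: true },
    },
  },
};

const missingInSample = findMissingFields(sampleResponse, [
  "@timestamp",
  "source.address",
  "url.path",
]);
```

In this sample only `url.path` ends up in `missingInSample`, because it is the only required field that is absent from the `fields` object.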

@ruflin
Member

ruflin commented Feb 20, 2023

Hmmm, this is an important point. You're saying that not all fields are necessarily included in the first document that is indexed, right?

Correct.

I think the key question is: does the field caps API report these fields even though none of the documents contain values for them?

The answer is currently yes. But it is something I have been challenging for some time, and especially for ECS fields (where we load too many) we are moving to dynamic templates so that this doesn't happen. Ideally the field caps API would tell you what fields are actually used (have content) and what is only template (unfortunately in our case the mapping too).

@drewdaemon
Contributor Author

I think the key question is: does the field caps API report these fields even though none of the documents contain values for them?

The answer is currently yes. But it is something I have been challenging for some time, and especially for ECS fields (where we load too many) we are moving to dynamic templates so that this doesn't happen.

So, it seems to me like the change described in this issue does improve the current situation (?), but may be disrupted by this change. Do you have an idea on timeline?

Ideally the field caps API would tell you what fields are actually used (have content) and what is only template (unfortunately in our case the mapping too).

I think this would be nice because it would allow us to discriminate between "hard" missing and "soft" missing cases in the UI. This change would certainly mean (modest?) changes for the data view system (cc: @mattkime ).

@mattkime
Contributor

I think the key question is: does the field caps API report these fields even though none of the documents contain values for them?

Yes, there are two similar questions - "What are all the fields?" and "What are all the fields for a given filter?" Kibana as a whole could be a bit more sophisticated as to how it treats these two cases although some apps (lens and discover, I think) do a pretty good job. IMO fields with data in the current context should be prominent but all fields should be available in some manner.

@drewdaemon Perhaps we should talk to someone on the ES team who knows how field caps determines if a given field has values. I know this functionality is there but I'd like to understand it in better detail.

@ruflin
Member

ruflin commented Feb 22, 2023

I think this is related to / overlaps with #24709

@flash1293
Contributor

My take on this:

The way we set up the "managed" data collection (via integrations/fleet/elastic-agent), this case is common and will get more common, which makes this issue increasingly important. Fields will not show up in the mapping until data has actually been sent by the shippers, but it's not a binary question: each field can show up or not show up at any point in time (depending on the configuration of the shipper, some fields might never show up because certain parts of the data aren't collected).

As fields are defined in a very dynamic manner by their suffix (e.g. *.ip is mapped as ip), it's not very meaningful to look for "hypothetical" fields that could exist in the mapping given the right data, as almost everything could and it's not a guarantee we will ever see data for it.

On the flip side, the case where a field missing in the mapping is a serious issue that needs admin attention might be true in some situations, but is not how most setups are configured.

Things we could do:

  • Make this a special case for managed dashboards - in case of managed dashboards, let the integration author decide how they want to treat missing field errors (e.g. hide them)
  • Make this a special case for the logs-* and metrics-* data streams as the mappings in these are set up in this dynamic manner by default
  • Make this a special case by the dynamic configuration of the indices we are hitting - if new fields can be added by incoming data, treat missing fields as missing data (no results), if the mapping is set to strict, treat missing fields as config error (current behavior)
  • Make this the default for all cases (like some competitors do)
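
Options 2 and 3 from the list above could be combined along these lines. This is a sketch under assumptions, not a proposal for the actual implementation: all function and type names are hypothetical, the `dynamic` values are the ones accepted by the Elasticsearch mapping `dynamic` parameter, and treating `dynamic: false` like `strict` is an assumption made here (new fields are not added to the mapping in either case).

```typescript
type MissingFieldTreatment = "no-data" | "config-error";

// Values accepted by the Elasticsearch mapping `dynamic` parameter.
type DynamicSetting = true | false | "strict" | "runtime";

// Option 2: special-case the default logs-* and metrics-* data streams.
function isDefaultDynamicStream(indexPattern: string): boolean {
  return /^(logs|metrics)-/.test(indexPattern);
}

// Option 3: decide from the mapping's `dynamic` setting.
function treatmentFromDynamicSetting(
  dynamic: DynamicSetting
): MissingFieldTreatment {
  // With `true` or "runtime", incoming data can add new fields, so a missing
  // field is read as "no data yet". Assumption: `false` is handled like
  // "strict", since new fields never reach the mapping.
  return dynamic === true || dynamic === "runtime" ? "no-data" : "config-error";
}

// Quick win first (option 2), then the more precise check (option 3).
function chooseTreatment(
  indexPattern: string,
  dynamic: DynamicSetting
): MissingFieldTreatment {
  if (isDefaultDynamicStream(indexPattern)) {
    return "no-data";
  }
  return treatmentFromDynamicSetting(dynamic);
}
```

Keeping the two checks as separate functions means option 2 could ship on its own and option 3 could be layered on later without changing callers of `chooseTreatment`.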

What do you think, @markov00 @ruflin ?

@ruflin
Member

ruflin commented Mar 28, 2024

I like options 1 and 2, with a preference for 1. For the o11y and security use cases, I don't see a problem with 4, but it's a breaking change and I don't think it is worth going through. Potentially 4 could be done on a project level. I think of it as a hierarchy of settings: Project < Dashboard < Visualisation. Dashboard seems to be a good middle ground to start.
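
One way to read the "Project < Dashboard < Visualisation" hierarchy is that the most specific level that defines a setting wins. A minimal sketch of that resolution follows, assuming that interpretation; `MissingFieldMode`, `MissingFieldSettings`, and `resolveMissingFieldMode` are hypothetical names, not existing Kibana settings.

```typescript
type MissingFieldMode = "show-errors" | "hide-errors";

// Each level may or may not define the setting.
interface MissingFieldSettings {
  project?: MissingFieldMode;
  dashboard?: MissingFieldMode;
  visualization?: MissingFieldMode;
}

// The most specific defined level wins; otherwise fall back to the default.
function resolveMissingFieldMode(
  settings: MissingFieldSettings,
  fallback: MissingFieldMode = "show-errors"
): MissingFieldMode {
  return (
    settings.visualization ??
    settings.dashboard ??
    settings.project ??
    fallback
  );
}

// A dashboard-level choice overrides the project-level one.
const resolvedForDashboard = resolveMissingFieldMode({
  project: "show-errors",
  dashboard: "hide-errors",
});
```

Starting at the dashboard level, as suggested, would simply mean that only the `dashboard` key is ever populated at first.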

@flash1293
Contributor

An advantage of 2 is that it avoids another bit of explicit configuration that needs to be managed.

@dej611
Contributor

dej611 commented Mar 28, 2024

I like option 3 better than the others, but it might take a bit to get it up and running.
Maybe option 2 can be a quick win, with the option to extend it to option 3 in the medium/longer term.

@markov00
Member

I agree with @dej611: option 3 can probably catch the issue at the source (dynamic mappings mean that some fields can be missing), but 2 is a quick win without adding more complexity to the lens/dashboard configuration.

@markov00 markov00 self-assigned this May 28, 2024
markov00 added a commit that referenced this issue Jun 5, 2024
This PR cleans up our user messages a bit. In particular:
- marks as required the `uniqueId` for a `UserMessage` making it
uniquely identifiable across various message renderers (moves toward
this objective: #151526)
- adds a unique ID for each UserMessage in Lens
- partially unifies the Error structure between UserMessages and
form-based validation.
- Opens the door to provide metadata about the errors, that are
currently buried within the text/react node (moves toward this
objective: #151526)

Subsequent possible steps, outside this PR:
- add metadata where required for
#151526
- merge even more the type of `UserMessage`,
`FieldBasedOperationErrorMessage` and `ValidationErrors`
- centralize the messages id and translations
- add a Type for the `uniqueId` to guide future developments
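
To illustrate why a required `uniqueId` plus structured metadata helps with this issue, here is a rough sketch of downgrading missing-field errors when the data view has no fields at all. `SketchUserMessage` is a hypothetical, trimmed-down shape and not the real Lens `UserMessage` type; the `meta.missingFields` property and `downgradeMissingFieldErrors` are assumptions made purely for illustration.

```typescript
// Hypothetical, trimmed-down message shape; the real Lens type may differ.
interface SketchUserMessage {
  uniqueId: string;
  severity: "error" | "warning" | "info";
  text: string;
  // Illustrative metadata of the kind the commit message hints at.
  meta?: { missingFields?: string[] };
}

// Turn missing-field errors into informational messages when the data view
// has no fields yet; leave every other message untouched.
function downgradeMissingFieldErrors(
  messages: SketchUserMessage[],
  dataViewHasFields: boolean
): SketchUserMessage[] {
  return messages.map((message): SketchUserMessage => {
    const isMissingFieldMessage =
      message.meta !== undefined &&
      message.meta.missingFields !== undefined &&
      message.meta.missingFields.length > 0;
    if (
      isMissingFieldMessage &&
      !dataViewHasFields &&
      message.severity === "error"
    ) {
      const downgraded: SketchUserMessage = { ...message, severity: "info" };
      return downgraded;
    }
    return message;
  });
}
```

Because the decision relies on metadata rather than on parsing the message text, the same logic would not depend on how a particular renderer words the error.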
@markov00
Member

markov00 commented Oct 3, 2024

Speaking with @teresaalvarezsoler we also highlighted that this problem could be even worse when we need to deal with ES|QL driven panels. The messages are completely different and we need to find a general way of handling this behavior the same way independently from the query language used.
Image

@markov00 markov00 added loe:large Large Level of Effort and removed loe:small Small Level of Effort labels Oct 3, 2024
@markov00
Member

Even if we improve the way UserMessages are created and displayed, and even if we override the errors with better messaging, we are still not solving what seems to be the root problem: when an integration asset is installed, the index mappings are not updated to contain the required fields, and we rely on dynamic mappings instead.
If we instead update the mappings, we would probably fix this problem once and for all. Am I missing something here?

@drewdaemon
Contributor Author

drewdaemon commented Oct 30, 2024

Someone correct me if I'm missing the mark, but perhaps I can provide/restate some context since I created the issue and spent time talking to the integrations folks.

On "fixing" the issue on the integrations side

I think the crux is that Kibana has often treated missing fields and indices as unintended problems to be fixed, whereas many folks in the solution teams (integrations, and others), Fleet, and Elasticsearch see them as generally expected (something to be planned for, not a problem to be fixed).

In my opinion, this misalignment of expectations is at the heart of this issue and several others like it.

The belief by non-Kibana folks that missing fields and indices are ok and expected has led to many scenarios where fields and indices can be missing. Several relate to the way integration data is managed, but I have seen this crop up outside of integrations as well.

Generally, Kibana (platform team) has made these scenarios look like something is broken. But again, many folks don't see these scenarios as errors, but rather something to be planned for.

I believe this is what leads them to advocate for "less scary" messages. I don't think their expectation is that Kibana pull data out of thin air to render visualizations that can't exist. It's rather to stop scaring users with big red icons and toast storms, and dire warnings when fields and indices are missing (someone correct me if I have misunderstood 😆 )

As far as integration data goes—

Marco, last time I checked, they didn't technically use dynamic mapping. It's something that has been discussed and advocated, but I don't think this is where the missing fields come from. But it would be a similar scenario if they did.

AFAIK, the integrations have the following strategy when it comes to data management.

  • Installing an integration is separate from ingesting integration data
  • Installing an integration installs some set of index templates that match up with data that could possibly be ingested, as well as dashboards and visualizations built on index patterns that match a set of (as yet) non-existent indices.
  • Ingesting data from the datasource "activates" the index templates that correspond to the data, creating indices. However, some index templates may never be activated, or may take an indefinite amount of time to be activated. For example
    • An AWS integration may have index templates for DynamoDB and billing data. But if the AWS account doesn't use DynamoDB, the corresponding index template will never be activated because no DynamoDB data will be ingested.
    • A security integration may not contain data until some security incident happens, which may or may not ever happen.

Fields are only reported by the field_caps API when they exist in indices matching the index pattern. This is where the missing integration fields come in

  • The integration has been installed (dashboards, etc.) but no data have been ingested for any template that matches the index pattern, so no indices will be present. This is generally a temporary state, but, again, not necessarily temporary. In this case, no fields will be returned from field_caps. This is the scenario I was originally trying to address with this issue (not saying it is still valid).
  • Data have been ingested for an integration, but the configuration of the data source is such that certain index templates haven't been or will never be activated. So, fields that certain visualizations depend on are missing.
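
The install/ingest flow and the two resulting scenarios can be modeled with a small in-memory sketch. This is a heavily simplified, hypothetical model (one index per template, prefix matching instead of real index patterns), not how Elasticsearch or Fleet are implemented; `ClusterModel`, the index names, and the field names are all made up for illustration.

```typescript
// Hypothetical model: a template knows which index it creates and its fields.
interface IndexTemplate {
  indexName: string;
  fields: string[];
}

class ClusterModel {
  private templates: IndexTemplate[] = [];
  private activeIndices: IndexTemplate[] = [];

  // Installing an integration only registers templates; no index exists yet.
  installTemplate(template: IndexTemplate): void {
    this.templates.push(template);
  }

  // Ingesting data "activates" the matching template and creates the index.
  ingestInto(indexName: string): void {
    const template = this.templates.filter((t) => t.indexName === indexName)[0];
    const alreadyActive = this.activeIndices.some(
      (index) => index.indexName === indexName
    );
    if (template !== undefined && !alreadyActive) {
      this.activeIndices.push(template);
    }
  }

  // Like field_caps in this discussion: only fields of existing indices
  // whose name starts with `prefix` are reported.
  reportedFields(prefix: string): string[] {
    const result: string[] = [];
    for (const index of this.activeIndices) {
      if (index.indexName.indexOf(prefix) !== 0) {
        continue;
      }
      for (const field of index.fields) {
        if (result.indexOf(field) === -1) {
          result.push(field);
        }
      }
    }
    return result;
  }
}

const cluster = new ClusterModel();
cluster.installTemplate({
  indexName: "metrics-aws.dynamodb",
  fields: ["@timestamp", "aws.dynamodb.read_capacity"],
});
cluster.installTemplate({
  indexName: "metrics-aws.billing",
  fields: ["@timestamp", "aws.billing.cost"],
});

// Scenario 1: installed, nothing ingested, so no fields are reported.
const fieldsBeforeIngest = cluster.reportedFields("metrics-aws");

// Scenario 2: only billing data arrives; the DynamoDB fields stay missing.
cluster.ingestInto("metrics-aws.billing");
const fieldsAfterBillingIngest = cluster.reportedFields("metrics-aws");
```

In this model `fieldsBeforeIngest` is empty, while `fieldsAfterBillingIngest` contains only the billing template's fields, so any visualization built on the DynamoDB field would still see it as missing.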

I am not the expert on why this was set up this way. But I have assumed that there is some technical cost to creating fields that don't need to exist. I have also been told that customers simply disable certain metrics that are not interesting to them to manage their own costs (as in money).

ES|QL

I believe the integration folks would ask the same here as they do with other missing data scenarios: don't make the messaging scary. Informing the user is ok, but don't give the impression that something is necessarily broken just because the query reported a missing field (they don't have the data to render a visualization).

Managed content

One thing that has changed since I opened this issue is that Fleet now marks all integration assets with managed: true (root level saved object property added in #154515). This opens the door on our end if we wanted to compromise and give one behavior for integration assets, and another for user-created.
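
As a rough sketch of that compromise, the severity of a missing-field message could be derived from the `managed` flag. Only the root-level `managed` property comes from the discussion above; `SavedObjectLike`, `MissingFieldSeverity`, and `missingFieldSeverity` are hypothetical names used for illustration.

```typescript
// Hypothetical, minimal view of a saved object with the root-level flag.
interface SavedObjectLike {
  id: string;
  managed?: boolean;
}

type MissingFieldSeverity = "info" | "error";

// Integration (managed) assets get a softer message; user-created content
// keeps the current error behavior. An absent flag counts as user-created.
function missingFieldSeverity(
  savedObject: SavedObjectLike
): MissingFieldSeverity {
  return savedObject.managed === true ? "info" : "error";
}
```

Treating an absent `managed` flag as user-created keeps today's behavior as the default for everything that Fleet has not explicitly marked.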

Sorry for the novel, but I thought this was as good a place as any to get my thoughts down 🙇

8 participants