Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEN-1166 - Improve Ingestion Workflow Error Summary #18280

Merged
merged 4 commits into from
Oct 16, 2024

Conversation

pmbrull
Copy link
Collaborator

@pmbrull pmbrull commented Oct 15, 2024

Describe your changes:

Adding success % to the status list

[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Workflow Mysql Summary:
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Processed records: 91
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Updated records: 0
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Warnings: 0
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Errors: 0
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Success %: 100.0
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Workflow OpenMetadata Summary:
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Processed records: 64
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Updated records: 0
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Warnings: 0
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Errors: 27
[2024-10-15 15:28:50] INFO     {metadata.Utils:logger:181} - Success %: 70.33

and unifying how we compute success %.

Also, final % is based as the mean of the success of all steps. Before, we only gave "wiggle room" for failures on the source, but not on any other step. Now, the SUCCESS_THRESHOLD is applied to all steps, and final success % is based on all the steps.

Finally, we are allowing to configure the success % in the workflowConfig instead of having it as a constant in the code.

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@pmbrull pmbrull requested a review from a team as a code owner October 15, 2024 13:59
Copy link

### Python SDK
The `metadata insight` command has been removed. Since Data Insights application was moved to be an internal system application instead of relying on external pipelines the SDK command to run the pipeline was removed.
Now, we're changing this behavior to consider the success rate of all the steps involved in the workflow. The UI will
then show more `Partial Success` statuses rather than `Failed`, properly reflecting the real state of the workflow.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont understand how this impacts me as a user. Will I have more failing ingestions? Less? We can take an example ingestion and show how it would reflect differently now compared to before this change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mhmhmh it was the goal of this phrasing

The UI will then show more `Partial Success` statuses rather than `Failed`

-> less Failed, more Partial Success.

Basically wanted to give a heads up that if they had a pipeline that before showed as failed and now it showed something different, it was expected 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved, as a side quest I suggest we clarify definition of "partial success" or do away with it. From my experience ETL processes have a status of success / failure so the user knows if they need to inverine or not. I have suspicions that "partial success" can confuse platform users of whether they need to do something about their pipelines.

Examples of this design can be seen in:


self._print_failures_if_apply(failures)

total_success = max(total_records, 1)
# If nothing is processed, we'll have a success of 100%
success_pct = mean([step.get_status().calculate_success() for step in steps])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will fail with this error if steps is empty.

statistics.StatisticsError: mean requires at least one data point

Not sure if we currently have such a use-case but worth handling this just in case and ask the users to check config or send them to slack.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not happen since we pass the steps by hand but fair enough, will handle that

@pmbrull pmbrull requested a review from sushi30 October 16, 2024 14:08
Copy link

sonarcloud bot commented Oct 16, 2024

@pmbrull pmbrull merged commit 7012e73 into open-metadata:main Oct 16, 2024
24 of 26 checks passed
harshach pushed a commit that referenced this pull request Oct 17, 2024
* GEN-1166 - Improve Ingestion Workflow Error Summary

* fix test

* docs

* comments
harshach pushed a commit that referenced this pull request Oct 19, 2024
* GEN-1166 - Improve Ingestion Workflow Error Summary

* fix test

* docs

* comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ingestion safe to test Add this label to run secure Github workflows on PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants