refactor: Remove early row count and update batch export metrics #22810
posthog/api/app_metrics.py
count = fetch_batch_export_run_count(
    team_id=run.batch_export.team_id,
    data_interval_start=run.data_interval_start,
    data_interval_end=run.data_interval_end,
)
IMO a metric based on number of events should never exist. We should instead look at count of runs only.
OK, let's instead look at the time ranges exported, which is easy to do over runs. We can make all the graphs say "exports succeeded" / "exports failed" (because it's the latest run only, not all runs) instead of "events sent" / "events failed" in the UI, and we can remove this.
),
failures=Sum(
    Coalesce("records_total_count", 0), filter=~Q(status=BatchExportRun.Status.COMPLETED), default=0
runs = ( |
Wouldn't this get all the runs, not just the latest for that date range? We only want the latest: if something failed but was retried, we'd only want to show success in the sparkgraph and metrics (because a user ultimately doesn't care whether it was retried as long as it was exported, same as webhooks).
What if nothing failed but it was retried (manually, by the user) anyway? I think it makes sense to show duplicates if the user requests duplicates. So, we need all runs.
Moreover, no new runs are automatically created in the event of failure, unless manually requested by us or users. Retries are part of a run, not a separate run.
👍 Would you mind changing the labels in the UI too with this PR ('events' -> 'runs') in that case?
Sure, latest commit changed this to runs. I'll make the UI change next.
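For illustration, a minimal sketch of the run-based counting agreed on in this thread, written in plain Python rather than the Django ORM. The `Run` dataclass, the `count_runs` helper, and the `"Completed"` status string are hypothetical stand-ins and not PostHog code; the only detail taken from the diff above is that a run counts as a failure when its status is not `COMPLETED`.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime

# Assumed string value for the completed status; the real enum may differ.
COMPLETED = "Completed"


@dataclass(frozen=True)
class Run:
    """Hypothetical stand-in for one batch export run."""

    data_interval_start: datetime
    data_interval_end: datetime
    status: str


def count_runs(runs):
    """Count every run once: COMPLETED is a success, any other status a failure.

    All runs are counted, including manual re-runs of the same interval,
    so a user who requests a duplicate export sees it in the totals.
    """
    counts = Counter(
        "successes" if run.status == COMPLETED else "failures" for run in runs
    )
    return {"successes": counts["successes"], "failures": counts["failures"]}
```

With two completed runs and one failed run over the same interval, `count_runs` reports two successes and one failure: the manual re-run is counted as a second success rather than being collapsed into the first.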
📸 UI snapshots have been updated. 1 snapshot change in total: 0 added, 1 modified, 0 deleted.
Triggered by this commit.
Size Change: 0 B. Total Size: 1.06 MB.
Problem
Counting rows is quite expensive. Let's not do it.
Changes
`insert_*` batch export activities: If we peek into the record generator and we don't get anything, then we return early.
`records_total_count`
but now queries ClickHouse to get a count of rows. This could potentially be very expensive.
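The peek-and-return-early behaviour described for the `insert_*` activities can be sketched roughly as follows. This is an assumption-laden illustration, not the code in this PR: `peek_first` and `insert_records` are hypothetical names, and the "export" is reduced to counting records.

```python
import itertools

# Sentinel so that a legitimate record equal to None is not mistaken for "empty".
_EMPTY = object()


def peek_first(records):
    """Look at the first record without losing it.

    Returns (first, iterator). If the source is empty, first is _EMPTY;
    otherwise the returned iterator yields the first record again,
    followed by the rest.
    """
    iterator = iter(records)
    first = next(iterator, _EMPTY)
    if first is _EMPTY:
        return _EMPTY, iterator
    return first, itertools.chain([first], iterator)


def insert_records(records):
    """Hypothetical insert activity: returns how many records were exported."""
    first, records = peek_first(records)
    if first is _EMPTY:
        # Nothing to export: return early instead of running a row count query.
        return 0
    exported = 0
    for _record in records:
        exported += 1  # a real activity would write the record to the destination
    return exported
```

Peeking consumes at most one record from the generator, so the emptiness check costs no extra query, which is the point of dropping the up-front row count.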
Does this work well for both Cloud and self-hosted?
How did you test this code?
Updated a bunch of tests + added new tests that run workflows without events.