Handle Test/Flag inserts in the DB #921
Codecov Report
All modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found.
Additional details and impacted files
@@ Coverage Diff @@
## main #921 +/- ##
==========================================
- Coverage 97.98% 97.98% -0.01%
==========================================
Files 446 446
Lines 35656 35652 -4
==========================================
- Hits 34936 34932 -4
Misses 720 720
Flags with carried forward coverage won't be shown.
I'm cool with getting this in and measuring how it improves the perf. I do think the actual long-term solution is the one that you mentioned previously: instead of writing to the DB in each processor task, we should write some intermediate result (to Redis?), then merge those results in the finisher and write to the DB there.
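To make the suggestion above concrete, here is a rough sketch of the "intermediate results, merge in the finisher" shape. Everything in it is hypothetical: the function names `processor_task` and `finisher_task`, the `upload_id` key, and the in-memory dict that stands in for Redis are assumptions for illustration, not code from this repo.

```python
from collections import defaultdict

# In-memory stand-in for an intermediate store such as Redis (assumption).
intermediate_store = defaultdict(list)


def processor_task(upload_id, test_flag_pairs):
    """Hypothetical processor: stash its result instead of writing to the DB."""
    intermediate_store[upload_id].append(list(test_flag_pairs))


def finisher_task(upload_id):
    """Hypothetical finisher: merge all stashed chunks and deduplicate them.

    Returns the rows that would go into a single bulk DB write.
    """
    merged = set()
    for chunk in intermediate_store.pop(upload_id, []):
        merged.update(chunk)
    return sorted(merged)


# Two processors report overlapping (test_id, flag_id) pairs for one upload.
processor_task("upload-1", [("test-a", 1), ("test-b", 1)])
processor_task("upload-1", [("test-a", 1), ("test-a", 2)])

rows_to_write = finisher_task("upload-1")
```

The point of this shape is that the DB sees one write per upload instead of one per processor task. Whether that actually beats the eager insert would still need measuring.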
This eagerly inserts all the test / flag combinations, and uses an `on_conflict_do_nothing` clause to avoid duplicates.

Previously, this would resolve/fetch *all* the tests that have *any* repo flag, and deduplicate those on the Python side. That code was very slow, as the underlying query was extremely slow for unknown reasons. I also believe the code was potentially buggy, as it queried for *all* the tests with repo flags, without filtering for the specific flags we were interested in. It was probably fine in practice, though, as the flags are part of the deterministic `test_id`.

In addition, this queries for the `flag_id` directly, turning the result into a `set`, instead of using a `map` and a lookup for each test.
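As a rough illustration of the pattern described above (not the code in this PR), the sketch below uses the standard-library `sqlite3` module and a raw `ON CONFLICT ... DO NOTHING` clause as a stand-in for the ORM-level `on_conflict_do_nothing`. The table names `repo_flag` and `test_flag_bridge`, their columns, and both helper functions are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a flags table and a test/flag bridge table with a
# unique constraint, so the database can reject duplicate pairs.
conn.execute("CREATE TABLE repo_flag (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE test_flag_bridge ("
    "test_id TEXT NOT NULL, flag_id INTEGER NOT NULL, "
    "UNIQUE (test_id, flag_id))"
)
conn.executemany(
    "INSERT INTO repo_flag (name) VALUES (?)",
    [("unit",), ("integration",), ("e2e",)],
)


def fetch_flag_ids(conn, flag_names):
    """Query only the flags we care about and return their ids as a set."""
    if not flag_names:
        return set()
    placeholders = ", ".join("?" for _ in flag_names)
    rows = conn.execute(
        f"SELECT id FROM repo_flag WHERE name IN ({placeholders})",
        list(flag_names),
    )
    return {row[0] for row in rows}


def insert_test_flags(conn, test_ids, flag_ids):
    """Eagerly insert every test/flag pair; the DB skips existing ones."""
    pairs = [(t, f) for t in test_ids for f in sorted(flag_ids)]
    conn.executemany(
        "INSERT INTO test_flag_bridge (test_id, flag_id) VALUES (?, ?) "
        "ON CONFLICT (test_id, flag_id) DO NOTHING",
        pairs,
    )


flag_ids = fetch_flag_ids(conn, ["unit", "integration"])
insert_test_flags(conn, ["test-a", "test-b"], flag_ids)
# Re-running the same insert is a no-op thanks to the conflict clause.
insert_test_flags(conn, ["test-a", "test-b"], flag_ids)

row_count = conn.execute("SELECT COUNT(*) FROM test_flag_bridge").fetchone()[0]
```

With this shape, no existing rows need to be fetched and deduplicated in Python: the unique constraint does that work, and the second call leaves the table unchanged.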
Fun fact, this function accounts for ~40% of all time in profiling: