Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test results finisher #234

Merged
merged 8 commits into from
Feb 1, 2024

Conversation

joseph-sentry
Copy link
Contributor

@joseph-sentry joseph-sentry commented Jan 16, 2024

Depends on: #197

Summary

This PR adds functionality for the test results PR comment feature, and contains the necessary improvements to the test result processor task to achieve this.

The first change to the processor is that it's no longer writing to the db anymore. This is because the processor tasks run concurrently, and concurrent writes to the DB were making it difficult to ensure that there multiple Test objects weren't being created at the same time. The writes to the DB to persist Test and TestInstance objects were moved to the finisher.

The second change to the processor is that it now returns a processed list of "Testrun dicts", which contains some metadata (env and run_number) and a list of "testruns" which the finisher will use to decide what Test and TestInstance objects to create.

The TestResultsFinisher task is created in this PR. This task is responsible for gathering the results of the processing step, adding new Test and TestInstance objects, updating existing TestInstance objects that need to be updated, and finally either writing the test results PR comment or triggering the notify task to write the coverage PR comment.

The upload task was updated to use a chord to run the processor tasks concurrently and to pass their results to the finisher task that will run when all the processor tasks are completed.

The notify task now checks if there are any failed tests that exist on a report attached to the commit it is trying to notify for, if there exist failed tests, it will no longer notify.

Edge cases

CI Re-run

This is the edge case where a user might not make any changes to the code, but re-runs the CI. This may happen if there is a flaky test that sometimes fails. We must handle this because when this happens we want to update the outcomes of the tests that ran previously. The run_number metadata field that the processor passes to the finisher is used to handle this edge case. The run_number corresponds to the build_code in the upload. This helps us distinguish between different runs of the same workflow. The run_number tells the finisher that if a TestInstance exists on the current commit and maps to the same Test as the one we are currently examining, we should look at the run_number to determine which one should be used. The notify task being called if there are no failures in the test result finisher, is also part of the solution to this edge case, because of the case where during the re run, the coverage notifier runs before the test result finisher

Same test, different env

This is a case where a user may be running the same test in the same CI run multiple times under a different environment eg. running a tests under multiple python versions. We need to be able to distinguish between these tests. We do this by adding the env field to the Test object which comprises a hash of the flag_names and the job_code of upload where the Test was first created. The uniqueness constraint on Test objects includes the testsuite, name, repoid and env fields which means there can only be one Test with the same name, testsuite and env for each repo.
Subsequent uploads that contain the same test being run with the same env will map to the existing Test object in the DB.

Possible improvements

  • Checkpoint logger integration
  • Addition of metrics on top of the checkpoint logger
  • Streaming parsing improvements mentioned here: feat: add models and task for failed test ingestion #197 (review)
  • Adding a roll up of the test result data for a given CommitReport (similar to ReportLevelTotals)
  • Moving TestInstance data to GCS in sqlite files

@codecov-staging
Copy link

codecov-staging bot commented Jan 16, 2024

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.10%   98.10%   -0.01%     
==========================================
  Files         375      377       +2     
  Lines       31013    31428     +415     
==========================================
+ Hits        30425    30832     +407     
- Misses        588      596       +8     
Flag Coverage Δ
integration 98.10% <97.88%> (-0.01%) ⬇️
latest-uploader-overall 98.10% <97.88%> (-0.01%) ⬇️
unit 98.10% <97.88%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 96.15% <92.56%> (-0.03%) ⬇️
OutsideTasks 97.91% <90.62%> (-0.02%) ⬇️
Files Coverage Δ
services/lock_manager.py 91.48% <100.00%> (+2.35%) ⬆️
tasks/tests/unit/test_notify_task.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_test_results_finisher.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_upload_task.py 99.51% <100.00%> (+<0.01%) ⬆️
tasks/upload.py 97.97% <100.00%> (+<0.01%) ⬆️
tasks/notify.py 97.95% <90.00%> (-0.46%) ⬇️
tasks/test_results_finisher.py 95.45% <95.45%> (ø)
services/test_results.py 86.53% <90.47%> (+5.14%) ⬆️

@codecov-qa
Copy link

codecov-qa bot commented Jan 16, 2024

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (c8afa70) 98.10% compared to head (5d625ed) 98.10%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.10%   98.10%   -0.01%     
==========================================
  Files         375      377       +2     
  Lines       31013    31428     +415     
==========================================
+ Hits        30425    30832     +407     
- Misses        588      596       +8     
Flag Coverage Δ
integration 98.10% <97.88%> (-0.01%) ⬇️
latest-uploader-overall 98.10% <97.88%> (-0.01%) ⬇️
unit 98.10% <97.88%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 96.15% <92.56%> (-0.03%) ⬇️
OutsideTasks 97.91% <90.62%> (-0.02%) ⬇️
Files Coverage Δ
services/lock_manager.py 91.48% <100.00%> (+2.35%) ⬆️
tasks/tests/unit/test_notify_task.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_test_results_finisher.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_upload_task.py 99.51% <100.00%> (+<0.01%) ⬆️
tasks/upload.py 97.97% <100.00%> (+<0.01%) ⬆️
tasks/notify.py 97.95% <90.00%> (-0.46%) ⬇️
tasks/test_results_finisher.py 95.45% <95.45%> (ø)
services/test_results.py 86.53% <90.47%> (+5.14%) ⬆️

Copy link

codecov-public-qa bot commented Jan 16, 2024

Codecov Report

Merging #234 (5d625ed) into main (c8afa70) will decrease coverage by 0.01%.
The diff coverage is 97.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.10%   98.10%   -0.01%     
==========================================
  Files         375      377       +2     
  Lines       31013    31428     +415     
==========================================
+ Hits        30425    30832     +407     
- Misses        588      596       +8     
Flag Coverage Δ
integration 98.10% <97.88%> (-0.01%) ⬇️
latest-uploader-overall 98.10% <97.88%> (-0.01%) ⬇️
unit 98.10% <97.88%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 96.15% <92.56%> (-0.03%) ⬇️
OutsideTasks 97.91% <90.62%> (-0.02%) ⬇️
Files Coverage Δ
services/lock_manager.py 91.48% <100.00%> (+2.35%) ⬆️
tasks/tests/unit/test_notify_task.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_test_results_finisher.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_upload_task.py 99.51% <100.00%> (+<0.01%) ⬆️
tasks/upload.py 97.97% <100.00%> (+<0.01%) ⬆️
tasks/notify.py 97.95% <90.00%> (-0.46%) ⬇️
tasks/test_results_finisher.py 95.45% <95.45%> (ø)
services/test_results.py 86.53% <90.47%> (+5.14%) ⬆️

Copy link

codecov bot commented Jan 16, 2024

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (c8afa70) 98.07% compared to head (5d625ed) 98.07%.

Changes have been made to critical files, which contain lines commonly executed in production. Learn more

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.07%   98.07%   -0.01%     
==========================================
  Files         406      408       +2     
  Lines       31714    32129     +415     
==========================================
+ Hits        31105    31512     +407     
- Misses        609      617       +8     
Flag Coverage Δ
integration 98.10% <97.88%> (-0.01%) ⬇️
latest-uploader-overall 98.10% <97.88%> (-0.01%) ⬇️
onlysomelabels 98.07% <97.88%> (-0.01%) ⬇️
unit 98.10% <97.88%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 96.06% <92.56%> (-0.03%) ⬇️
OutsideTasks 97.91% <90.62%> (-0.02%) ⬇️
Files Coverage Δ
services/lock_manager.py 91.48% <100.00%> (+2.35%) ⬆️
tasks/tests/unit/test_notify_task.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_test_results_finisher.py 100.00% <100.00%> (ø)
tasks/tests/unit/test_upload_task.py 99.51% <100.00%> (+<0.01%) ⬆️
tasks/upload.py Critical 97.98% <100.00%> (+<0.01%) ⬆️
tasks/notify.py Critical 97.95% <90.00%> (-0.46%) ⬇️
tasks/test_results_finisher.py 95.45% <95.45%> (ø)
services/test_results.py 86.53% <90.47%> (+5.14%) ⬆️
Related Entrypoints
run/app.tasks.notify.Notify

@joseph-sentry joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from 6facf1f to 3563af6 Compare January 16, 2024 19:01
@matt-codecov
Copy link
Contributor

joseph/failed-test-ingestion...joseph/add-test-results-finisher-to-upload this PR includes the test result processor PR, here is a URL to compare the two branches separately

Copy link
Contributor

@matt-codecov matt-codecov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

started a group DM about a design tweak for performance reasons. need to sync on that before accept/request changes decision

len_of_table_value = len(table_value)
for i in range((len_of_table_value) // 70):
table_value = (
table_value[: (i + 1) * 70] + "<br>" + table_value[(i + 1) * 70 :]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mentioned that the "<br>" takes up 4 characters which messes with the line lengths. you can avoid that with something like:

line_size = 70
lines = [table_value[i:i+line_size] for i in range(0, len(table_value), line_size)]
return "<br>".join(lines)

return current_report_row

# support flags in test results
def create_report_upload(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we put the flags stuff in the base class and not have to override this here?

Comment on lines +51 to +59
with lock_manager.locked(
LockType.NOTIFICATION,
retry_num=self.request.retries,
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this lock need to surround the whole finisher task or just the part where we work on the PR comment?

)
assert commit, "commit not found"

notify = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing ever sets this to false, it's only read in one place where we make the return dict. you could just inline True there

upload_id = Column(types.Integer, ForeignKey("reports_upload.id"))
upload = relationship("Upload", backref=backref("testinstances"))
failure_message = Column(types.Text)
active = Column(types.Boolean)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we order by created time or join the uploads table and order by upload time to get the newest? that way we just need a single atomic INSERT to insert a new test instance, we don't need to also run an UPDATE to set the previous one to false that we have to make sure runs in the same transaction to avoid races

repository = relationship("Repository", backref=backref("tests"))
name = Column(types.String(256), nullable=False)
testsuite = Column(types.String(256), nullable=False)
env = Column(types.String(256), nullable=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just left a comment on #197 (review) about changing the id field of this table to be a hash instead. in my comment over there i said it should be a hash of the test suite + test name, but i suppose it should also include the inputs to env as well since env is a similar concept

tasks/notify.py Outdated
Comment on lines 140 to 157

# check if there were any test failures

num_of_failed_test_instances = (
db_session.query(TestInstance)
.join(Upload)
.join(CommitReport)
.filter(
CommitReport.commit_id == commit.id_,
TestInstance.outcome == int(Outcome.Failure),
TestInstance.active is True,
)
.count()
)

if num_of_failed_test_instances > 0:
return {
"notify_attempted": False,
"notifications": None,
"reason": "test_failures",
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be in should_send_notifications()?

def should_send_notifications(self, current_yaml, commit, ci_passed, report):

@joseph-sentry joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from b026451 to 31dba51 Compare January 17, 2024 22:11
@joseph-sentry joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from 8742bbe to a390c59 Compare January 24, 2024 14:20
@joseph-sentry joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch 2 times, most recently from 731ea75 to 7d9aa9a Compare January 25, 2024 19:51
Copy link
Contributor

@matt-codecov matt-codecov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for all the iteration on these changes, i think we will long-term be happy we took the time. just nits left, some old comments + a couple new ones

tasks/notify.py Outdated Show resolved Hide resolved
tasks/notify.py Outdated
)

if any(
[test.outcome == int(Outcome.Failure) for test in latest_test_instances]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think test.outcome is a string now, is the int cast correct here?

Comment on lines 102 to 109
.join(Upload)
.join(CommitReport)
.join(Commit)
.filter(Commit.id_ == commit.id_)
.order_by(TestInstance.test_id)
.order_by(desc(Upload.created_at))
.distinct(TestInstance.test_id)
.all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider moving this query to a helper function on TestInstance or something since you use it in more than one place so that the logic definitely stays the same

)
return {"notify_attempted": False, "notify_succeeded": False}

success = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this here if it's overwritten two lines down?


def check_if_no_failures(self, testrun_list):
return all(
[instance.outcome != int(Outcome.Failure) for instance in testrun_list]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same Q re instance.outcome being a string and int() maybe not being correct

This class creates the PR comment message based on the test instances
passed to it and posts/edits the comment on the PR.

Signed-off-by: joseph-sentry <[email protected]>
The notify task will exit early now, if the latest test instances
relevant to the notify's commit contain failures

Signed-off-by: joseph-sentry <[email protected]>
This task will check if failures exist in the latest round
of test instances for this commit. If they do it will run
the test results notifier, else it will trigger the notify task
to run.

This task takes the same lock as the notify task so that they
don't try writing to the PR comment at the same time.

Signed-off-by: joseph-sentry <[email protected]>
The test result finisher runs after the test results processors
are done running. This is achieved using a celery chord.

Signed-off-by: joseph-sentry <[email protected]>
this commit changes the int conversions of Outcome to
str conversions.

moves the db query for latest test instances for a given commit
to a separate function

and cleans up some unused vars

Signed-off-by: joseph-sentry <[email protected]>
@joseph-sentry joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from efa626e to 5d625ed Compare February 1, 2024 20:19
@joseph-sentry joseph-sentry merged commit 0b7ef43 into main Feb 1, 2024
15 of 31 checks passed
@joseph-sentry joseph-sentry deleted the joseph/add-test-results-finisher-to-upload branch February 1, 2024 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants