[Monitoring] Collecting a metric for the age of untriaged testcases #4381

Merged: 9 commits, Nov 13, 2024

Collaborator

@vitorguidi vitorguidi commented Nov 6, 2024

Motivation

Once a testcase is generated (or manually uploaded), follow-up tasks (analyze/progression) are started. Both the manually uploaded path and the fuzzer-generated path start these tasks by publishing to a pubsub queue.

If, for any reason, the messages are not processed, the testcase gets stuck. To give better visibility into these stuck testcases, this PR introduces the UNTRIAGED_TESTCASE_AGE metric, which reports the age of testcases that have not yet been triaged (more precisely, that have not yet gone through the analyze/regression/impact/progression tasks).

Attention points

Testcase.timestamp mutates in the analyze task:

This makes it unreliable as a source of truth for testcase creation time. To work around this, a new created field is added to the Testcase entity, from which the correct creation time can be derived.

Because the new field is only populated for testcases created after this PR is merged, Testcase.timestamp is used as a fallback to calculate the testcase age whenever the new field is missing.
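
As a rough illustration of the fallback described above, here is a minimal sketch. It is not the code in this PR: the `Testcase` dataclass is a hypothetical stand-in for the datastore entity, the helper names are made up for this example, and reporting the age in hours is an assumption.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class Testcase:
    """Hypothetical stand-in for the datastore entity (not the real model)."""
    timestamp: Optional[datetime.datetime] = None  # may be rewritten by analyze
    created: Optional[datetime.datetime] = None    # new field, absent on old entities


def creation_time(testcase):
    """Prefer `created`; fall back to `timestamp` for older testcases."""
    if testcase.created is not None:
        return testcase.created
    return testcase.timestamp


def untriaged_age_hours(testcase, now):
    """Return the testcase age in hours (unit is an assumption), or None."""
    start = creation_time(testcase)
    if start is None:
        return None
    return (now - start).total_seconds() / 3600


now = datetime.datetime(2024, 11, 13, 12, 0, 0)
legacy = Testcase(timestamp=datetime.datetime(2024, 11, 13, 6, 0, 0))
print(untriaged_age_hours(legacy, now))  # falls back to timestamp
```

With this shape, a testcase created before the change still yields an age, although that age may be understated if analyze has already rewritten its timestamp.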

Testing strategy

Ran the triage cron locally and verified that the code path for the metric is hit and produces sane output (reference testcase: 4505741036158976).

Part of #4271

Collaborator

@oliverchang oliverchang left a comment

nice!

src/clusterfuzz/_internal/cron/triage.py (resolved)
src/clusterfuzz/_internal/datastore/data_types.py (outdated, resolved)
src/clusterfuzz/_internal/datastore/data_types.py (outdated, resolved)
src/clusterfuzz/_internal/cron/triage.py (outdated, resolved)
@vitorguidi vitorguidi changed the title Collecting a metric for the age of untriaged testcases [Monitoring] Collecting a metric for the age of untriaged testcases Nov 8, 2024
Collaborator

@jonathanmetzman jonathanmetzman left a comment

lgtm

@vitorguidi vitorguidi force-pushed the feature/untriaged-testcase-age branch from 5376c83 to 0ed0dee Compare November 8, 2024 17:30
@vitorguidi vitorguidi merged commit ba9009a into master Nov 13, 2024
7 checks passed
@vitorguidi vitorguidi deleted the feature/untriaged-testcase-age branch November 13, 2024 15:09
4 participants