[434] Allow to filter jobs in ZyteJobsComparisonMonitor by close_reason #440

Merged: 13 commits, Jul 15, 2024
16 changes: 15 additions & 1 deletion spidermon/contrib/scrapy/monitors/monitors.py
@@ -21,6 +21,7 @@
SPIDERMON_JOBS_COMPARISON = "SPIDERMON_JOBS_COMPARISON"
SPIDERMON_JOBS_COMPARISON_STATES = "SPIDERMON_JOBS_COMPARISON_STATES"
SPIDERMON_JOBS_COMPARISON_TAGS = "SPIDERMON_JOBS_COMPARISON_TAGS"
SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS = "SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS"
SPIDERMON_JOBS_COMPARISON_THRESHOLD = "SPIDERMON_JOBS_COMPARISON_THRESHOLD"
SPIDERMON_ITEM_COUNT_INCREASE = "SPIDERMON_ITEM_COUNT_INCREASE"

@@ -523,6 +524,10 @@
You can also filter which jobs to compare based on their tags using the
``SPIDERMON_JOBS_COMPARISON_TAGS`` setting. Among the defined tags we consider only those
that are also present in the current job.

You can also filter which jobs to compare based on their close reason using the
``SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS`` setting. The default value is ``()``,
which doesn't filter any job based on close_reason.
"""

stat_name = "item_scraped_count"
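
For illustration, a project might enable the new filter with a settings fragment like the one below. This is a minimal sketch, not taken from the PR: the file name and all values are assumptions, and only the setting names come from the diff. Because the monitor reads the value with Scrapy's `getlist`, a comma-separated string should also be accepted in place of a tuple.

```python
# settings.py (hypothetical project settings; every value here is illustrative)

# Number of previous jobs to compare against (assumed value).
SPIDERMON_JOBS_COMPARISON = 5

# Minimum fraction of the previous jobs' item count that is acceptable (assumed value).
SPIDERMON_JOBS_COMPARISON_THRESHOLD = 0.8

# Only consider previous jobs in these states (assumed value).
SPIDERMON_JOBS_COMPARISON_STATES = ("finished",)

# New in this PR: only consider previous jobs whose close_reason is listed here.
# Leaving the setting unset (default ``()``) disables close_reason filtering.
SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS = ("finished",)
```

With these values, jobs that ended with any other close reason (for example, a cancelled job) would be left out of the comparison baseline.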
@@ -551,6 +556,9 @@

    def _get_jobs(self, states, number_of_jobs):
        tags = self._get_tags_to_filter()
        close_reason = self.crawler.settings.getlist(
            SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS, ()
        )

        jobs = []
        start = 0
@@ -563,7 +571,13 @@
            filters=dict(has_tag=tags) if tags else None,
        )
        while _jobs:
            if close_reason:
                for job in _jobs:
                    if job.get("close_reason") in close_reason:
                        jobs.append(job)
            else:
                jobs.extend(_jobs)

            start += 1000
            _jobs = client.spider.jobs.list(
                start=start,

Codecov / codecov/patch check warning on spidermon/contrib/scrapy/monitors/monitors.py#L577: added line #L577 (`jobs.append(job)`) was not covered by tests.
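
The per-page filtering added in the last hunk can be sketched in isolation, which also shows one way the branch containing `jobs.append(job)` could be exercised without calling the Zyte API. The helper name `filter_jobs_by_close_reason` and the sample job dictionaries are hypothetical; the PR itself keeps this logic inline in `_get_jobs`.

```python
def filter_jobs_by_close_reason(jobs, close_reasons):
    """Return the jobs whose close_reason is in close_reasons.

    An empty close_reasons keeps every job, mirroring the default ``()``
    of SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS described in the docstring.
    """
    if not close_reasons:
        return list(jobs)
    # Jobs without a close_reason key yield None from .get() and are dropped
    # whenever a filter is active.
    return [job for job in jobs if job.get("close_reason") in close_reasons]


# Hypothetical page of job summaries, shaped like plain dictionaries.
page = [
    {"key": "1/2/3", "close_reason": "finished"},
    {"key": "1/2/4", "close_reason": "cancelled"},
    {"key": "1/2/5"},  # no close_reason recorded
]

finished_only = filter_jobs_by_close_reason(page, ["finished"])
unfiltered = filter_jobs_by_close_reason(page, [])
```

Note that a job with no `close_reason` is excluded as soon as any filter is configured, which is the same behaviour as the `job.get("close_reason") in close_reason` check in the diff.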