-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TP2000 360 - Refactor automated business rule checks. #633
base: master
Are you sure you want to change the base?
Conversation
04f5a75
to
c2c9021
Compare
2a77ba1
to
004d26b
Compare
checks/checks.py
Outdated
"""Runs Checker-dependent logic and returns an indication of success.""" | ||
raise NotImplementedError() | ||
@classmethod | ||
def apply_rule( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On first glanceapply_rule
doesn't seem so different to run_rule()
. A bit more docstring to distinguish between them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the distinction confusing myself - this may need more thought, or at least more docs.
def apply_rule( | ||
cls, | ||
rule: BusinessRule, | ||
transaction: Transaction, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't clear why transaction is required and how it's used - docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have atomisity, but we can try and get closer to it: one way is by passing in the current transaction when you check.
This may also allow us to do things like check the system as it was in the past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add that explanation into the docstring?
from typing import Collection | ||
from typing import Dict | ||
from typing import Iterator | ||
import logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be right to think of this module as a main entry point for checks (besides the scheduling)?
A good, high-level description of how checks work in a module-level docstring would be good.
checks/checks.py
Outdated
cls._checker_cache[checker_name] = BusinessRuleCheckerOf | ||
return BusinessRuleCheckerOf | ||
class LinkedModelsBusinessRuleChecker(Checker): | ||
"""Apply BusinessRules specified in a TrackedModels indirect_business_rules |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a good description of what indirect business rules are in our source code (how they differ to regular business rules, for instance). If you have a good one in mind from working in this area, then it would be useful to add it against common.models.trackedmodel.TrackedModel.indirect_business_rules
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make this closer to the original text in that case -
Lines 167 to 170 in 25debc2
""" | |
A ``Checker`` that runs a ``BusinessRule`` against a model that is linked to | |
the model being checked, and for which a change in the checked model could | |
result in a business rule failure against the linked model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And see if I can find examples :)
CELERY_RESULT_EXTENDED = True # Adds Task name, args, kwargs to results. | ||
|
||
# The following settings are usually useful for development, but not for production. | ||
CELERY_TASK_ALWAYS_EAGER = is_truthy(os.environ.get("CELERY_TASK_ALWAYS_EAGER", "N")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to use actual bools as the default param to is_truthy(os.environ.get()) combos. Worth sticking with that convention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - this was actually a source of bugs, and I previously commited a change to master so is_truthy
to returns False
if the value passed to it is not truthy.
This is to solve a bug where I had a line like:
CELERY_TASK_ALWAYS_EAGER = is_truthy(os.environ.get("CELERY_TASK_ALWAYS_EAGER"))
This defaulted to True... because:
is_truthy(None)
was called:
Which then has the line:
return str(value).lower() not in ("n", "no", "off", "f", "false", "0")
Since str(value), was "None", it returned True.
392667e
to
fca35a1
Compare
…o celery along with info learned deploying this. Avoid filling the task queue with orchestration tasks and starving the workers. =============================================================================== In the previous system there were about 3 layers of tasks, that orchestrated other tasks, by using the .replace() API in each task. Unfortunately it was possible for celery workers to become full of orchestration tasks leaving no room for the business rule tasks at the bottom of the to actually run. This PR attempts two mitigations: 1. Use celery workflows instead of .replace() This PR builds a celery workflow in the check_workbasket using celery constructs such as chain and group. In theory, since most of the work is done ahead of time the system should have more awareness of the task structure avoiding the issue of starvation. 2. Cancel existing workbasket checks when a new check is requested. When check_workbasket is started, it will attempt to revoke existing check_workbasket tasks for the same workbasket. Treat intermediate data structures as ephemeral =============================================== A celery task may execute at any time, right now - or when a system comes up tomorrow, based on this assumption models such as TrackedModelCheck (which stores the result of a business rule check on a TrackedModel) are no longer passed to celery tasks by ID, instead all the information needed to receate the data is passed to the celery task, this means the system will still work even if developers delete these while it is running. Reduce layers in business rule checking ======================================= BusinessRuleChecker and LinkedModelsBusinessRuleChecker are now the only checkers, these now take BusinessRule instances, instead of being subclassed for each business rule. While more parameters are passed when rules are checked a conceptual layer has been removed and the simplification is reflected with around 20 lines of code being removed from checks.py Celery flower is now very easier to read ======================================== Due to the changes above, the output in celery flower should correspond more closely to a users intentions - ids of models. Content Checksums ================= Result caching now validates using checksums of the content, which should reduce the amount of checking the system needs to do. When a workbasket has been published, it's content could invalidate some content in other unpublished workbaskets, by associating business rule checks with checksums of a models content, any models that do not clash can be skipped. Model checksums (generated by `.content_hash()`) are not currently stored in the database (though it may be desirable to store them on TrackedModels, as it would provide an mechanism to address any content in the system). The checksuming scheme is a combination of the type and a sha256 of the fields in `.copyable_fields` (which should represent the fields a user can edit, but not fields such as pk). Blake3 was tested, as it provides a fast hashing algorithm, in practice it didn't provide much of a speedup over sha256. PK ranges ========= Occasionally workbaskets with many items may need to be checker (the initial workbasket has 9 million items). Based on the observations that the ID column of the contained TrackedModels is mostly continguous, the system allows passing sequences of contiguous TrackedModels specified by tuples of (first_pk, last_pk). This is relatively compact, suitable for passing over the network with celery and readable in Celery flower. This also enables chunking of tasks - further enabled by specifying a maximum amount of items in each tuple. On TrackedModelQueryset `.as_pk_intervals` and `.from_pk_intervals` are provided to go to and from this format.
c0f5168
to
3106ac4
Compare
Migrate TrackedModelChecks to new structure. remove TransactionCheck. Start moving business rules into the database, and provide sync_business_rules to do that, along with a mechanism to do this in tests.
This PR is an(other) attempt at bringing moving the business rules into celery.
This PR attempts to:
Build celery workflow in terms of celery workflow constructs: so celery celery can correctly schedule tasks.
Attempts to reduce the amount of work the system has to:
Much of the learning in this PR is based on the previous PR to bring use celery to run business rules, and input from the most of the team.