-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fetch-job-records
: add integrity check for records
#475
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just one suggestion inline.
"jobspec", | ||
] | ||
if not all( | ||
key in single_record and single_record[key] is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a suggestion, but I think you check if key in single_record and single_record[key] is not None
in one call with single_record.get(key)
(returns falsy if key doesn't exist or is None
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I'm not misunderstanding you but I think the check would have to be single_record.get(key) is not None
: https://docs.python.org/3/library/stdtypes.html#dict.get
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, maybe you're right, assuming that all the values are truthy if they aren't None
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I'm not misunderstanding you but I think the check would have to be single_record.get(key) is not None: https://docs.python.org/3/library/stdtypes.html#dict.get
Very good point @jameshcorbett - I should keep my mouth shut about Python stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
booo everyone is wrong sometimes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
"jobspec", | ||
] | ||
if not all( | ||
key in single_record and single_record[key] is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I'm not misunderstanding you but I think the check would have to be single_record.get(key) is not None
: https://docs.python.org/3/library/stdtypes.html#dict.get
Problem: There is a case where a job can have an R but not have entered RUN state, and thus, does not receive a "t_run" time. The "flux account-fetch-job-records" script does not verify that the timestamp key-value pairs exist before attempting to insert them into the "jobs" table in the flux-accounting DB, which can result in a KeyError. Add two integrity checks for job records fetched by this script. - When initially fetching jobs that checks that each key-value pair exists before adding it to a list of fetched job records, if a key-value pair is not valid, skip adding the job. - When attempting to insert a job record into the "jobs" table, wrap the INSERT in a try-except block in case a KeyError is raised. If so, just skip adding the job.
ecf252a
to
862b03c
Compare
Thanks @jameshcorbett and @grondo for the catch and for approving! I'll set MWP here |
Problem
There is a case where a job can have an
R
but not have enteredRUN
state, and thus, does not receive at_run
time. Theflux account-fetch-job-records
script does not verify that the timestamp key-value pairs exist before attempting to insert them into thejobs
table in the flux-accounting DB, which can result in aKeyError
.This PR adds two integrity checks for job records fetched by this script:
When initially fetching jobs that checks that each key-value pair exists before adding it to a list of fetched job records, if a key-value pair is not valid, skip adding the job.
When attempting to insert a job record into the
jobs
table, wrap theINSERT
in a try-except block in case aKeyError
is raised. If so, just skip adding the job.