python: support convenience API for job-info.lookup RPC / "flux job info" #5265
Hm, my initial thought is that "metadata" is questionably more clear than just "info" or "data". For a job it seems like there is a fine line between "data" and "metadata" (for instance I would consider the job name, start time, working directory, etc. all metadata, which means another user could be just as confused about Since the job-info module is just an arbiter of KVS lookups, maybe it would be less confusing to call this interface
I like the idea of calling it "job-kvs-lookup", as it would definitely give the impression this is not quite the same as "job list". @cmoussa1 like the term?
Yeah, it was mostly my analness on wanting to be "consistent" on it and try to indicate "info" vs "list" is different. I'll think about it.
If you do rename
Rebased and re-pushed per comments above
(some others, those are the big ones) I know that
Extra note: After thinking about it a bit more I decided the default of "decode" should be True. I figure it's probably the more common case. Also, in the rare case someone needs both decoded and unencoded stuff.
is probably easier than
oops, I forgot to add a "Fixes .." into the commit message. re-pushed
FWIW, this LGTM. This looks like it will clean up the Python script I have proposed in flux-framework/flux-accounting#357. Thanks @chu11!
LGTM, too!
Problem: If a user wishes to send multiple job info lookups in parallel, they have to track which lookups belong to which job ids. This can be a tad inconvenient. Solution: Return the job id in the lookup response along with the information that was looked up.
Problem: There is no test to ensure job-info.lookup returns a jobid now. Add coverage in t2230-job-info-lookup.t.
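To illustrate why returning the job id helps, here is a minimal sketch that matches out-of-order lookup responses back to their jobs. The response payloads are simulated and `index_by_jobid()` is a hypothetical helper, not part of flux-core; the only thing assumed from the commit message is that each response carries an "id" key alongside the requested data.

```python
# Simulated job-info.lookup responses, arriving in arbitrary order.
# The "id" key is what the commit above adds; all values are made up.
responses = [
    {"id": 1002, "jobspec": '{"tasks": 4}'},
    {"id": 1001, "jobspec": '{"tasks": 1}'},
    {"id": 1003, "jobspec": '{"tasks": 2}'},
]


def index_by_jobid(responses):
    """Map each response to the job id it carries (hypothetical helper)."""
    return {resp["id"]: resp for resp in responses}


by_id = index_by_jobid(responses)

# The caller no longer has to remember which request produced which reply.
print(by_id[1001]["jobspec"])
```

Without the "id" key, the caller would need to keep its own mapping from each outstanding request to the job it was sent for.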
Problem: A test in python/t0010-job.py uses the variable "meta" to describe the data returned from a job-list call. However, we use the term "info" more often due to the JobInfo class. Rename the "meta" variable to "info".
Problem: There is not a convenient API to get job information via the `job-info.lookup` RPC. Solution: Add a new "kvslookup.py" module to the Python bindings. It includes the "job_kvs_lookup()" function and the "JobKVSLookup" class for users to retrieve data via the `job-info.lookup` RPC. The module and interface are similar to the "list.py" module ("get_job()" function and "JobList" class). Fixes flux-framework#5176
Problem: There is no coverage for the new flux.job.kvslookup Python module. Add coverage via new tests in t/python/t0014-job-kvslookup.py.
Codecov Report
@@ Coverage Diff @@
## master #5265 +/- ##
==========================================
+ Coverage 83.33% 83.83% +0.49%
==========================================
Files 461 450 -11
Lines 78473 76011 -2462
==========================================
- Hits 65397 63724 -1673
+ Misses 13076 12287 -789
right before this was about to merge I realized I did tests for "bad jobid", but I didn't do tests for "bad key". Whoops. Added that extra variant. @cmoussa1 could you do a quick skim on the tests to make sure I didn't make any typos or dumb mistakes? this is the diff
Looks good!! thanks @chu11
@cmoussa1 cool thanks, setting MWP
Oops, I belatedly realized that the
@grondo nope I did not think to do that! I'll write up an issue.
Per #5176, it'd be nice to have a convenience API for `job-info.lookup`, as doing the RPCs and parsing out the results can be inconvenient, especially when you want to do multiple lookups.

On the command line this is `flux job info` b/c of the `job-info` module, but we created the "JobInfo" class for the data coming back from the `job-list` module in the Python bindings. In hindsight perhaps an unfortunate naming.

So ... I eventually settled on calling this getting "job metadata". The naming of this can be debated of course. I think it's a good name for things like "jobspec", "eventlog", "R". It's sort of a meh name for "guest.input/output". But presumably we'll someday have the solution for #4854, so maybe it doesn't matter as much on that front?

It has an API similar to the `list.py` module and the `JobList` class, where you can get metadata for one jobid (`get_metadata()`) or a list of metadata for multiple jobids (`JobMetaDataLookup`). Some differences:

- we want to return the "raw" metadata back to the user, not a "friendly" representation. So there is no new equivalent to the "JobInfo" class, I just return the RPC 'dict' from the call. Function names and call style adjusted as a result.
- I special case process "jobspec" and "R" w/ `json.loads()`, so that users get a `dict` instead of a string representing a JSON object. That way it removes that extra step that many people will do which is to `json.loads()` the `metadata["jobspec"]` in the response. I elected to make this the default and an optional `decode` parameter will turn it off. It could be argued the default should be the other way around since some (such as `flux-accounting` and other projects) will likely want to store the jobspec as well. Any strong opinions on this? I think it's a bit of a tossup on which direction the default should be.
- `JobList` returns "all" joblist data by default, but we don't want to do that for `JobMetaData` b/c of the "infinite" maximum amount of data that can exist in several fields (i.e. "guest.input/output"). I default to just "jobspec" since I think that's the 90% use case. The 96% case is probably "jobspec" and "R" and the 99% case would probably be "jobspec" and "R" and "eventlog"/"guest.exec.eventlog". I debated which to do but settled on just "jobspec" b/c I think a default of "several things" is weird.

Other notes

- As we can return multiple metadata results in one list, it would be hard to figure out which data belongs to which jobs. So I added an "id" key into each response, so that the data can be associated with the appropriate job.
- Because I'm anal ... I renamed "flux job info" to "flux job metadata" in the last commit ... maybe that was dumb. It does serve a small purpose. IIRC, some folks saw "flux job info" and thought it served a function similar to "flux job list" or "flux jobs", which it doesn't. I think "flux job metadata" does imply a bit more that this is "extra stuffs". But no issues if folks disagree, we can drop the commit.
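
For readers who want to see the `decode` behaviour concretely, here is a rough sketch of what the special casing of "jobspec" and "R" amounts to. It is not the code in this PR: `decode_lookup()` and `JSON_KEYS` are hypothetical names, and the response dict is made-up sample data. It only assumes, as described above, that those two keys hold JSON strings and that decoding is on by default.

```python
import json

# Keys described above as holding JSON objects encoded as strings.
JSON_KEYS = ("jobspec", "R")


def decode_lookup(response, decode=True):
    """Return a copy of a lookup response, optionally decoding JSON keys.

    Hypothetical helper that illustrates the described behaviour only.
    """
    result = dict(response)
    if decode:
        for key in JSON_KEYS:
            # Only decode values that are still raw JSON strings.
            if isinstance(result.get(key), str):
                result[key] = json.loads(result[key])
    return result


# Made-up sample response; real jobspecs and eventlogs are much larger.
raw = {
    "id": 1001,
    "jobspec": '{"version": 1, "tasks": []}',
    "eventlog": '{"name": "submit"}\n',
}

decoded = decode_lookup(raw)                   # "jobspec" becomes a dict
undecoded = decode_lookup(raw, decode=False)   # "jobspec" stays a string
```

A caller that wants to store the jobspec verbatim would pass `decode=False`, while most callers get a `dict` without the extra `json.loads()` step. Keys outside `JSON_KEYS`, such as "eventlog", are left untouched either way.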