Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python: support convenience API for job-info.lookup RPC / "flux job info" #5265

Merged
merged 5 commits into from
Jun 22, 2023

Conversation

chu11
Copy link
Member

@chu11 chu11 commented Jun 14, 2023

Per #5176, it'd be nice to have a convenience API for job-info.lookup as doing the rpcs and parsing out the results can be inconvenient, especially when you want to do multiple lookups.

On the command line this is flux job info b/c of the job-info module, but we created the "JobInfo" class for the data coming back from job-list module in the Python bindings. In hindsight perhaps an unfortunate naming.

So ... I eventually settled on calling this getting "job metadata". The naming of this can be debated of course. I think it's a good name for things like "jobspec", "eventlog", "R". It's sort of a meh name for "guest.input/output". But presumably we'll someday have the solution for #4854, so maybe it doesn't matter as much on that front?

It has an API similar to list.py module and the JobList class, where you can get metadata for one jobid (get_metadata()) or a list of metadata for multiple jobids (JobMetaDataLookup). Some differences:

  • we want to return the "raw" metadata back to the user, not a "friendly" representation. So there is no new equivalent to the "JobInfo" class, I just return the RPC 'dict' from the call. Function names and call style adjusted as a result.

  • I special case process "jobspec" and "R" w/ json.loads(), so that users get a dict instead of a string representing a JSON object. That way it removes that extra step that many people will do which is to json.loads() the metadata["jobspec"] in the response. I elected to make this the default and an optional decode parameter will turn it off. It could be argued the default should be the other way around since some (such as flux-accounting and other projects) will likely want to store the jobspec as well. Any strong opinions on this? I think it's a bit of a tossup on which direction the default should be.

  • JobList returns "all" joblist data by default, but we don't want to do that for JobMetaData b/c of the "infinite" maximum amount of data that can exist in several fields (i.e. "guest.input/output"). I default to just "jobspec" since I think that's the 90% use case. The 96% case is probably "jobspec" and "R" and the 99% case would probably be "jobspec" and "R" and "eventlog"/"guest.exec.eventlog". I debated which to do but settled on just "jobspec" b/c I think a default of "several things" is weird.

Other notes

  • As we can return multiple metadata results in one list, it would be hard to figure out which data belongs to which jobs. So I added a "id" key into each response, so that the data can be associated with the appropriate job.

  • Because I'm anal ... I renamed "flux job info" to "flux job metadata" in the last commit ... maybe that was dumb. It does serve a small purpose. IIRC, some folks saw "flux job info" and thought it served a function similar to "flux job list" or "flux jobs", which it doesn't. I think "flux job metadata" does imply a bit more that this is "extra stuffs". But no issues if folks disagree, we can drop the commit.

@chu11 chu11 force-pushed the issue5176_python_job_info_lookup branch from 67da331 to 197f6eb Compare June 15, 2023 04:43
@grondo
Copy link
Contributor

grondo commented Jun 15, 2023

Hm, my initial thought is that "metadata" is questionably more clear than just "info" or "data". For a job it seems like there is a fine line between "data" and "metadata" (for instance I would consider the job name, start time, working directory, etc. all metadata, which means another user could just just as confused about flux job metadata vs flux job info.)

Since the job-info module is just a arbiter of KVS lookups, maybe it would be less confusing to call this interface job_kvs_lookup and have a similarly named synonym for flux job info, e.g. flux job kvs-lookup if you really want to (I wouldn't consider it necessary at this point, and renaming all the internals is unnecessary churn IMO), thus sidestepping the info vs data vs metadata arguments?

@chu11
Copy link
Member Author

chu11 commented Jun 15, 2023

Since the job-info module is just a arbiter of KVS lookups, maybe it would be less confusing to call this interface job_kvs_lookup and have a similarly named synonym for flux job info, e.g. flux job kvs-lookup if you really want to

I like the idea of calling it "job-kvs-lookup", as it would definitely give the impression this not quite the same as "job list". @cmoussa1 like the term?

(I wouldn't consider it necessary at this point, and renaming all the internals is unnecessary churn IMO), thus sidestepping the info vs data vs metadata arguments?

Yeah, it was mostly my analness on wanting to be "consistent" on it and try to indicate "info" vs "list" is different. I'll think about it.

@grondo
Copy link
Contributor

grondo commented Jun 15, 2023

If you do rename flux job info and internal related functions (which I suggest against at this point 😜 ) I think that should go in a separate PR because o/w a big change would be hidden in the release notes, and it doesn't really fall under the title "python: support ..." :-)

@cmoussa1
Copy link
Member

job-kvs-lookup sounds good to me too 👍

@chu11 chu11 force-pushed the issue5176_python_job_info_lookup branch from 197f6eb to b42be47 Compare June 15, 2023 22:13
@chu11 chu11 changed the title python: support "metadata" convenience functions for job-info.lookup python: support convenience functions for job-info.lookup Jun 15, 2023
@chu11 chu11 changed the title python: support convenience functions for job-info.lookup python: support convenience API for job-info.lookup / "flux job info" Jun 15, 2023
@chu11 chu11 changed the title python: support convenience API for job-info.lookup / "flux job info" python: support convenience API for job-info.lookup RPC / "flux job info" Jun 15, 2023
@chu11
Copy link
Member Author

chu11 commented Jun 15, 2023

Rebased and re-pushed per comments above

  1. There have been some renamings, notably:
`metadata.py` -> `kvslookup.py`
`JobMetaDataLookup` -> `JobKVSLookup`
`JobMetaDataLookup.fetch_metadata()` -> `JobKVSLookup.fetch_data()`
`JobMetaDataLookup.metadata()` -> `JobKVSLookup.data()`
`get_metadata()` -> `job_kvs_lookup()`

(some others, those are the big ones)

I know that fetch_data() and data() functions sound super generic. I didn't want to do fetch_jobs() and jobs() like in JobList, because "jobs" didn't quite sound right for this class. And "info" wasn't right either. Can tweak if someone has a much better name they can think of.

  1. I removed the commit renaming flux job info -> flux job metadata. While working on this update, I am leaning towards "not necessary" now.

  2. Updated tests accordingly. Also tweaked tests, I realized that the order of the returned job data should be in the order it is passed in.

  3. Updated PR title

Extra note:

After thinking about it a bit more I decided the default of "decode" should be True. I figure it's probably the more common case. Also, in the rare case someone needs both decoded and unencoded stuff.

specific_thing_I_want = data["jobspec"]["foo"]["bar"]["baz"]
jobspec_str = str(data["jobspec"]
store_jobspec_somewhere (jobspec_str)

is probably easier than

jobspec_dict = json.loads(data["jobspec"])
specific_thing_I_want = jobspec_dict["foo"]["bar"]["baz"]
store_jobspec_somewhere (data["jobspec"])

@chu11 chu11 force-pushed the issue5176_python_job_info_lookup branch from b42be47 to b857c57 Compare June 15, 2023 23:42
@chu11
Copy link
Member Author

chu11 commented Jun 15, 2023

oops, i forgot add a "Fixes .." into the commit message. re-pushed

Copy link
Member

@cmoussa1 cmoussa1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this LGTM. This looks like it will clean up the Python script I have proposed in flux-framework/flux-accounting#357. Thanks @chu11!

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, too!

@chu11
Copy link
Member Author

chu11 commented Jun 22, 2023

@grondo @cmoussa1 thanks, will set MWP

chu11 added 4 commits June 22, 2023 13:37
Problem: If a user wishes to send multiple job info lookups in parallel,
they have to track which lookups belong to which job ids.  This can be
a tad inconvenient.

Solution: Return the job id in the lookup response along with the
information that was looked up.
Problem: There is no test to ensure job-info.lookup returns a
jobid now.

Add coverage in t2230-job-info-lookup.t.
Problem: A test in python/t0010-job.py uses the variable "meta"
to describe the data returned from a job-list call.  However, we use
the term "info" more often due to the JobInfo class.

Rename the "meta" variable to "info".
Problem: There is not a convenient API to get job information
via the `job-info.lookup` RPC.

Solution: Add a new "kvslookup.py" module to the Python bindings.
It includes the "job_kvs_lookup()" function and the "JobKVSLookup"
class for users to retrieve data via the `job-info.lookup` RPC.
The module and interface is similar to the "list.py" module (
"get_job()" function and "JobList" class).

Fixes flux-framework#5176
@chu11 chu11 force-pushed the issue5176_python_job_info_lookup branch from b857c57 to 7c387d1 Compare June 22, 2023 20:37
@chu11 chu11 force-pushed the issue5176_python_job_info_lookup branch from 7c387d1 to 26807fb Compare June 22, 2023 21:50
Problem: There is no coverage for the new flux.job.kvslookup
Python module.

Add coverage via new tests in t/python/t0014-job-kvslookup.py.
@chu11 chu11 force-pushed the issue5176_python_job_info_lookup branch from 26807fb to 6ce4b8e Compare June 22, 2023 21:51
@codecov
Copy link

codecov bot commented Jun 22, 2023

Codecov Report

Merging #5265 (67da331) into master (1008094) will increase coverage by 0.49%.
The diff coverage is 94.79%.

❗ Current head 67da331 differs from pull request most recent head 6ce4b8e. Consider uploading reports for the commit 6ce4b8e to get more accurate results

@@            Coverage Diff             @@
##           master    #5265      +/-   ##
==========================================
+ Coverage   83.33%   83.83%   +0.49%     
==========================================
  Files         461      450      -11     
  Lines       78473    76011    -2462     
==========================================
- Hits        65397    63724    -1673     
+ Misses      13076    12287     -789     
Impacted Files Coverage Δ
src/modules/job-info/lookup.c 67.79% <60.00%> (-0.23%) ⬇️
src/bindings/python/flux/job/metadata.py 96.29% <96.29%> (ø)
src/bindings/python/flux/job/__init__.py 100.00% <100.00%> (ø)
src/cmd/flux-job.c 87.92% <100.00%> (+0.12%) ⬆️

... and 124 files with indirect coverage changes

@chu11
Copy link
Member Author

chu11 commented Jun 22, 2023

right before this was about to merge I realized I did tests for "bad jobid", but I didn't do tests for "bad key". Whoops. Added that extra variant.

@cmoussa1 could you do a quick skim on the tests to make sure I didn't make any typos or dumb mistakes? this is the diff

index 30fbdc4..8bb5f11 100755
--- a/t/python/t0014-job-kvslookup.py
+++ b/t/python/t0014-job-kvslookup.py
@@ -93,73 +93,92 @@ class TestJob(unittest.TestCase):
             notfound = True
         self.assertEqual(notfound, True)
 
-    def test_03_job_kvs_lookup(self):
+    def test_03_job_info_lookup_badkey(self):
+        rpc = flux.job.job_info_lookup(self.fh, self.jobid1, keys=["foo"])
+        notfound = False
+        try:
+            rpc.get()
+        except FileNotFoundError:
+            notfound = True
+        self.assertEqual(notfound, True)
+
+    def test_04_job_kvs_lookup(self):
         data = flux.job.job_kvs_lookup(self.fh, self.jobid1)
         self.check_jobspec_decoded(data, self.jobid1)
 
-    def test_04_job_kvs_lookup_nodecode(self):
+    def test_05_job_kvs_lookup_nodecode(self):
         data = flux.job.job_kvs_lookup(self.fh, self.jobid1, decode=False)
         self.check_jobspec_str(data, self.jobid1)
 
-    def test_05_job_kvs_lookup_keys(self):
+    def test_06_job_kvs_lookup_keys(self):
         data = flux.job.job_kvs_lookup(self.fh, self.jobid1, keys=["R", "J"])
         self.check_R_J_decoded(data, self.jobid1)
 
-    def test_06_job_kvs_lookup_keys_nodecode(self):
+    def test_07_job_kvs_lookup_keys_nodecode(self):
         data = flux.job.job_kvs_lookup(
             self.fh, self.jobid1, keys=["R", "J"], decode=False
         )
         self.check_R_J_str(data, self.jobid1)
 
-    def test_07_job_kvs_lookup_badid(self):
+    def test_08_job_kvs_lookup_badid(self):
         data = flux.job.job_kvs_lookup(self.fh, 123456789)
         self.assertEqual(data, None)
 
-    def test_08_job_kvs_lookup_list(self):
+    def test_09_job_kvs_lookup_badkey(self):
+        data = flux.job.job_kvs_lookup(self.fh, self.jobid1, keys=["foo"])
+        self.assertEqual(data, None)
+
+    def test_10_job_kvs_lookup_list(self):
         ids = [self.jobid1]
         data = flux.job.JobKVSLookup(self.fh, ids).data()
         self.assertEqual(len(data), 1)
         self.check_jobspec_decoded(data[0], self.jobid1)
 
-    def test_09_job_kvs_lookup_list_multiple(self):
+    def test_11_job_kvs_lookup_list_multiple(self):
         ids = [self.jobid1, self.jobid2]
         data = flux.job.JobKVSLookup(self.fh, ids).data()
         self.assertEqual(len(data), 2)
         self.check_jobspec_decoded(data[0], self.jobid1)
         self.check_jobspec_decoded(data[1], self.jobid2)
 
-    def test_10_job_kvs_lookup_list_multiple_nodecode(self):
+    def test_12_job_kvs_lookup_list_multiple_nodecode(self):
         ids = [self.jobid1, self.jobid2]
         data = flux.job.JobKVSLookup(self.fh, ids, decode=False).data()
         self.assertEqual(len(data), 2)
         self.check_jobspec_str(data[0], self.jobid1)
         self.check_jobspec_str(data[1], self.jobid2)
 
-    def test_11_job_kvs_lookup_list_multiple_keys(self):
+    def test_13_job_kvs_lookup_list_multiple_keys(self):
         ids = [self.jobid1, self.jobid2]
         data = flux.job.JobKVSLookup(self.fh, ids, keys=["R", "J"]).data()
         self.assertEqual(len(data), 2)
         self.check_R_J_decoded(data[0], self.jobid1)
         self.check_R_J_decoded(data[1], self.jobid2)
 
-    def test_12_job_kvs_lookup_list_multiple_keys_nodecode(self):
+    def test_14_job_kvs_lookup_list_multiple_keys_nodecode(self):
         ids = [self.jobid1, self.jobid2]
         data = flux.job.JobKVSLookup(self.fh, ids, keys=["R", "J"], decode=False).data()
         self.assertEqual(len(data), 2)
         self.check_R_J_str(data[0], self.jobid1)
         self.check_R_J_str(data[1], self.jobid2)
 
-    def test_13_job_kvs_lookup_list_none(self):
+    def test_15_job_kvs_lookup_list_none(self):
         data = flux.job.JobKVSLookup(self.fh).data()
         self.assertEqual(len(data), 0)
 
-    def test_14_job_kvs_lookup_list_invalid(self):
+    def test_16_job_kvs_lookup_list_badid(self):
         ids = [123456789]
         datalookup = flux.job.JobKVSLookup(self.fh, ids)
         data = datalookup.data()
         self.assertEqual(len(data), 0)
         self.assertEqual(len(datalookup.errors), 1)
 
+    def test_17_job_kvs_lookup_list_badkey(self):
+        ids = [self.jobid1]
+        datalookup = flux.job.JobKVSLookup(self.fh, ids, keys=["foo"])
+        data = datalookup.data()
+        self.assertEqual(len(data), 0)
+        self.assertEqual(len(datalookup.errors), 1)
 
 if __name__ == "__main__":
     from subflux import rerun_under_flux

@cmoussa1
Copy link
Member

Looks good!! thanks @chu11

@chu11
Copy link
Member Author

chu11 commented Jun 22, 2023

@cmoussa1 cool thanks, setting MWP

@mergify mergify bot merged commit 49c59b5 into flux-framework:master Jun 22, 2023
@grondo
Copy link
Contributor

grondo commented Jun 22, 2023

Oops, I belatedly realized that the job_kvs_lookup() function may not have an interface to fetch the original jobspec, and instead only returns the redacted jobspec (see flux job info --original JOBID jobspec implementation). Maybe I just missed it though?

@chu11
Copy link
Member Author

chu11 commented Jun 22, 2023

@grondo nope I did not think to do that! I'll write up an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants