-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement API endpoint that returns TSV report about submissions that are pending review #1443
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cristina-stonepedraza, I think this looks great overall! the code looks clean to me and I had no issues following it.
I left some feedback about path, function, and variable naming, as well as one about the endpoint description. In case you have any questions about these, you can message me here/Slack/etc.
There is one more thing I want to see in this PR:
- An automated test
I'll send you information about that on Slack. I posted information about that below.
r""" | ||
Generate a report of NMDC submissions that are submitted pending review to allow review | ||
of submission triads and evaluate for approval | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I revised the endpoint description after looking at the code. Here's what I came up with:
r""" | |
Generate a report of NMDC submissions that are submitted pending review to allow review | |
of submission triads and evaluate for approval | |
""" | |
r""" | |
Generate a TSV-formatted report of biosamples belonging to submissions | |
that have a status of "Submitted- Pending Review". | |
The report indicates which environmental package/extension, broad scale, | |
local scale, and medium are specified for each biosample. The report is | |
designed to facilitate the review of submissions by NMDC team members. | |
""" |
@@ -637,6 +637,97 @@ async def download_zip_file( | |||
) | |||
|
|||
|
|||
@router.get( | |||
"/metadata_submission/mixs", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend updating the URL path so that the final part describes what is being GET
-ed (i.e. gotten/retrieved via the endpoint). In this case, I think of it as a "report" (I'm not too familiar with MIxS and how it relates to what the report contains; my guess is that it is the name of an ontology that the values in the "environment..." fields come from).
"/metadata_submission/mixs", | |
"/metadata_submission/mixs_report", |
|
||
# Get sample names from each sample type | ||
for sample_type in sample_data: | ||
samples = sample_data[sample_type] if sample_type in sample_data else {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming samples
is typically a list (as opposed to a dictionary), I'd recommend defaulting to an empty list (instead of an empty dictionary) here.
samples = sample_data[sample_type] if sample_type in sample_data else {} | |
samples = sample_data[sample_type] if sample_type in sample_data else [] |
for x in samples: | ||
|
||
# Get the sample name | ||
name = x["samp_name"] if "samp_name" in x else {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming name
normally contains a string, I'd recommend defaulting to an empty string here (instead of an empty dictionary).
name = x["samp_name"] if "samp_name" in x else {} | |
name = x["samp_name"] if "samp_name" in x else "" |
I have the same (analogous) thought about broad
, local
, and medium
below.
# Get the env local scale | ||
local = x["env_local_scale"] if "env_local_scale" in x else {} | ||
local = str(local) | ||
local = local.replace("\t", "").replace("\r", "").replace("\n", "").lstrip("_") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer the variable not be named local
—for two reasons:
- The word "local" can have a special meaning in programming (i.e. local versus global) and I think future readers (of this code or stack traces involving this code) will find it confusing (at first glance).
- I think "local" is an adjective describing "scale". Based on that, I think the value this variable will contain is some kind of "scale".
With those things in mind, I'd recommend renaming this variable to local_scale
(or even env_local_scale
).
I have the same thought (analogous) about broad
.
@@ -686,6 +686,19 @@ def get_query_for_all_submissions(db: Session): | |||
return all_submissions | |||
|
|||
|
|||
def get_query_for_submitted_pending_review(db: Session): | |||
r""" | |||
Returns a SQLAlchemy query that can be used to retrieve submissions pending review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Thanks for including a docstring!
@@ -686,6 +686,19 @@ def get_query_for_all_submissions(db: Session): | |||
return all_submissions | |||
|
|||
|
|||
def get_query_for_submitted_pending_review(db: Session): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer this be named...
- def get_query_for_submitted_pending_review(db: Session):
+ def get_query_for_submitted_pending_review_submissions(db: Session):
...as a way of indicating that the function can be used to get submissions.
Hi @cristina-stonepedraza, here's an example of an automated test that targets an endpoint like this one: nmdc-server/tests/test_submission.py Lines 41 to 125 in 36dbba5
There are actually two tests there because the endpoint in question is only accessible to admins. One of the tests focuses on that aspect (i.e. security). By the way, I do wonder whether we will restrict access to this reporting endpoint also. You/we can discuss that with @mslarae13. |
Add a script to api.py and a query to crud.py to pull information from the submission portal database, and then generate a report of NMDC submissions that have been submitted as a TSV.
This PR references issue 2047 in nmdc-schema: microbiomedata/nmdc-schema#2047