Add ONT FCs to bioinfo tab #446
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #446 +/- ##
=========================================
Coverage ? 27.83%
=========================================
Files ? 37
Lines ? 5489
Branches ? 0
=========================================
Hits ? 1528
Misses ? 3961
Partials ? 0
View full report in Codecov by Sentry.
taca/utils/bioinfo_tab.py
Outdated
flowcell_info = (
    couch_connection["nanopore_runs"].view("info/lims")[flowcell_id].rows[0]
)
if flowcell_info.value and "sample_data" in flowcell_info.value["loading"][0]:
I would like to see a check that handles the cases where "loading" is missing from value or is an empty list. Even if it's "always" supposed to be there, it would be a shame if its absence broke the entire script run.
When the db .json has a "lims" nest, there should always be a "loading" subnest, and possibly a "reloading" one. The db .json will not contain the "lims" nest until the LIMS step has been completed, though.
Easy enough to add a check that the key exists here I think.
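For reference, a minimal sketch of what such a check could look like. The helper name `get_sample_data` is hypothetical and not part of taca; it only assumes that the view row's value is either empty or a dict shaped like the one used in the snippet above.

```python
def get_sample_data(value):
    """Return loading[0]["sample_data"], or None if any level is missing.

    Hypothetical helper: `value` stands in for `flowcell_info.value`.
    """
    # The view value may be None/empty before the LIMS step has completed.
    if not isinstance(value, dict):
        return None
    loading = value.get("loading")
    # Guard against a missing key as well as an empty list.
    if not isinstance(loading, list) or not loading:
        return None
    entry = loading[0]
    if not isinstance(entry, dict):
        return None
    return entry.get("sample_data")


print(get_sample_data({"loading": []}))
print(get_sample_data({"loading": [{"sample_data": ["sample_a"]}]}))
```

With something like this, a flowcell without usable LIMS data yields None and can be skipped, instead of raising a KeyError or IndexError that stops the whole run.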
taca/utils/bioinfo_tab.py
Outdated
elif inst_brand == "ont":
    base_name = os.path.basename(os.path.abspath(run_dir))
    # Skip archived, no_backup, nosync and qc folders
    if base_name in ["archived", "no_backup", "nosync", "qc"]:
Wouldn't base_name here be the name of the run, and not the dir in which it resides?
Oh, I see now. We are passing everything in the datadir into this function, including subdirs like nosync and qc. I think it would be cleaner to have some earlier logic that makes sure only folders that look like ONT run dirs are passed into this function. The ONT run regex is available in the repo.
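A rough sketch of that earlier filtering step, for illustration only. The pattern below is a hypothetical stand-in and not the regex from the repo; it assumes run dir names of the form date_time_position_flowcell_hash, and `filter_ont_run_dirs` is an illustrative name.

```python
import re

# Hypothetical stand-in pattern; the actual ONT run regex lives in the repo.
# Assumed shape: <8-digit date>_<4-digit time>_<position>_<flowcell ID>_<hash>
ONT_RUN_PATTERN = re.compile(r"^\d{8}_\d{4}_[0-9A-Za-z]+_[0-9A-Za-z]+_[0-9a-f]+$")


def filter_ont_run_dirs(names):
    """Keep only the directory names that look like ONT run dirs."""
    return [name for name in names if ONT_RUN_PATTERN.match(name)]


candidates = ["nosync", "qc", "archived", "20240101_1200_1A_PAO12345_abcdef12"]
print(filter_ont_run_dirs(candidates))
```

Filtering on the positive pattern means new housekeeping folders in the datadir would not need to be added to a skip list.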
taca/utils/bioinfo_tab.py
Outdated
    couch_connection["nanopore_runs"].view("info/lims")[flowcell_id].rows[0]
)
if flowcell_info.value and "sample_data" in flowcell_info.value["loading"][0]:
    samples = flowcell_info.value["loading"][0]["sample_data"]
I think the index should be [-1], not [0]. If the LIMS script is re-run manually, we should use the latest values.
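To illustrate the difference, a small sketch with made-up data. It assumes that each run of the LIMS script appends a new entry to the end of "loading", so the last element is the most recent; the sample names are invented.

```python
# Made-up example of a "loading" list after one manual re-run of the LIMS script.
# Assumption: entries are appended in chronological order.
loading = [
    {"sample_data": [{"sample_name": "P1_101"}]},  # first run
    {"sample_data": [{"sample_name": "P1_101"}, {"sample_name": "P1_102"}]},  # re-run
]

first = loading[0]["sample_data"]    # values from the first run, possibly stale
latest = loading[-1]["sample_data"]  # values from the most recent run

print(len(first), len(latest))
```

When the list has a single entry, [0] and [-1] refer to the same element, so the change only matters after a re-run.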
I'm not familiar with the potential downstream issues, but this seems like a good start for the integration. Nice work 👍
See comments for requested changes.
Forgot to commit fix in PR: Add ONT FCs to bioinfo tab #446
No description provided.