Add quality control property for batch processing over genotypes #76
Walkthrough
The latest update introduces a
Changes
Actionable comments posted: 1
Actionable comments outside the diff hunks (3)
tests/test_series.py (2)
Line range hint 2-2: Remove the unused import of numpy.
- import numpy as np
Line range hint 79-79: Rename the second test_series_name function to avoid redefinition and clarify its purpose.
- def test_series_name(series_instance):
+ def test_series_name_from_instance(series_instance):
sleap_roots/series.py (1)
Line range hint 359-359: Replace the type comparison with isinstance(), which also accepts subclasses of list.
- if type(data_folders) != list:
+ if not isinstance(data_folders, list):
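To illustrate the isinstance() hint above, here is a minimal sketch of how the two checks differ. The FolderList class and both helper functions are hypothetical names made up for this example; they are not part of sleap_roots.

```python
class FolderList(list):
    """Hypothetical list subclass, used only for this illustration."""


def is_list_by_type(value) -> bool:
    # Exact type comparison: rejects subclasses of list.
    return type(value) == list


def is_list_by_isinstance(value) -> bool:
    # isinstance() accepts list and any subclass of list.
    return isinstance(value, list)


folders = FolderList(["plate_1", "plate_2"])
print(is_list_by_type(folders))          # False
print(is_list_by_isinstance(folders))    # True
print(is_list_by_isinstance("plate_1"))  # False
```

With the type() comparison, a perfectly usable list subclass would be treated as "not a list", which is why the hint prefers isinstance().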
sleap_roots/series.py
Outdated
@property
def qc_fail(self) -> bool:
    """Flag to indicate if the series failed QC from the CSV."""
    if not self.csv_path or not Path(self.csv_path).exists():
        print("CSV path is not set or the file does not exist.")
        return np.nan
    df = pd.read_csv(self.csv_path)
    try:
        # Match the series_name (or plant_qr_code in the CSV) to fetch the QC flag
        return df[df["plant_qr_code"] == self.series_name]["qc_cylinder"].iloc[0]
    except IndexError:
        print(f"No QC flag found for series {self.series_name} in CSV.")
        return np.nan
Change the return value of qc_fail to False when the CSV path is not set or the QC flag is not found, to maintain consistency with the expected boolean return type.
- return np.nan
+ return False
- return np.nan
+ return False
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
@property
def qc_fail(self) -> bool:
    """Flag to indicate if the series failed QC from the CSV."""
    if not self.csv_path or not Path(self.csv_path).exists():
        print("CSV path is not set or the file does not exist.")
        return False
    df = pd.read_csv(self.csv_path)
    try:
        # Match the series_name (or plant_qr_code in the CSV) to fetch the QC flag
        return df[df["plant_qr_code"] == self.series_name]["qc_cylinder"].iloc[0]
    except IndexError:
        print(f"No QC flag found for series {self.series_name} in CSV.")
        return False
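One way to see why np.nan is an awkward fallback for a property annotated as bool: NaN is a float, and it is truthy. The sketch below uses float("nan"), the same kind of value as np.nan, so it runs without NumPy. The keep_series and normalize_qc_flag helpers are hypothetical and only illustrate how a caller might be affected; they are not taken from the PR.

```python
import math

NAN = float("nan")  # np.nan is a plain Python float holding this same NaN value


def keep_series(qc_fail) -> bool:
    """Hypothetical caller: keep a series unless its QC flag is truthy."""
    return not qc_fail


def normalize_qc_flag(value) -> bool:
    """Map a missing flag (None or NaN) to False, anything else to bool(value)."""
    if value is None:
        return False
    if isinstance(value, float) and math.isnan(value):
        return False
    return bool(value)


print(isinstance(NAN, bool))                # False: NaN does not match `-> bool`
print(bool(NAN))                            # True: NaN is truthy
print(keep_series(NAN))                     # False: series with no QC record is dropped
print(keep_series(normalize_qc_flag(NAN)))  # True: missing flag treated as "not failed"
```

Whether a missing QC record should count as "not failed" is a design decision for the maintainers. The sketch only shows that an unnormalized NaN silently behaves like a failure in a truthiness check.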
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #76      +/-   ##
==========================================
- Coverage   74.92%   74.58%   -0.34%
==========================================
  Files          13       13
  Lines        1312     1326      +14
==========================================
+ Hits          983      989       +6
- Misses        329      337       +8
qc_fail property to Series
qc_fail flag to remove unwanted samples
Summary by CodeRabbit
New Features
qc_fail property in the Series class to indicate quality control failures during trait computations.
Bug Fixes
Tests
qc_fail property with tests to accurately identify series failing quality control.
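As a rough sketch of what tests for such a property could cover, the example below checks the three cases discussed in the review: a matching row, a missing row, and a missing CSV. FakeSeries is a hypothetical stand-in, not the real Series class. It reads the file with the standard-library csv module rather than pandas to stay self-contained, and it assumes the qc_cylinder flag is stored as "0" or "1"; the column names plant_qr_code and qc_cylinder come from the reviewed code.

```python
import csv
import tempfile
from pathlib import Path


class FakeSeries:
    """Hypothetical stand-in for Series; only what the checks below need."""

    def __init__(self, series_name, csv_path=None):
        self.series_name = series_name
        self.csv_path = csv_path

    @property
    def qc_fail(self) -> bool:
        if not self.csv_path or not Path(self.csv_path).exists():
            return False
        with open(self.csv_path, newline="") as handle:
            for row in csv.DictReader(handle):
                if row["plant_qr_code"] == self.series_name:
                    # Assumption: the flag is stored as "0" or "1".
                    return row["qc_cylinder"].strip() == "1"
        return False


def write_qc_csv(folder: Path) -> Path:
    """Write a small QC table with one failing and one passing plant."""
    path = folder / "qc.csv"
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["plant_qr_code", "qc_cylinder"])
        writer.writerow(["plant_A", "1"])
        writer.writerow(["plant_B", "0"])
    return path


def test_qc_fail_cases():
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = write_qc_csv(Path(tmp))
        assert FakeSeries("plant_A", csv_path).qc_fail is True
        assert FakeSeries("plant_B", csv_path).qc_fail is False
        assert FakeSeries("plant_C", csv_path).qc_fail is False  # no matching row
    assert FakeSeries("plant_A", None).qc_fail is False  # no CSV path set


test_qc_fail_cases()
```

Asserting with `is True` and `is False` rather than plain truthiness would also catch a NaN return value, since NaN is truthy but is not the bool True.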