-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add option to use pre-supplied embeddings within SMACT for the dopant_prediction module #260
Conversation
Warning Review failedThe pull request is closed. WalkthroughThe Changes
Sequence Diagram(s) (Beta)sequenceDiagram
participant User
participant Doper
participant DataLoader
participant SimilarityCalculator
User->>Doper: Initialize with embedding and use_probability
Doper->>DataLoader: Load embedding data
Doper->>SimilarityCalculator: Calculate species similarity/probability
User->>Doper: Call get_dopants
Doper->>Doper: Check use_probability flag
Doper->>SimilarityCalculator: Get similarity/probability scores
Doper->>User: Return dopant predictions
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
@coderabbitai review |
Actions performedReview triggered.
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #260 +/- ##
===========================================
+ Coverage 73.82% 74.45% +0.63%
===========================================
Files 24 24
Lines 2048 2083 +35
===========================================
+ Hits 1512 1551 +39
+ Misses 536 532 -4 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- smact/dopant_prediction/doper.py (5 hunks)
Additional context used
Ruff
smact/dopant_prediction/doper.py
3-3:
typing.Callable
imported but unused
3-3:
typing.Union
imported but unused
10-10:
smact.Element
imported but unused
10-10:
smact.element_dictionary
imported but unused
101-101: Import
groupby
from line 2 shadowed by loop variable
GitHub Check: codecov/patch
smact/dopant_prediction/doper.py
[warning] 48-48: smact/dopant_prediction/doper.py#L48
Added line #L48 was not covered by tests
[warning] 52-52: smact/dopant_prediction/doper.py#L52
Added line #L52 was not covered by tests
[warning] 54-54: smact/dopant_prediction/doper.py#L54
Added line #L54 was not covered by tests
[warning] 58-58: smact/dopant_prediction/doper.py#L58
Added line #L58 was not covered by tests
[warning] 88-89: smact/dopant_prediction/doper.py#L88-L89
Added lines #L88 - L89 were not covered by tests
[warning] 91-91: smact/dopant_prediction/doper.py#L91
Added line #L91 was not covered by tests
[warning] 202-202: smact/dopant_prediction/doper.py#L202
Added line #L202 was not covered by tests
[warning] 216-216: smact/dopant_prediction/doper.py#L216
Added line #L216 was not covered by tests
[warning] 233-233: smact/dopant_prediction/doper.py#L233
Added line #L233 was not covered by tests
[warning] 247-247: smact/dopant_prediction/doper.py#L247
Added line #L247 was not covered by tests
[warning] 344-345: smact/dopant_prediction/doper.py#L344-L345
Added lines #L344 - L345 were not covered by tests
[warning] 347-347: smact/dopant_prediction/doper.py#L347
Added line #L347 was not covered by tests
Additional comments not posted (2)
smact/dopant_prediction/doper.py (2)
47-50
: Ensure proper error handling for mutually exclusive parameters.Tools
GitHub Check: codecov/patch
[warning] 48-48: smact/dopant_prediction/doper.py#L48
Added line #L48 was not covered by tests
344-347
: Ensure that the headers into_table
method are correctly set based on theuse_probability
flag.Tools
GitHub Check: codecov/patch
[warning] 344-345: smact/dopant_prediction/doper.py#L344-L345
Added lines #L344 - L345 were not covered by tests
[warning] 347-347: smact/dopant_prediction/doper.py#L347
Added line #L347 was not covered by tests
smact/dopant_prediction/doper.py
Outdated
|
||
import numpy as np | ||
from pymatgen.util import plotting | ||
from tabulate import tabulate | ||
|
||
import smact | ||
from smact import Element, element_dictionary | ||
from smact import Element, data_directory, element_dictionary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused imports to maintain clarity and reduce overhead.
- from smact import Element, element_dictionary
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
from smact import Element, data_directory, element_dictionary | |
from smact import data_directory |
Tools
Ruff
10-10:
smact.Element
imported but unused
10-10:
smact.element_dictionary
imported but unused
smact/dopant_prediction/doper.py
Outdated
from itertools import groupby | ||
from typing import Callable, List, Tuple, Type, Union | ||
from typing import Callable, List, Optional, Tuple, Type, Union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused imports to clean up the code.
- from typing import Callable, Union
Committable suggestion was skipped due low confidence.
Tools
Ruff
3-3:
typing.Callable
imported but unused
3-3:
typing.Union
imported but unused
smact/dopant_prediction/doper.py
Outdated
if self.use_probability: | ||
sum_prob += CM.sub_prob(cation, selected_site) | ||
else: | ||
sum_prob += CM.get_lambda(cation, selected_site) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor the repeated logic for probability and lambda calculations into a helper function to improve code maintainability.
+ def calculate_value(self, CM, specie, selected_site):
+ if self.use_probability:
+ return CM.sub_prob(specie, selected_site)
+ else:
+ return CM.get_lambda(specie, selected_site)
- if self.use_probability:
- sum_prob += CM.sub_prob(cation, selected_site)
- else:
- sum_prob += CM.get_lambda(cation, selected_site)
+ sum_prob += self.calculate_value(CM, cation, selected_site)
Also applies to: 197-204, 211-218, 228-235, 242-249
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
if self.use_probability: | |
sum_prob += CM.sub_prob(cation, selected_site) | |
else: | |
sum_prob += CM.get_lambda(cation, selected_site) | |
def calculate_value(self, CM, specie, selected_site): | |
if self.use_probability: | |
return CM.sub_prob(specie, selected_site) | |
else: | |
return CM.get_lambda(specie, selected_site) | |
sum_prob += self.calculate_value(CM, cation, selected_site) |
Tools
GitHub Check: codecov/patch
[warning] 88-89: smact/dopant_prediction/doper.py#L88-L89
Added lines #L88 - L89 were not covered by tests
[warning] 91-91: smact/dopant_prediction/doper.py#L91
Added line #L91 was not covered by tests
SKIPSSPECIES_COSINE_SIM_PATH = os.path.join( | ||
data_directory, | ||
"species_rep/skipspecies_20221028_319ion_dim200_cosine_similarity.json", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
The specified JSON file exists but is located at a different path than expected. Update the path in your code to reflect the correct location.
- Correct path:
smact/data/species_rep/skipspecies_20221028_319ion_dim200_cosine_similarity.json
Analysis chain
Ensure the path to the cosine similarity JSON is correct and accessible.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the existence of the cosine similarity JSON file.
# Test: Check if the file exists. Expect: File exists.
if [ -f "src/smact/data_directory/species_rep/skipspecies_20221028_319ion_dim200_cosine_similarity.json" ]; then
echo "File exists."
else
echo "File does not exist."
fi
Length of output: 158
Script:
#!/bin/bash
# Description: Search for the cosine similarity JSON file within the repository.
# Test: Use fd to locate the file. Expect: File path is found.
fd skipspecies_20221028_319ion_dim200_cosine_similarity.json
Length of output: 143
"Only one of filepath or embedding should be provided" | ||
) | ||
if embedding and embedding != "skipspecies": | ||
raise ValueError(f"Embedding {embedding} is not supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
The search results indicate that there are no specific tests for embedding validation logic in the current test suite. The existing tests do not cover the scenario where an unsupported embedding raises a ValueError
.
- smact/tests/test_doper.py: Contains a test for
dopant_prediction
, but it does not seem to cover embedding validation. - Other test files and functions do not appear to be related to embedding validation.
To ensure comprehensive test coverage, please add a test case that specifically validates the embedding logic and raises a ValueError
for unsupported embeddings.
Analysis chain
Add test coverage for the embedding validation logic.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the test coverage for embedding validation logic.
# Test: Search for tests related to embedding validation. Expect: Tests exist.
rg --type python $'def test_embedding_validation'
Length of output: 81
Script:
#!/bin/bash
# Description: Verify the test coverage for embedding validation logic.
# Correct the rg command to search for test functions in Python files.
rg 'def test_' --glob '*.py'
# Additionally, search for test files that might contain relevant tests.
fd --type f --name '*test*.py'
Length of output: 3340
Script:
#!/bin/bash
# Description: Verify the test coverage for embedding validation logic.
# Correct the fd command to search for test files.
fd --type f test
# Manually inspect the test functions found by rg to see if any might cover embedding validation.
rg 'def test_' --glob '*.py'
Length of output: 3390
Tools
GitHub Check: codecov/patch
[warning] 52-52: smact/dopant_prediction/doper.py#L52
Added line #L52 was not covered by tests
smact/dopant_prediction/doper.py
Outdated
self.cation_mutator = CationMutator.from_json( | ||
SKIPSSPECIES_COSINE_SIM_PATH | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
The from_json
method in the CationMutator
class was not found, and no error handling was detected in the from_json
method located in smact/structure_prediction/mutation.py
. This indicates that the CationMutator.from_json
method might not handle JSON path errors properly.
- Ensure that the
CationMutator.from_json
method exists and is correctly implemented. - Verify that the method includes proper error handling for the provided JSON path.
Analysis chain
Ensure that the CationMutator.from_json
method correctly handles the provided JSON path.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the correct handling of JSON path in `CationMutator.from_json`.
# Test: Search for error handling in `CationMutator.from_json`. Expect: Proper error handling.
rg --type python $'CationMutator.from_json' | rg 'try'
Length of output: 82
Script:
#!/bin/bash
# Description: Verify the correct handling of JSON path in `CationMutator.from_json`.
# Test: Search for the `from_json` method in the `CationMutator` class and check for error handling.
ast-grep --lang python --pattern $'class CationMutator {
$$$
def from_json($_, $_) {
$$$
}
$$$
}' | rg 'try'
Length of output: 466
Script:
#!/bin/bash
# Description: Verify the correct handling of JSON path in `CationMutator.from_json`.
# Test: Search for the `from_json` method in the `CationMutator` class and check for error handling.
ast-grep --lang python --pattern $'class CationMutator:
def from_json($_, $_):
$$$' | rg 'try'
Length of output: 99
Script:
#!/bin/bash
# Description: Locate the `from_json` method in the `CationMutator` class and inspect its implementation.
# Step 1: Locate the `from_json` method in the `CationMutator` class.
ast-grep --lang python --pattern $'class CationMutator:
def from_json($_, $_):
$$$'
# Step 2: Inspect the implementation of the `from_json` method for error handling.
ast-grep --lang python --pattern $'def from_json($_, $_):
$$$' | rg 'try'
Length of output: 279
Script:
#!/bin/bash
# Description: Locate the `from_json` method across the entire codebase and inspect its implementation.
# Step 1: Locate the `from_json` method across the entire codebase.
rg --type py 'def from_json'
# Step 2: Inspect the implementation of the `from_json` method for error handling.
rg --type py 'def from_json' -A 20 | rg 'try'
Length of output: 244
Tools
GitHub Check: codecov/patch
[warning] 54-54: smact/dopant_prediction/doper.py#L54
Added line #L54 was not covered by tests
@coderabbitai review |
Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range and nitpick comments (1)
smact/dopant_prediction/doper.py (1)
Line range hint
100-100
: Avoid shadowing the importgroupby
with a loop variable.- for groupby in zip(keys, dopants_list, groupby_list): + for group in zip(keys, dopants_list, groupby_list):Tools
GitHub Check: codecov/patch
[warning] 58-58: smact/dopant_prediction/doper.py#L58
Added line #L58 was not covered by tests
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- docs/conf.py (1 hunks)
- smact/dopant_prediction/doper.py (7 hunks)
- smact/tests/test_doper.py (2 hunks)
Files not reviewed due to errors (1)
- smact/tests/test_doper.py (no review received)
Files skipped from review due to trivial changes (1)
- docs/conf.py
Additional context used
Ruff
smact/tests/test_doper.py
4-4:
smact
imported but unused
6-6:
smact.structure_prediction.mutation
imported but unused
14-14: Local variable
num_dopants
is assigned to but never used
59-59: Local variable
test
is assigned to but never usedsmact/dopant_prediction/doper.py
100-100: Import
groupby
from line 2 shadowed by loop variable
GitHub Check: codecov/patch
smact/tests/test_doper.py
[warning] 74-74: smact/tests/test_doper.py#L74
Added line #L74 was not covered by testssmact/dopant_prediction/doper.py
[warning] 58-58: smact/dopant_prediction/doper.py#L58
Added line #L58 was not covered by tests
[warning] 88-88: smact/dopant_prediction/doper.py#L88
Added line #L88 was not covered by tests
[warning] 159-159: smact/dopant_prediction/doper.py#L159
Added line #L159 was not covered by tests
[warning] 352-353: smact/dopant_prediction/doper.py#L352-L353
Added lines #L352 - L353 were not covered by tests
[warning] 355-355: smact/dopant_prediction/doper.py#L355
Added line #L355 was not covered by tests
Additional comments not posted (1)
smact/dopant_prediction/doper.py (1)
1-1
: Remove unused import.- import smact
Likely invalid or redundant comment.
|
||
import numpy as np | ||
from pymatgen.util import plotting | ||
from tabulate import tabulate | ||
|
||
import smact | ||
from smact import Element, element_dictionary | ||
from smact import data_directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused imports.
- from smact import data_directory
- from smact.structure_prediction import mutation, utilities
Committable suggestion was skipped due low confidence.
SKIPSSPECIES_COSINE_SIM_PATH | ||
) | ||
elif filepath: | ||
self.cation_mutator = mutation.CationMutator.from_json(filepath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
The specific lines in doper.py
are not covered by tests.
self.cation_mutator = mutation.CationMutator.from_json(filepath)
sum_prob += self._calculate_species_sim_prob(cation, selected_site)
return self.cation_mutator.get_lambda(species1, species2)
headers = ["Rank", "Dopant", "Host", "Probability", "Selectivity"]
headers = ["Rank", "Dopant", "Host", "Similarity", "Selectivity"]
These lines were not found in the test files smact/tests/test_core.py
, smact/tests/test_doper.py
, and smact/tests/test_structure.py
.
Analysis chain
Ensure these lines are covered by tests.
Also applies to: 88-88, 159-159, 352-353, 355-355
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test coverage for specific lines in doper.py.
# Test: Check if lines 58, 88, 159, 352-353, 355 are covered by tests. Expect: Lines are covered.
rg --type python $'self.cation_mutator = mutation.CationMutator.from_json(filepath)' --files-with-matches
rg --type python $'sum_prob += self._calculate_species_sim_prob(cation, selected_site)' --files-with-matches
rg --type python $'return self.cation_mutator.get_lambda(species1, species2)' --files-with-matches
rg --type python $'headers = ["Rank", "Dopant", "Host", "Probability", "Selectivity"]' --files-with-matches
rg --type python $'headers = ["Rank", "Dopant", "Host", "Similarity", "Selectivity"]' --files-with-matches
Length of output: 688
Script:
#!/bin/bash
# Description: Verify test coverage for specific lines in doper.py.
# Test: Check if lines 58, 88, 159, 352-353, 355 are covered by tests. Expect: Lines are covered.
rg --glob '*.py' 'self.cation_mutator = mutation.CationMutator.from_json(filepath)'
rg --glob '*.py' 'sum_prob += self._calculate_species_sim_prob(cation, selected_site)'
rg --glob '*.py' 'return self.cation_mutator.get_lambda(species1, species2)'
rg --glob '*.py' 'headers = ["Rank", "Dopant", "Host", "Probability", "Selectivity"]'
rg --glob '*.py' 'headers = ["Rank", "Dopant", "Host", "Similarity", "Selectivity"]'
Length of output: 428
Script:
#!/bin/bash
# Description: Verify test coverage for specific lines in doper.py.
# Find all test files
test_files=$(fd --glob '*test*.py')
# Check if the specific lines are covered in the test files
for test_file in $test_files; do
echo "Checking $test_file"
rg 'self.cation_mutator = mutation.CationMutator.from_json(filepath)' $test_file
rg 'sum_prob += self._calculate_species_sim_prob(cation, selected_site)' $test_file
rg 'return self.cation_mutator.get_lambda(species1, species2)' $test_file
rg 'headers = ["Rank", "Dopant", "Host", "Probability", "Selectivity"]' $test_file
rg 'headers = ["Rank", "Dopant", "Host", "Similarity", "Selectivity"]' $test_file
done
Length of output: 1938
Tools
GitHub Check: codecov/patch
[warning] 58-58: smact/dopant_prediction/doper.py#L58
Added line #L58 was not covered by tests
@coderabbitai review |
Actions performedReview triggered.
|
Description
Doper
indoper.py
to accept using pre-loaded similaritiesType of change
How Has This Been Tested?
Test Configuration:
Reviewers
Checklist:
Summary by CodeRabbit
New Features
Doper
class to support an additionalembedding
parameter and ause_probability
flag for more flexible dopant predictions.Documentation
"tabulate"
module to the list of imported modules in the documentation configuration.Tests
DopantPredictionTest
and added new test methods for dopant prediction and number formatting.