Merge pull request #38 from timbernat/feature-cleanup
Merge branch "feature cleanup"
timbernat authored Jan 17, 2025
2 parents a097fb1 + 2961906 commit 11f0057
Showing 10 changed files with 748 additions and 86 deletions.
16 changes: 8 additions & 8 deletions README.md
@@ -38,7 +38,7 @@ Once you have a package manager installed, you may proceed with one of the provi
A fully-featured install in a safe virtual environment (named "polymerist-env", here) can be obtained by running the following terminal commands:

#### Mamba install (basic)
```sh
```bash
mamba create -n polymerist-env python=3.11
mamba activate polymerist-env
pip install polymerist
@@ -47,7 +47,7 @@ mamba install -c conda-forge openff-toolkit mbuild openbabel

#### Mamba install (extended)
An extended install with [Jupyter Notebook](https://jupyter.org/) support, molecular visualization capability, and chemical data querying capability can be obtained very similarly:
```sh
```bash
mamba create -n polymerist-env python=3.11
mamba activate polymerist-env
pip install polymerist[interactive,chemdb]
@@ -56,7 +56,7 @@ mamba install -c conda-forge openff-toolkit mbuild openbabel

#### Conda install (not recommended)
Equivalent commands using `conda` (in case `mamba` has not been installed or the user is too stubborn to use it) are given below. These will perform the same installation, just much more slowly:
```sh
```bash
conda create -n polymerist-env python=3.11
conda activate polymerist-env
pip install polymerist[interactive,chemdb]
@@ -67,7 +67,7 @@ In either case, the final [openff-toolkit](https://github.com/openforcefield/ope

#### 1.1) Testing installation
To see if the installation was successful, one can run the following short set of commands which should yield the outputs shown:
```sh
```python
mamba activate polymerist-env; python
>>> import polymerist as ps
>>> print(ps.pascal(5))
@@ -82,7 +82,7 @@ mamba activate polymerist-env; python
Assigning atomic partial charges using some flavor of [AM1-BCC](https://docs.eyesopen.com/toolkits/python/quacpactk/molchargetheory.html#am1bcc-charges) with `polymerist` also requires installation of some supplementary toolkits.

One can mix-and-match installing any combination of the toolkits below to taste or (if impatient or indifferent) opt for a "shotgun" approach and install all 3 with the following commands:
```sh
```bash
mamba activate polymerist-env
mamba install -c openeye openeye-toolkits
mamba install -c conda-forge espaloma_charge openff-nagl
@@ -102,7 +102,7 @@ This is an OpenFF-specific GNN based on similar architecture to Espaloma with a
Polymerist can also be installed directly from the source code in this repository. To install, execute the following set of terminal commands in whichever directory you'd like the installation to live on your local machine:

#### Mamba install (source)
```sh
```bash
git clone https://github.com/timbernat/polymerist
cd polymerist
mamba env create -n polymerist-env -f devtools/conda-envs/release-build.yml
@@ -111,7 +111,7 @@ pip install .
```

#### Conda install (source, not recommended)
```sh
```bash
git clone https://github.com/timbernat/polymerist
cd polymerist
conda env create -n polymerist-env -f devtools/conda-envs/release-build.yml
@@ -122,7 +122,7 @@ Once the source install is complete, you no longer need the clone of the polymer

### Developer installation (for advanced users only)
Those developing for `polymerist` may like to have an editable local installation, in which they can make changes to the source code and test behavior changes in real time. In this case, one requires an "editable build" which mirrors the source files that live in the `site-packages` directory of the created environment. This type of installation proceeds as follows:
```sh
```bash
git clone https://github.com/timbernat/polymerist
cd polymerist
mamba env create -n polymerist-dev -f devtools/conda-envs/dev-build.yml
23 changes: 19 additions & 4 deletions polymerist/genutils/importutils/dependencies.py
@@ -3,7 +3,7 @@
__author__ = 'Timotej Bernat'
__email__ = '[email protected]'

from typing import Callable, Optional, ParamSpec, TypeVar
from typing import Callable, Optional, ParamSpec, TypeVar, Union

Params = ParamSpec('Params')
ReturnType = TypeVar('ReturnType')
@@ -58,7 +58,7 @@ def module_installed(module_name : str) -> bool:

try: # NOTE: opted for this implementation, as it never actually imports the package in question (faster and fewer side-effects)
return find_spec(module_name) is not None
except (ValueError, AttributeError, ModuleNotFoundError): # these could all be raised by
except (ValueError, AttributeError, ModuleNotFoundError): # these could all be raised by a missing module
return False

def modules_installed(*module_names : list[str]) -> bool:
@@ -80,7 +80,7 @@ def modules_installed(*module_names : list[str]) -> bool:

def requires_modules(
*required_module_names : list[str],
missing_module_error : type[Exception]=ImportError,
missing_module_error : Union[Exception, type[Exception]]=ImportError,
) -> Callable[[TCall[..., ReturnType]], TCall[..., ReturnType]]:
'''
Decorator which enforces optional module dependencies prior to function execution
@@ -99,12 +99,27 @@ def requires_modules(
Raised if any of the specified packages is not found to be installed
Exception message will indicate the name of the specific package found missing
'''
# meta-check to ensure type of raised Exception is valid
if not isinstance(missing_module_error, Exception):
if not (isinstance(missing_module_error, type) and issubclass(missing_module_error, Exception)):
# DEV: this is potentially brittle, depending on how the specific Exception subtype is implemented?
raise TypeError('Must pass either Exception instance or subtype to "missing_module_error"')

def tailored_exception(module_name : str) -> Exception:
'''Accessory function to generate targeted Exceptions based on the provided
missing_module_error value and the name of a module with no found installation'''
if isinstance(missing_module_error, Exception):
return missing_module_error

if isinstance(missing_module_error, type):
return missing_module_error(f'No installation found for module "{module_name}"')

def decorator(func) -> TCall[..., ReturnType]:
@wraps(func)
def req_wrapper(*args : Params.args, **kwargs : Params.kwargs) -> ReturnType:
for module_name in required_module_names:
if not module_installed(module_name):
raise missing_module_error(f'No installation found for module "{module_name}"')
raise tailored_exception(module_name)
else:
return func(*args, **kwargs)

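The hunks above change `requires_modules` so that `missing_module_error` may be either an Exception subtype or a ready-made Exception instance. The following is a minimal, self-contained sketch of that behavior reconstructed from the diff, not the repository's code verbatim: the type annotations are simplified, and the module name `hypothetical_missing_module_xyz` and the demo functions are assumptions made purely for illustration.

```python
from functools import wraps
from importlib.util import find_spec
from typing import Callable, Union


def module_installed(module_name: str) -> bool:
    '''Check for a module without importing it'''
    try:
        return find_spec(module_name) is not None
    except (ValueError, AttributeError, ModuleNotFoundError):
        return False


def requires_modules(
    *required_module_names: str,
    missing_module_error: Union[Exception, type] = ImportError,
) -> Callable:
    '''Sketch of a decorator which enforces optional module dependencies'''
    is_instance = isinstance(missing_module_error, Exception)
    is_subtype = isinstance(missing_module_error, type) and issubclass(missing_module_error, Exception)
    if not (is_instance or is_subtype):
        raise TypeError('Must pass either Exception instance or subtype to "missing_module_error"')

    def tailored_exception(module_name: str) -> Exception:
        if is_instance:
            return missing_module_error  # instances are raised unchanged
        return missing_module_error(f'No installation found for module "{module_name}"')

    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def req_wrapper(*args, **kwargs):
            for module_name in required_module_names:
                if not module_installed(module_name):
                    raise tailored_exception(module_name)
            return func(*args, **kwargs)
        return req_wrapper
    return decorator


def describe_outcome(func: Callable) -> str:
    '''Helper for the demo only: report the return value or the raised error'''
    try:
        return str(func())
    except Exception as exc:
        return f'{type(exc).__name__}: {exc}'


@requires_modules('json')  # standard-library module, always present
def uses_json() -> str:
    return 'ok'

@requires_modules('hypothetical_missing_module_xyz')  # assumed not to exist
def uses_missing() -> str:
    return 'unreachable'

@requires_modules('hypothetical_missing_module_xyz', missing_module_error=RuntimeError('custom message'))
def uses_missing_custom() -> str:
    return 'unreachable'

results = [describe_outcome(f) for f in (uses_json, uses_missing, uses_missing_custom)]
```

Passing a subtype yields a message naming the missing module, while passing an instance raises that exact object, so any custom message replaces the generated one.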
46 changes: 46 additions & 0 deletions polymerist/genutils/textual/prettyprint.py
@@ -4,8 +4,54 @@
__email__ = '[email protected]'

from typing import Any

from textwrap import indent
from enum import StrEnum


class Justification(StrEnum):
'''For specifying string justification'''
LEFT = '<'
CENTER = '^'
RIGHT = '>'
Just = Justification # alias for the lazy or hurried

def procrustean_string(
string : str,
length : int,
padding : str=' ',
just : Justification=Justification.LEFT,
) -> str:
'''Takes a string and a target length and returns a new string which begins
with the same characters as the original string but is clamped to the target length,
truncating or padding if the original string is too long or short, respectively
Parameters
----------
string : str
The string to stretch or cut
length : int
The target number of characters in the final string
padding : str, default=" "
A single character which should be used as padding
when strings are too short, by default just a space
MUST BE EXACTLY ONE CHARACTER!
just : Justification, default=Justification.LEFT
Enum specifier of how to justify a padded string
Options are Justification.LEFT, Justification.CENTER, or Justification.RIGHT
Returns
-------
fmt_str : str
A string which begins with the same characters as "string" but has
precisely the specified length, padded and justified as specified
'''
if not (isinstance(length, int) and (length >= 0)):
raise ValueError(f'Target string length must be a non-negative integer (not {length})')
if not len(padding) == 1:
raise IndexError(f'Padding string must contain exactly one character (passed "{padding}")')

return f'{string[:length]:{padding}{just.value}{length}}'

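The final line of `procrustean_string` relies on a nested format specification: the padding character becomes the fill, the enum value becomes the alignment, and the target length becomes the width. Below is a small stand-alone sketch of that technique. It restates the function from the diff with two assumptions: a plain `Enum` is used in place of `StrEnum` so the example also runs on Python versions older than 3.11, and the sample strings are made up for illustration.

```python
from enum import Enum


class Justification(Enum):
    '''For specifying string justification (plain Enum here; the diff uses StrEnum)'''
    LEFT = '<'
    CENTER = '^'
    RIGHT = '>'


def procrustean_string(
    string: str,
    length: int,
    padding: str = ' ',
    just: Justification = Justification.LEFT,
) -> str:
    '''Clamp a string to exactly "length" characters by truncating or padding'''
    if not (isinstance(length, int) and (length >= 0)):
        raise ValueError(f'Target string length must be a non-negative integer (not {length})')
    if len(padding) != 1:
        raise IndexError(f'Padding string must contain exactly one character (passed "{padding}")')

    # slice first (truncation), then let the format spec "<fill><align><width>" do the padding
    return f'{string[:length]:{padding}{just.value}{length}}'


truncated = procrustean_string('abcdefgh', 4)                                     # too long: cut to 4 characters
right_padded = procrustean_string('abc', 6, padding='.', just=Justification.RIGHT)
centered = procrustean_string('ab', 6, padding='*', just=Justification.CENTER)
```

Slicing before formatting is what makes the result exact: the format width alone only guarantees a minimum length, never a maximum.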
def dict_to_indented_str(dict_to_stringify : dict[Any, Any], level_delimiter : str='\t', line_sep : str='\n') -> str:
'''Generate a pretty-printable string from a (possibly nested) dictionary,
4 changes: 2 additions & 2 deletions polymerist/polymers/monomers/repr.py
@@ -50,10 +50,10 @@ def _add_monomer(self, resname : str, smarts : Smarts) -> None:
if resname in self.monomers:
existing_resgroup = self.monomers[resname]
if isinstance(existing_resgroup, list) and (smarts not in existing_resgroup):
LOGGER.info(f'Extending existing residue category "{resname}" with SMARTS {smarts}')
LOGGER.debug(f'Extending existing residue category "{resname}" with SMARTS {smarts}')
self.monomers[resname].append(smarts)
else:
LOGGER.info(f'Creating new residue category "{resname}", containing singular SMARTS ["{smarts}"])')
LOGGER.debug(f'Creating new residue category "{resname}", containing singular SMARTS ["{smarts}"])')
self.monomers[resname] = [smarts]

def _add_monomers(self, resname : str, smarts_container : Iterable[Smarts]) -> None:
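The practical effect of demoting these messages from `info` to `debug` is that they disappear from output whenever the logger's effective level is `INFO` or higher. The sketch below illustrates this with the standard `logging` module only; the logger name `polymerist_demo`, the in-memory stream, and the message text are assumptions for illustration and say nothing about how the package configures its own `LOGGER`.

```python
import io
import logging

# capture log output in memory so the effect of each level can be inspected
stream = io.StringIO()
logger = logging.getLogger('polymerist_demo')  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False  # keep the demo isolated from the root logger

# at INFO level, debug-level records are filtered out before reaching the handler
logger.setLevel(logging.INFO)
logger.debug('Extending existing residue category "demo"')
logger.info('info-level message')
captured_at_info = stream.getvalue()

# lowering the threshold to DEBUG makes the same kind of message visible again
logger.setLevel(logging.DEBUG)
logger.debug('Creating new residue category "demo"')
captured_at_debug = stream.getvalue()
```

Users who still want the per-monomer messages can opt in by lowering the level of the relevant logger, rather than seeing them by default.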
32 changes: 24 additions & 8 deletions polymerist/polymers/monomers/specification.py
@@ -19,19 +19,28 @@
# CHEMICAL INFO SPECIFICATION
SANITIZE_AS_KEKULE = (Chem.SANITIZE_ALL & ~Chem.SANITIZE_SETAROMATICITY) # sanitize everything EXCEPT reassignment of aromaticity

def expanded_SMILES(smiles : str, assign_map_nums : bool=True, start_from : int=1) -> str:
'''Takes a SMILES string and clarifies chemical information, namely explicit hydrogens, kekulized aromatic bonds, and atom map numbers'''
def expanded_SMILES(
smiles : str,
assign_map_nums : bool=True,
start_from : int=1,
kekulize : bool=True,
) -> str:
'''
Expands and clarifies the chemical information contained within a passed SMILES string
namely explicit hydrogens and bond orders, and (optionally) kekulized aromatic bonds and atom map numbers
'''
assert(is_valid_SMILES(smiles))

rdmol = Chem.MolFromSmiles(smiles, sanitize=True) # TOSELF : determine values of pros/cons of sanitizations (freedom of specificity vs random RDKit errors)
rdmol = Chem.MolFromSmiles(smiles, sanitize=True)
rdmol = Chem.AddHs(rdmol, addCoords=True)
if assign_map_nums:
rdmol = molwise.assign_ordered_atom_map_nums(rdmol, start_from=start_from)

Chem.Kekulize(rdmol, clearAromaticFlags=True)

if kekulize:
Chem.Kekulize(rdmol, clearAromaticFlags=True)
Chem.SanitizeMol(rdmol)

return Chem.MolToSmiles(rdmol, kekuleSmiles=True, allBondsExplicit=True, allHsExplicit=True)
return Chem.MolToSmiles(rdmol, kekuleSmiles=kekulize, allBondsExplicit=True, allHsExplicit=True)


# REGEX TEMPLATES FOR COMPLIANT SMARTS
@@ -66,7 +75,14 @@ def chem_info_from_match(match : re.Match) -> dict[str, Union[int, str, None]]:


# SMARTS ATOM QUERY GENERATION
def compliant_atom_query_from_info(atomic_num : int, degree : int, atom_map_num : int, formal_charge : int=0, isotope : int=0, as_atom : bool=False) -> Union[str, QueryAtom]:
def compliant_atom_query_from_info(
atomic_num : int,
degree : int,
atom_map_num : int,
formal_charge : int=0,
isotope : int=0,
as_atom : bool=False
) -> Union[str, QueryAtom]:
'''Construct a monomer-spec compliant atom SMARTS string directly from chemical information'''
if not isotope: # handles when isotope is literal 0 or NoneType
isotope = "" # non-specific isotope is not explicitly written in string (left empty)
@@ -126,7 +142,7 @@ def compliant_mol_SMARTS(smarts : str) -> str:
count=rdmol.GetNumAtoms() # can't possibly replace more queries than there are atoms
)
if num_repl > 0:
LOGGER.warn(f'Cleaned {num_repl} SMARTS atom query aberrations introduced by RDKit')
LOGGER.debug(f'Cleaned {num_repl} SMARTS atom query aberrations introduced by RDKit')
sanitized_smarts = sanitized_smarts.replace('#0', '*') # replace explicit atom number 0 calls with star (easier to do post-processing, as #0 is easier to implement)

return sanitized_smarts
8 changes: 8 additions & 0 deletions polymerist/rdutils/__init__.py
@@ -2,3 +2,11 @@

__author__ = 'Timotej Bernat'
__email__ = '[email protected]'

from .rdkdraw import (
set_rdkdraw_size,
enable_substruct_highlights,
disable_substruct_highlights,
enable_kekulized_drawing,
disable_kekulized_drawing,
)
14 changes: 11 additions & 3 deletions polymerist/rdutils/rdkdraw.py
@@ -20,6 +20,10 @@


# GLOBAL PREFERENCES
def set_rdkdraw_size(dim : int=300, aspect : float=3/2):
'''Change image size and shape of RDKit Mol images'''
IPythonConsole.molSize = (int(aspect*dim), dim) # Change IPython image display size

def enable_substruct_highlights() -> None:
'''Turns on highlighting of found substructures when performing substructure matches'''
IPythonConsole.highlightSubstructs = True
@@ -28,9 +32,13 @@ def disable_substruct_highlights() -> None:
'''Turns off highlighting of found substructures when performing substructure matches'''
IPythonConsole.highlightSubstructs = False

def set_rdkdraw_size(dim : int=300, aspect : float=3/2):
'''Change image size and shape of RDKit Mol images'''
IPythonConsole.molSize = (int(aspect*dim), dim) # Change IPython image display size
def enable_kekulized_drawing() -> None:
'''Turns on automatic kekulization of aromatic bonds before drawing molecules in Jupyter Notebooks'''
IPythonConsole.kekulizeStructures = True

def disable_kekulized_drawing() -> None:
'''Turns off automatic kekulization of aromatic bonds before drawing molecules in Jupyter Notebooks'''
IPythonConsole.kekulizeStructures = False


# SINGLE-MOLECULE DISPLAY OPTIONS