diff --git a/docs/sphinx-builddir/doctrees/README.doctree b/docs/sphinx-builddir/doctrees/README.doctree
index b98d40b..ac2cc78 100644
Binary files a/docs/sphinx-builddir/doctrees/README.doctree and b/docs/sphinx-builddir/doctrees/README.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/algorithms.doctree b/docs/sphinx-builddir/doctrees/algorithms.doctree
index 3a05ee7..8f1344e 100644
Binary files a/docs/sphinx-builddir/doctrees/algorithms.doctree and b/docs/sphinx-builddir/doctrees/algorithms.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/descriptors.doctree b/docs/sphinx-builddir/doctrees/descriptors.doctree
index fb48723..ca62090 100644
Binary files a/docs/sphinx-builddir/doctrees/descriptors.doctree and b/docs/sphinx-builddir/doctrees/descriptors.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/environment.pickle b/docs/sphinx-builddir/doctrees/environment.pickle
index 79c358f..1cc3e03 100644
Binary files a/docs/sphinx-builddir/doctrees/environment.pickle and b/docs/sphinx-builddir/doctrees/environment.pickle differ
diff --git a/docs/sphinx-builddir/doctrees/index.doctree b/docs/sphinx-builddir/doctrees/index.doctree
index 60d62d9..236c9fd 100644
Binary files a/docs/sphinx-builddir/doctrees/index.doctree and b/docs/sphinx-builddir/doctrees/index.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/nbsphinx/notebooks/preprocess_data.ipynb b/docs/sphinx-builddir/doctrees/nbsphinx/notebooks/preprocess_data.ipynb
index 0dd5da5..9853aed 100644
--- a/docs/sphinx-builddir/doctrees/nbsphinx/notebooks/preprocess_data.ipynb
+++ b/docs/sphinx-builddir/doctrees/nbsphinx/notebooks/preprocess_data.ipynb
@@ -1055,10 +1055,10 @@
" response_type=\"regression\",\n",
" training_dataset_file=\"../tests/data/sdf/example.sdf\",\n",
" deduplication_strategy=KeepAllNoDeduplication(),\n",
- " log_transform=\"True\", # flags to use a transform\n",
+ " log_transform=True, # flags to use a transform\n",
" log_transform_base=LogBase.LOG10, # Log10 base will be used\n",
- " log_transform_negative=\"True\", # The negated log transform will be applied\n",
- " log_transform_unit_conversion=6 # THe unit conversion for pXC50 values is 6\n",
+ " log_transform_negative=LogNegative.TRUE, # The negated log transform will be applied\n",
+ " log_transform_unit_conversion=6 # The unit conversion for pXC50 values is 6\n",
")\n",
"\n",
"pxc50_data = Dataset(\n",
diff --git a/docs/sphinx-builddir/doctrees/notebooks/QSARtuna_Tutorial.doctree b/docs/sphinx-builddir/doctrees/notebooks/QSARtuna_Tutorial.doctree
index 1e7523c..ebaf2d7 100644
Binary files a/docs/sphinx-builddir/doctrees/notebooks/QSARtuna_Tutorial.doctree and b/docs/sphinx-builddir/doctrees/notebooks/QSARtuna_Tutorial.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/notebooks/preprocess_data.doctree b/docs/sphinx-builddir/doctrees/notebooks/preprocess_data.doctree
index 04b9457..145642a 100644
Binary files a/docs/sphinx-builddir/doctrees/notebooks/preprocess_data.doctree and b/docs/sphinx-builddir/doctrees/notebooks/preprocess_data.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/optunaz.config.doctree b/docs/sphinx-builddir/doctrees/optunaz.config.doctree
index 816bded..70b28b9 100644
Binary files a/docs/sphinx-builddir/doctrees/optunaz.config.doctree and b/docs/sphinx-builddir/doctrees/optunaz.config.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/optunaz.doctree b/docs/sphinx-builddir/doctrees/optunaz.doctree
index 9fed391..095c256 100644
Binary files a/docs/sphinx-builddir/doctrees/optunaz.doctree and b/docs/sphinx-builddir/doctrees/optunaz.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/optunaz.utils.doctree b/docs/sphinx-builddir/doctrees/optunaz.utils.doctree
index ff5c180..e72ae3b 100644
Binary files a/docs/sphinx-builddir/doctrees/optunaz.utils.doctree and b/docs/sphinx-builddir/doctrees/optunaz.utils.doctree differ
diff --git a/docs/sphinx-builddir/doctrees/transform.doctree b/docs/sphinx-builddir/doctrees/transform.doctree
index 905f17d..c0512b3 100644
Binary files a/docs/sphinx-builddir/doctrees/transform.doctree and b/docs/sphinx-builddir/doctrees/transform.doctree differ
diff --git a/docs/sphinx-builddir/html/.buildinfo b/docs/sphinx-builddir/html/.buildinfo
index cca6452..7f97108 100644
--- a/docs/sphinx-builddir/html/.buildinfo
+++ b/docs/sphinx-builddir/html/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: b0c96d5f832cdd38d138271da4ddabf1
+config: cd8105a3e394e54ace4ef41b1def0634
tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/docs/sphinx-builddir/html/README.html b/docs/sphinx-builddir/html/README.html
index f4998e9..1fb7f97 100644
--- a/docs/sphinx-builddir/html/README.html
+++ b/docs/sphinx-builddir/html/README.html
@@ -4,7 +4,7 @@
- QSARtuna 𓆛: QSAR using Optimization for Hyperparameter Tuning (formerly Optuna AZ and QPTUNA) — QSARtuna 3.1.2 documentation
+ QSARtuna 𓆛: QSAR using Optimization for Hyperparameter Tuning (formerly Optuna AZ and QPTUNA) — QSARtuna 3.1.3 documentation
@@ -274,16 +274,14 @@
 def calibration_analysis(y_test, y_pred):
+    try:
+        frac_true, frac_pred = calibration_curve(y_test, y_pred, n_bins=15)
+        bin_edges = frac_pred
+    except ValueError:
+        # weight each bin by the total number of values so that the sum of all bars equal unity
+        weights = np.ones_like(y_test) / len(y_test)
+        # calculate fraction of true points across uniform bins
+        frac_true, bin_edges = np.histogram(y_test, bins=15, weights=weights)
+        # calculate fraction of pred points across uniform true bins
+        frac_pred, _ = np.histogram(y_pred, bins=bin_edges, weights=weights)
+        # convert to cumulative sum for plotting
+        frac_true = np.cumsum(frac_true)
+        frac_pred = np.cumsum(frac_pred)
+    return list(zip(bin_edges, frac_true, frac_pred))
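The `except ValueError` branch above (uniform bins, per-point weights that sum to one, cumulative sums) can be sketched with NumPy alone. This is an illustrative stand-alone version, not the project's code: the function name `cumulative_calibration` and the sample data are hypothetical, and the `calibration_curve` path is deliberately left out.

```python
import numpy as np


def cumulative_calibration(y_test, y_pred, n_bins=15):
    """Hypothetical stand-alone version of the histogram fallback shown above."""
    y_test = np.asarray(y_test, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Each point contributes 1/N, so the bars over y_test sum to one.
    weights = np.ones_like(y_test) / len(y_test)
    # Fraction of true values falling in each of n_bins uniform bins.
    frac_true, bin_edges = np.histogram(y_test, bins=n_bins, weights=weights)
    # Fraction of predictions falling in the same bins (edges taken from y_test).
    frac_pred, _ = np.histogram(y_pred, bins=bin_edges, weights=weights)
    # Cumulative sums give monotone curves suitable for plotting.
    frac_true = np.cumsum(frac_true)
    frac_pred = np.cumsum(frac_pred)
    # zip stops at the shortest input, so the final bin edge is dropped.
    return list(zip(bin_edges, frac_true, frac_pred))


# Made-up data: predictions shifted upward by 0.5 relative to the truth.
y_true = np.linspace(0.0, 10.0, 50)
rows = cumulative_calibration(y_true, y_true + 0.5)
```

Because the bin edges come from `y_test`, predictions outside the observed range are not counted, so the cumulative predicted fraction can end below one even though the true fraction always reaches one.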
Source code for optunaz.three_step_opt_build_merge
"""Step 2. Build. Train a model with the best hyperparameters."""
model, train_scores, test_scores = build(buildconfig, cache=cache)
- save_model(
+ qsartuna_model = save_model(model, buildconfig, outfname,
@@ -379,7 +379,7 @@
Source code for optunaz.three_step_opt_build_merge
@dataclass
class AmorProt(AuxTransformer):
    """AmorProt from column

    Calculates AmorProt for sequences or a predefined list of peptide/protein targets"""
-
These MAPC descriptors are unscaled and should be used with caution. MinHashed Atom-Pair Fingerprint Chiral (see
+Orsi et al. One chiral fingerprint to find them all) is the original version of the MinHashed Atom-Pair
+fingerprint of radius 2 (MAP4) which combined circular substructure fingerprints and atom-pair fingerprints into
+a unified framework. This combination allowed for improved substructure perception and performance in small
+molecule benchmarks while retaining information about bond distances for molecular size and shape perception.
+
These fingerprints expand the functionality of MAP4 to include encoding of stereochemistry into the fingerprint.
+CIP descriptors of chiral atoms are encoded into the fingerprint at the highest radius. This allows MAPC
+to modulate the impact of stereochemistry on fingerprints, making it scale with increasing molecular size
+without disproportionally affecting structural fingerprints/similarity.
MAPC (MinHashed Atom-Pair Fingerprint Chiral) (see Orsi et al. One chiral fingerprint to find them all) is the
+original version of the MinHashed Atom-Pair fingerprint of radius 2 (MAP4) which combined circular substructure
+fingerprints and atom-pair fingerprints into a unified framework. This combination allowed for improved
+substructure perception and performance in small molecule benchmarks while retaining information about bond
+distances for molecular size and shape perception.
+
These fingerprints expand the functionality of MAP4 to include encoding of stereochemistry into the fingerprint.
+CIP descriptors of chiral atoms are encoded into the fingerprint at the highest radius. This allows MAPC
+to modulate the impact of stereochemistry on fingerprints, making it scale with increasing molecular size
+without disproportionally affecting structural fingerprints/similarity.
Z-scales were proposed in Sandberg et al (1998) based on physicochemical properties of proteogenic and
+non-proteogenic amino acids, including NMR data and thin-layer chromatography (TLC) data. Refer to
+doi:10.1021/jm9700575 for the original publication. These descriptors capture 1. lipophilicity, 2. steric
+properties (steric bulk and polarizability), 3. electronic properties (polarity and charge),
+4. electronegativity (heat of formation, electrophilicity and hardness) and 5. another electronegativity.
+This fingerprint is the computed average of Z-scales of all the amino acids in the peptide.
+
diff --git a/docs/sphinx-source/index.rst b/docs/sphinx-source/index.rst
index 43864b0..e48f662 100644
--- a/docs/sphinx-source/index.rst
+++ b/docs/sphinx-source/index.rst
@@ -28,4 +28,4 @@ Development
-----------
* `Test report <_static/pytest/pytest/index.html>`_
* `Test coverage <_static/pytest/coverage/index.html>`_
-* `Public release (3.1.2) `_
\ No newline at end of file
+* `Public release (3.1.3) `_
\ No newline at end of file