Voroscoring #867

Draft
wants to merge 33 commits into
base: main
33 commits
05ffc37
first draft voroscoring module
VGPReys Apr 17, 2024
1c8e160
fix writing
VGPReys Apr 18, 2024
c37370d
fix types
VGPReys Apr 18, 2024
ef856c9
upgrade haddock module init
VGPReys Apr 18, 2024
75a69ad
import Any type
VGPReys Apr 18, 2024
8ba5ae7
redefine scoring modules classes
VGPReys Apr 18, 2024
2320357
add output attribute
VGPReys Apr 18, 2024
6757ae9
output Path type
VGPReys Apr 18, 2024
8601649
output var name
VGPReys Apr 18, 2024
e1ab6c8
get base_workdir from class attribute
VGPReys Apr 18, 2024
44b41e0
Path to str for .join() method
VGPReys Apr 18, 2024
1e9d612
voro scoring example
VGPReys Apr 18, 2024
042f484
output tsv file writing from self.output()
VGPReys Apr 18, 2024
319056d
remove import
VGPReys Apr 18, 2024
5c05779
solve error in recombine arguments
VGPReys Apr 18, 2024
0dd02ba
finetunings
VGPReys Apr 18, 2024
b431e6a
tidy types and lints
VGPReys Apr 18, 2024
d8a1322
reversing scores for systematic ascenting sorting
VGPReys Apr 18, 2024
994bea4
adding tests
VGPReys Apr 19, 2024
69245cd
fixed conflict
mgiulini Apr 29, 2024
f832c0c
add header information in generated final output file
VGPReys Apr 29, 2024
a5a3656
make sure line is terminated
VGPReys Apr 30, 2024
ac409ac
fix tuple error
VGPReys May 3, 2024
60c9ee6
Merge remote-tracking branch 'origin' into voroscoring
VGPReys May 3, 2024
795682a
tests
VGPReys May 21, 2024
1e0d578
Merge branch 'main' into voroscoring
VGPReys May 21, 2024
7e8ac73
change
ntxxt May 30, 2024
4370daf
fixes to adapt to new hardware
VGPReys Jun 17, 2024
80e112e
mergeing main and solve conflicts
VGPReys Jun 20, 2024
fbe08c0
Merge branch 'main' into voroscoring
rvhonorato Aug 13, 2024
e7055a8
Merge branch 'main' into voroscoring
VGPReys Sep 10, 2024
4d6253b
check
ntxxt Sep 18, 2024
9791476
Merge branch 'voroscoring' of https://github.com/haddocking/haddock3 …
ntxxt Sep 18, 2024
33 changes: 33 additions & 0 deletions examples/scoring/voroscoring-test.cfg
@@ -0,0 +1,33 @@
# ====================================================================
# Scoring example

# directory in which the scoring will be done
run_dir = "run1-voroscoring-test"
clean = false

# execution mode
ncores = 3
mode = "local"

# ensemble of different complexes to be scored
molecules = ["data/T161-rescoring-ens.pdb",
"data/HY3.pdb",
"data/protein-dna_1w.pdb",
"data/protein-protein_1w.pdb",
"data/protein-protein_2w.pdb",
"data/protein-trimer_1w.pdb"
]

# ====================================================================
# Parameters for each stage are defined below

[topoaa]

[voroscoring]

[seletop]
select = 3

[caprieval]

# ====================================================================
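The workflow above scores every model and then keeps the three best ones (`select = 3`) before evaluation. A minimal sketch of that selection step follows, assuming that lower scores rank first. `ScoredModel` and `select_top` are hypothetical names used for illustration only and are not part of HADDOCK3.

```python
from typing import NamedTuple


class ScoredModel(NamedTuple):
    """Hypothetical stand-in for a scored model (not a HADDOCK3 class)."""

    name: str
    score: float


def select_top(models: list[ScoredModel], select: int) -> list[ScoredModel]:
    """Return the `select` best models, assuming a lower score is better."""
    ranked = sorted(models, key=lambda model: model.score)
    return ranked[:select]


# Made-up scores, for illustration only
models = [
    ScoredModel("model_a.pdb", -0.42),
    ScoredModel("model_b.pdb", -0.87),
    ScoredModel("model_c.pdb", -0.10),
    ScoredModel("model_d.pdb", -0.65),
]
top3 = select_top(models, select=3)
top3_names = [model.name for model in top3]
```

With these values, `top3_names` lists `model_b.pdb`, `model_d.pdb` and `model_a.pdb`, in that order.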
22 changes: 18 additions & 4 deletions src/haddock/modules/scoring/__init__.py
@@ -1,7 +1,8 @@
"""HADDOCK3 modules to score models."""
+from os import linesep
import pandas as pd

-from haddock.core.typing import FilePath, Path, Any
+from haddock.core.typing import FilePath, Path, Any, Optional
from haddock.modules.base_cns_module import BaseCNSModule
from haddock.modules import BaseHaddockModule, PDBFile

@@ -14,6 +15,7 @@ def output(
output_fname: FilePath,
sep: str = "\t",
ascending_sort: bool = True,
+header_comments: Optional[str] = None,
) -> None:
r"""Save the output in comprehensive tables.

@@ -36,9 +38,21 @@ def output(
df_sc = pd.DataFrame(sc_data, columns=df_columns)
df_sc_sorted = df_sc.sort_values(by="score", ascending=ascending_sort)
# writes to disk
-df_sc_sorted.to_csv(output_fname, sep=sep, index=False, na_rep="None")
-
-return
+output_file = open(output_fname, 'a')
+# Check whether header comments were provided
+if header_comments:
+# Make sure the comments end with a line separator
+if not header_comments.endswith(linesep):
+header_comments += linesep
+output_file.write(header_comments)
+# Write the dataframe
+df_sc_sorted.to_csv(
+output_file,
+sep=sep,
+index=False,
+na_rep="None",
+lineterminator=linesep,
+)


class CNSScoringModule(BaseCNSModule, ScoringModule):
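The change to `output()` writes optional header comments before the sorted score table. A minimal standalone sketch of that idea follows. It is not the module's implementation: it swaps pandas for the standard `csv` module so it runs without third-party packages, opens the file in write mode rather than append mode, and uses a context manager so the file handle is always closed. `write_scores` and its parameters are hypothetical names.

```python
import csv
import tempfile
from os import linesep
from pathlib import Path
from typing import Any, Optional


def write_scores(
    output_fname: Path,
    rows: list[dict[str, Any]],
    columns: list[str],
    sep: str = "\t",
    header_comments: Optional[str] = None,
) -> None:
    """Write optional header comments, then rows sorted by ascending score."""
    ranked = sorted(rows, key=lambda row: row["score"])
    # newline="" disables newline translation, so `linesep` is written as-is
    with open(output_fname, "w", newline="") as output_file:
        if header_comments:
            # endswith() also handles multi-character separators ("\r\n")
            if not header_comments.endswith(linesep):
                header_comments += linesep
            output_file.write(header_comments)
        writer = csv.DictWriter(
            output_file,
            fieldnames=columns,
            delimiter=sep,
            lineterminator=linesep,
        )
        writer.writeheader()
        writer.writerows(ranked)


# Usage with made-up data, written to a temporary directory
with tempfile.TemporaryDirectory() as tmpdir:
    tsv_path = Path(tmpdir, "scores.tsv")
    write_scores(
        tsv_path,
        [
            {"structure": "b.pdb", "score": 2.0},
            {"structure": "a.pdb", "score": -1.5},
        ],
        ["structure", "score"],
        header_comments="# example header",
    )
    content = tsv_path.read_text()
```

Any program that reads such a file has to skip the leading `#` line before parsing the table.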
5 changes: 3 additions & 2 deletions src/haddock/modules/scoring/emscoring/__init__.py
@@ -1,7 +1,8 @@
"""EM scoring module.

-This module performs energy minimization and scoring of the models generated in
-the previous step of the workflow. No restraints are applied during this step.
+This module performs energy minimization and scoring of the models generated
+in the previous step of the workflow.
+Note that no restraints (AIRs) are applied during this step.
"""

from pathlib import Path
3 changes: 2 additions & 1 deletion src/haddock/modules/scoring/mdscoring/__init__.py
@@ -1,7 +1,8 @@
"""MD scoring module.

This module will perform a short MD simulation on the input models and
-score them. No restraints are applied during this step.
+score them.
+Note that no restraints (AIRs) are applied during this step.
"""

from pathlib import Path
93 changes: 93 additions & 0 deletions src/haddock/modules/scoring/voroscoring/__init__.py
@@ -0,0 +1,93 @@
"""Voro scoring module.

This module scores input PDB models using the ftdmp voro-mqa-all tool.
For more information, please check: https://github.com/kliment-olechnovic/ftdmp

It is a third-party module and requires the appropriate setup and installation
for it to run without issue.
"""
from os import linesep
from pathlib import Path

from haddock.core.typing import Any, FilePath
from haddock.modules import get_engine
from haddock.modules.scoring import ScoringModule
from haddock.modules.scoring.voroscoring.voroscoring import (
VoroMQA,
update_models_with_scores,
)

RECIPE_PATH = Path(__file__).resolve().parent
DEFAULT_CONFIG = Path(RECIPE_PATH, "defaults.yaml")


class HaddockModule(ScoringModule):
"""."""

name = RECIPE_PATH.name

def __init__(
self,
order: int,
path: Path,
*ignore: Any,
init_params: FilePath = DEFAULT_CONFIG,
**everything: Any,
) -> None:
"""Initialize class."""
super().__init__(order, path, init_params)

@classmethod
def confirm_installation(cls) -> None:
"""Confirm module is installed."""
# FIXME ? Check :
# - if conda env is accessible
# - if ftdmp is accessible
return

def _run(self) -> None:
"""Execute module."""
# Retrieve previous models
try:
models_to_score = self.previous_io.retrieve_models(
individualize=True
)
except Exception as e:
self.finish_with_error(e)

# Initiate VoroMQA object
output_fname = Path("voro_mqa_all.tsv")
voromqa = VoroMQA(
models_to_score,
'./',
self.params,
output=output_fname,
)

# Launch machinery
jobs: list[VoroMQA] = [voromqa]
# Run Job(s)
self.log("Running Voro-mqa scoring")
Engine = get_engine(self.params['mode'], self.params)
engine = Engine(jobs)
engine.run()
self.log("Voro-mqa scoring finished!")

# Update score of output models
try:
self.output_models = update_models_with_scores(
output_fname,
models_to_score,
metric=self.params["metric"],
)
except ValueError as e:
self.finish_with_error(e)

# Write output file
scoring_tsv_fpath = f"{RECIPE_PATH.name}.tsv"
self.output(
scoring_tsv_fpath,
header_comments=f"# Note that negated values are reported for non-energetic predictions{linesep}", # noqa : E501
)
# Export to next module
self.export_io_models()
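The header comment written by `_run` notes that non-energetic predictions are reported with a flipped sign, so that one ascending sort ranks every metric best-first. A minimal sketch of that convention follows. The split between energy-like and higher-is-better metrics is an assumption made for illustration, and `ENERGY_LIKE_METRICS` and `to_sortable_score` are hypothetical names rather than code from this pull request.

```python
# Assumption: these metrics are energy-like (lower is better); every other
# metric is treated as "higher is better". This split is illustrative only.
ENERGY_LIKE_METRICS = {"voromqa_energy", "gen_voromqa_energy", "clash_score"}


def to_sortable_score(value: float, metric: str) -> float:
    """Return a value for which an ascending sort puts the best model first."""
    if metric in ENERGY_LIKE_METRICS:
        return value
    # Negate "higher is better" metrics so that the best value sorts first
    return -value


# Made-up jury_score values, for illustration only
raw_scores = {"model_1.pdb": 0.61, "model_2.pdb": 0.83, "model_3.pdb": 0.47}
ranked = sorted(
    raw_scores,
    key=lambda name: to_sortable_score(raw_scores[name], "jury_score"),
)
```

Here `ranked` starts with `model_2.pdb`, the model with the highest raw `jury_score`. Its reported score is negative.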
77 changes: 77 additions & 0 deletions src/haddock/modules/scoring/voroscoring/defaults.yaml
@@ -0,0 +1,77 @@
metric:
default: jury_score
type: string
choices:
- jury_score
- GNN_sum_score
- GNN_pcadscore
- voromqa_dark
- voromqa_light
- voromqa_energy
- gen_voromqa_energy
- clash_score
- area
minchars: 1
maxchars: 50
title: VoroMQA metric used to score.
short: VoroMQA metric used to score.
long: VoroMQA metric used to score.
group: analysis
explevel: easy

conda_install_dir:
default: "/trinity/login/vreys/miniconda3/"
type: string
minchars: 1
maxchars: 158
title: Path to conda install directory.
short: Absolute path to conda install directory.
long: Absolute path to conda install directory.
group: execution
explevel: easy

conda_env_name:
default: "ftdmp5"
type: string
minchars: 1
maxchars: 100
title: Name of the ftdmp conda env.
short: Name of the ftdmp conda env.
long: Name of the ftdmp conda env.
group: execution
explevel: easy

ftdmp_install_dir:
default: "/trinity/login/vreys/Venclovas/ftdmp/"
type: string
minchars: 1
maxchars: 158
title: Path to ftdmp install directory.
short: Absolute path to ftdmp install directory.
long: Absolute path to ftdmp install directory.
group: execution
explevel: easy

nb_gpus:
default: 1
type: integer
min: 1
max: 420
title: Number of accessible GPUs on the device.
short: Number of accessible GPUs on the device.
long: Number of accessible GPUs on the device.
group: execution
explevel: easy

concat_chain_:
default: []
type: list
minitems: 0
maxitems: 100
title: List of chains to concatenate prior to scoring
short: List of chains to concatenate prior to scoring
long: concat_chain_* is an expandable parameter. You can provide concat_chain_1,
concat_chain_2, concat_chain_3, etc. For each selection, the listed chains will
be concatenated into one prior to scoring.
group: analysis
explevel: expert
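The `concat_chain_` entry is expandable: users may supply `concat_chain_1`, `concat_chain_2`, and so on. A minimal sketch of how such numbered keys could be gathered from a parameter dictionary follows. It assumes each value is a list of chain identifiers. `collect_concat_groups` is a hypothetical helper, not the parsing code used by HADDOCK3.

```python
import re
from typing import Any

# Matches concat_chain_1, concat_chain_2, ... but not the bare template key
EXPANDABLE_KEY = re.compile(r"^concat_chain_(\d+)$")


def collect_concat_groups(params: dict[str, Any]) -> list[list[str]]:
    """Return the non-empty chain groups, ordered by their numeric suffix."""
    groups: dict[int, list[str]] = {}
    for key, value in params.items():
        match = EXPANDABLE_KEY.match(key)
        if match and value:
            groups[int(match.group(1))] = list(value)
    return [groups[index] for index in sorted(groups)]


# Made-up parameters, for illustration only
params = {
    "metric": "jury_score",
    "concat_chain_": [],
    "concat_chain_2": ["C", "D"],
    "concat_chain_1": ["A", "B"],
}
concat_groups = collect_concat_groups(params)
```

Ordering by the numeric suffix keeps the result independent of the order in which the keys appear in the configuration.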