
haddock3-analyse keyerror #977

Closed
sverhoeven opened this issue Aug 23, 2024 · 8 comments · Fixed by #978
Labels: bug (Something isn't working)

@sverhoeven (Contributor)

During haddock3-analyse of the protein-protein example, a caprieval analysis failed.

Module/Workflow/Library affected

haddock3-analyse command

Expected behavior

That analysis/11_caprieval_analysis/report.html exists.

Actual behavior

The last caprieval analysis failed with

[2024-08-22 13:33:50,836 cli_analyse WARNING] Could not execute the analysis for step 11_caprieval.
                The following error occurred 1

when running

haddock3-analyse --is_cleaned true -r . -m 11

With log.exception(e) added, I got:

[2024-08-23 09:40:08,129 cli_analyse ERROR] 1
Traceback (most recent call last):
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/clis/cli_analyse.py", line 599, in main
    analyse_step(
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/clis/cli_analyse.py", line 478, in analyse_step
    scatters = scatter_plot_handler(
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/libs/libplots.py", line 630, in scatter_plot_handler
    fig = scatter_plot_plotly(
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/libs/libplots.py", line 458, in scatter_plot_plotly
    cl_df = gb_cluster.get_group(cl_id)
  File "/home/stefanv/git/ivresse/haddock3/venv/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 811, in get_group
    raise KeyError(name)
KeyError: 1

Steps to reproduce the behavior

I ran the docking-protein-protein-full.cfg workflow with the following overrides:

mode = 'local'
ncores = 14
postprocess = true
clean = true
offline = false
less_io = true
cd examples/docking-protein-protein
haddock3 docking-protein-protein-full.like-webapp.cfg
...
[2024-08-23 10:19:48,348 cli_analyse WARNING] Could not execute the analysis for step 11_caprieval.
                The following error occurred 1
...

See the complete config and log at the bottom of this issue.

Suggestions on how to fix it

A minor improvement would be to use log.exception(e) instead of log.warning(f'{e}') so the full stack trace is logged; logging only the error 1 is not very helpful.
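
A minimal, self-contained sketch of the difference between the two logging calls (analyse_step here is a stand-in that just raises the same KeyError, not the real function from cli_analyse.py):

    import logging

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("cli_analyse")

    def analyse_step(step_name):
        # stand-in for the real analyse_step, only to trigger the same KeyError
        raise KeyError(1)

    step_name = "11_caprieval"
    try:
        analyse_step(step_name)
    except Exception as e:
        # current behaviour: only the exception's message is logged, printing just "1"
        log.warning(f"Could not execute the analysis for step {step_name}. "
                    f"The following error occurred {e}")
        # suggested: log.exception() logs the same message plus the full stack trace
        log.exception(f"Could not execute the analysis for step {step_name}.")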

Version

7a9434e

Additional context

docking-protein-protein-full.like-webapp.cfg
# ====================================================================
# Protein-protein docking example with NMR-derived ambiguous interaction restraints

# directory in which the scoring will be done
run_dir = "run1-full-like-webapp"

mode = 'local'
ncores = 14
postprocess = true
clean = true
offline = false
less_io = true

# molecules to be docked
molecules =  [
    "data/e2aP_1F3G.pdb",
    "data/hpr_ensemble.pdb"
    ]

# ====================================================================
# Parameters for each stage are defined below, prefer full paths
# ====================================================================
[topoaa]
autohis = false
[topoaa.mol1]
nhisd = 0
nhise = 1
hise_1 = 75
[topoaa.mol2]
nhisd = 1
hisd_1 = 76
nhise = 1
hise_1 = 15

[rigidbody]
tolerance = 5
ambig_fname = "data/e2a-hpr_air.tbl"
sampling = 1000

[caprieval]
reference_fname = "data/e2a-hpr_1GGR.pdb"

[seletop]
select = 200

[caprieval]
reference_fname = "data/e2a-hpr_1GGR.pdb"

[flexref]
tolerance = 5
ambig_fname = "data/e2a-hpr_air.tbl"

[caprieval]
reference_fname = "data/e2a-hpr_1GGR.pdb"

[emref]
tolerance = 5
ambig_fname = "data/e2a-hpr_air.tbl"

[caprieval]
reference_fname = "data/e2a-hpr_1GGR.pdb"

[clustfcc]

[seletopclusts]
top_models = 4

[caprieval]
reference_fname = "data/e2a-hpr_1GGR.pdb"

# ====================================================================

log
[2024-08-23 09:53:39,862 cli INFO] 
##############################################
#                                            #
#                 HADDOCK 3                  #
#                                            #
##############################################

Starting HADDOCK 3.0.0 on 2024-08-23 09:53:00

Python 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]

[2024-08-23 09:53:41,994 libworkflow INFO] Reading instructions step 0_topoaa
[2024-08-23 09:53:41,994 libworkflow INFO] Reading instructions step 1_rigidbody
[2024-08-23 09:53:41,995 libworkflow INFO] Reading instructions step 2_caprieval
[2024-08-23 09:53:41,995 libworkflow INFO] Reading instructions step 3_seletop
[2024-08-23 09:53:41,995 libworkflow INFO] Reading instructions step 4_caprieval
[2024-08-23 09:53:41,995 libworkflow INFO] Reading instructions step 5_flexref
[2024-08-23 09:53:41,995 libworkflow INFO] Reading instructions step 6_caprieval
[2024-08-23 09:53:41,995 libworkflow INFO] Reading instructions step 7_emref
[2024-08-23 09:53:41,995 libworkflow INFO] Reading instructions step 8_caprieval
[2024-08-23 09:53:41,996 libworkflow INFO] Reading instructions step 9_clustfcc
[2024-08-23 09:53:41,996 libworkflow INFO] Reading instructions step 10_seletopclusts
[2024-08-23 09:53:41,996 libworkflow INFO] Reading instructions step 11_caprieval
[2024-08-23 09:53:42,011 base_cns_module INFO] Running [topoaa] module
[2024-08-23 09:53:42,011 __init__ INFO] [topoaa] Molecule 1: e2aP_1F3G.pdb
[2024-08-23 09:53:42,012 __init__ INFO] [topoaa] Sanitizing molecule e2aP_1F3G.pdb
[2024-08-23 09:53:42,017 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,017 __init__ INFO] [topoaa] Molecule 2: hpr_ensemble.pdb
[2024-08-23 09:53:42,022 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_1.pdb
[2024-08-23 09:53:42,025 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,025 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_2.pdb
[2024-08-23 09:53:42,028 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,028 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_3.pdb
[2024-08-23 09:53:42,031 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,031 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_4.pdb
[2024-08-23 09:53:42,034 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,034 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_5.pdb
[2024-08-23 09:53:42,037 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,037 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_6.pdb
[2024-08-23 09:53:42,040 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,040 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_7.pdb
[2024-08-23 09:53:42,042 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,043 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_8.pdb
[2024-08-23 09:53:42,045 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,045 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_9.pdb
[2024-08-23 09:53:42,048 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,048 __init__ INFO] [topoaa] Sanitizing molecule hpr_ensemble_10.pdb
[2024-08-23 09:53:42,051 __init__ INFO] [topoaa] Topology CNS input created
[2024-08-23 09:53:42,051 __init__ INFO] [topoaa] Running CNS Jobs n=11
[2024-08-23 09:53:42,051 libutil INFO] Selected 11 cores to process 11 jobs, with 20 maximum available cores.
[2024-08-23 09:53:42,053 libparallel INFO] Using 11 cores
[2024-08-23 09:53:42,942 libparallel INFO] 11 tasks finished
[2024-08-23 09:53:42,942 __init__ INFO] [topoaa] CNS jobs have finished
[2024-08-23 09:53:42,944 base_cns_module INFO] Module [topoaa] finished.
[2024-08-23 09:53:42,944 __init__ INFO] [topoaa] took 1 seconds
[2024-08-23 09:53:43,350 base_cns_module INFO] Running [rigidbody] module
[2024-08-23 09:53:43,351 __init__ INFO] [rigidbody] crossdock=true
[2024-08-23 09:53:43,351 __init__ INFO] [rigidbody] Preparing jobs...
[2024-08-23 09:53:43,352 libutil INFO] Selected 14 cores to process 1000 jobs, with 20 maximum available cores.
[2024-08-23 09:53:43,353 libparallel INFO] Using 14 cores
[2024-08-23 09:53:43,601 libparallel INFO] 1000 tasks finished
[2024-08-23 09:53:43,602 __init__ INFO] [rigidbody] Preparation took 0.250582 seconds
[2024-08-23 09:53:43,642 __init__ INFO] [rigidbody] Running CNS Jobs n=1000
[2024-08-23 09:53:43,643 libutil INFO] Selected 14 cores to process 1000 jobs, with 20 maximum available cores.
[2024-08-23 09:53:43,643 libparallel INFO] Using 14 cores
[2024-08-23 10:02:22,864 libparallel INFO] 1000 tasks finished
[2024-08-23 10:02:22,865 __init__ INFO] [rigidbody] CNS jobs have finished
[2024-08-23 10:02:23,219 base_cns_module INFO] Module [rigidbody] finished.
[2024-08-23 10:02:23,219 __init__ INFO] [rigidbody] took 8 minutes and 40 seconds
[2024-08-23 10:02:23,302 __init__ INFO] Running [caprieval] module
[2024-08-23 10:02:23,302 capri INFO] Found previous CNS step: 01_rigidbody
[2024-08-23 10:02:23,445 capri INFO] Saved scoring weights to: weights_params.json
[2024-08-23 10:02:25,960 libutil INFO] Selected 14 cores to process 1000 jobs, with 20 maximum available cores.
[2024-08-23 10:02:25,961 libparallel INFO] Using 14 cores
[2024-08-23 10:02:37,155 libparallel INFO] 1000 tasks finished
[2024-08-23 10:02:37,169 capri INFO] Rearranging cluster information into capri_clt.tsv
[2024-08-23 10:02:37,355 __init__ INFO] Module [caprieval] finished.
[2024-08-23 10:02:37,355 __init__ INFO] [caprieval] took 14 seconds
[2024-08-23 10:02:37,420 __init__ INFO] Running [seletop] module
[2024-08-23 10:02:37,534 __init__ INFO] Module [seletop] finished.
[2024-08-23 10:02:37,535 __init__ INFO] [seletop] took 0 seconds
[2024-08-23 10:02:37,625 __init__ INFO] Running [caprieval] module
[2024-08-23 10:02:37,625 capri INFO] Found previous CNS step: 01_rigidbody
[2024-08-23 10:02:37,767 capri INFO] Saved scoring weights to: weights_params.json
[2024-08-23 10:02:38,273 libutil INFO] Selected 14 cores to process 200 jobs, with 20 maximum available cores.
[2024-08-23 10:02:38,273 libparallel INFO] Using 14 cores
[2024-08-23 10:02:40,617 libparallel INFO] 200 tasks finished
[2024-08-23 10:02:40,620 capri INFO] Rearranging cluster information into capri_clt.tsv
[2024-08-23 10:02:40,652 __init__ INFO] Module [caprieval] finished.
[2024-08-23 10:02:40,652 __init__ INFO] [caprieval] took 3 seconds
[2024-08-23 10:02:41,325 base_cns_module INFO] Running [flexref] module
[2024-08-23 10:02:41,796 __init__ INFO] [flexref] Running CNS Jobs n=200
[2024-08-23 10:02:41,796 libutil INFO] Selected 14 cores to process 200 jobs, with 20 maximum available cores.
[2024-08-23 10:02:41,796 libparallel INFO] Using 14 cores
[2024-08-23 10:18:16,421 libparallel INFO] 200 tasks finished
[2024-08-23 10:18:16,421 __init__ INFO] [flexref] CNS jobs have finished
[2024-08-23 10:18:16,513 base_cns_module INFO] Module [flexref] finished.
[2024-08-23 10:18:16,513 __init__ INFO] [flexref] took 15 minutes and 35 seconds
[2024-08-23 10:18:16,553 __init__ INFO] Running [caprieval] module
[2024-08-23 10:18:16,553 capri INFO] Found previous CNS step: 05_flexref
[2024-08-23 10:18:16,880 capri INFO] Saved scoring weights to: weights_params.json
[2024-08-23 10:18:17,359 libutil INFO] Selected 14 cores to process 200 jobs, with 20 maximum available cores.
[2024-08-23 10:18:17,360 libparallel INFO] Using 14 cores
[2024-08-23 10:18:19,624 libparallel INFO] 200 tasks finished
[2024-08-23 10:18:19,627 capri INFO] Rearranging cluster information into capri_clt.tsv
[2024-08-23 10:18:19,658 __init__ INFO] Module [caprieval] finished.
[2024-08-23 10:18:19,659 __init__ INFO] [caprieval] took 3 seconds
[2024-08-23 10:18:20,233 base_cns_module INFO] Running [emref] module
[2024-08-23 10:18:20,660 __init__ INFO] [emref] Running CNS Jobs n=200
[2024-08-23 10:18:20,660 libutil INFO] Selected 14 cores to process 200 jobs, with 20 maximum available cores.
[2024-08-23 10:18:20,661 libparallel INFO] Using 14 cores
[2024-08-23 10:19:38,381 libparallel INFO] 200 tasks finished
[2024-08-23 10:19:38,381 __init__ INFO] [emref] CNS jobs have finished
[2024-08-23 10:19:38,472 base_cns_module INFO] Module [emref] finished.
[2024-08-23 10:19:38,472 __init__ INFO] [emref] took 1 minute and 18 seconds
[2024-08-23 10:19:38,511 __init__ INFO] Running [caprieval] module
[2024-08-23 10:19:38,511 capri INFO] Found previous CNS step: 07_emref
[2024-08-23 10:19:38,669 capri INFO] Saved scoring weights to: weights_params.json
[2024-08-23 10:19:39,151 libutil INFO] Selected 14 cores to process 200 jobs, with 20 maximum available cores.
[2024-08-23 10:19:39,151 libparallel INFO] Using 14 cores
[2024-08-23 10:19:41,644 libparallel INFO] 200 tasks finished
[2024-08-23 10:19:41,648 capri INFO] Rearranging cluster information into capri_clt.tsv
[2024-08-23 10:19:41,684 __init__ INFO] Module [caprieval] finished.
[2024-08-23 10:19:41,684 __init__ INFO] [caprieval] took 3 seconds
[2024-08-23 10:19:41,725 __init__ INFO] Running [clustfcc] module
[2024-08-23 10:19:41,726 __init__ INFO] Calculating contacts
[2024-08-23 10:19:41,727 libutil INFO] Selected 14 cores to process 200 jobs, with 20 maximum available cores.
[2024-08-23 10:19:41,727 libparallel INFO] Using 14 cores
[2024-08-23 10:19:41,921 libparallel INFO] 200 tasks finished
[2024-08-23 10:19:41,922 __init__ INFO] Calculating the FCC matrix
[2024-08-23 10:19:41,983 __init__ INFO] Clustering...
[2024-08-23 10:19:41,993 clustfcc INFO] Clustering with min_population=4
[2024-08-23 10:19:41,994 clustfcc INFO] Saving output to cluster.out
[2024-08-23 10:19:41,996 libclust INFO] Saving structure list to clustfcc.tsv
[2024-08-23 10:19:41,996 clustfcc INFO] Saving detailed output to clustfcc.txt
[2024-08-23 10:19:42,027 __init__ INFO] Module [clustfcc] finished.
[2024-08-23 10:19:42,027 __init__ INFO] [clustfcc] took 0 seconds
[2024-08-23 10:19:42,045 __init__ INFO] Running [seletopclusts] module
[2024-08-23 10:19:42,045 __init__ INFO] Selecting all clusters: 1,2,3,4,5,6
[2024-08-23 10:19:42,045 __init__ INFO]  emref_11.pdb > cluster_1_model_1.pdb
[2024-08-23 10:19:42,045 __init__ INFO]  emref_4.pdb > cluster_1_model_2.pdb
[2024-08-23 10:19:42,045 __init__ INFO]  emref_3.pdb > cluster_1_model_3.pdb
[2024-08-23 10:19:42,045 __init__ INFO]  emref_72.pdb > cluster_1_model_4.pdb
[2024-08-23 10:19:42,045 __init__ INFO]  emref_40.pdb > cluster_2_model_1.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_42.pdb > cluster_2_model_2.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_174.pdb > cluster_2_model_3.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_39.pdb > cluster_2_model_4.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_34.pdb > cluster_3_model_1.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_25.pdb > cluster_3_model_2.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_16.pdb > cluster_3_model_3.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_177.pdb > cluster_3_model_4.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_29.pdb > cluster_4_model_1.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_65.pdb > cluster_4_model_2.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_63.pdb > cluster_4_model_3.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_95.pdb > cluster_4_model_4.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_31.pdb > cluster_5_model_1.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_43.pdb > cluster_5_model_2.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_96.pdb > cluster_5_model_3.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_8.pdb > cluster_5_model_4.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_93.pdb > cluster_6_model_1.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_195.pdb > cluster_6_model_2.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_186.pdb > cluster_6_model_3.pdb
[2024-08-23 10:19:42,046 __init__ INFO]  emref_179.pdb > cluster_6_model_4.pdb
[2024-08-23 10:19:42,069 __init__ INFO] Module [seletopclusts] finished.
[2024-08-23 10:19:42,069 __init__ INFO] [seletopclusts] took 0 seconds
[2024-08-23 10:19:42,098 __init__ INFO] Running [caprieval] module
[2024-08-23 10:19:42,099 capri INFO] Found previous CNS step: 07_emref
[2024-08-23 10:19:42,246 capri INFO] Saved scoring weights to: weights_params.json
[2024-08-23 10:19:42,307 libutil INFO] Selected 14 cores to process 24 jobs, with 20 maximum available cores.
[2024-08-23 10:19:42,308 libparallel INFO] Using 14 cores
[2024-08-23 10:19:42,685 libparallel INFO] 24 tasks finished
[2024-08-23 10:19:42,686 capri INFO] Rearranging cluster information into capri_clt.tsv
[2024-08-23 10:19:42,696 __init__ INFO] Module [caprieval] finished.
[2024-08-23 10:19:42,696 __init__ INFO] [caprieval] took 1 seconds
[2024-08-23 10:19:42,696 cli_analyse INFO] Running haddock3-analyse on ./, modules [2, 4, 6, 8, 11], with top_cluster = 10
[2024-08-23 10:19:42,709 cli_analyse INFO] Created directory: /home/stefanv/git/ivresse/haddock3/examples/docking-protein-protein/run1-full-like-webapp/analysis
[2024-08-23 10:19:42,709 cli_analyse INFO] Reading input run directory
[2024-08-23 10:19:42,709 cli_analyse INFO] selected steps: 02_caprieval, 04_caprieval, 06_caprieval, 08_caprieval, 11_caprieval
[2024-08-23 10:19:42,710 cli_analyse INFO] Analysing step 02_caprieval
[2024-08-23 10:19:42,710 cli_analyse INFO] step 02_caprieval is caprieval, files should be already available
[2024-08-23 10:19:42,710 cli_analyse INFO] CAPRI files identified
[2024-08-23 10:19:42,729 cli_analyse INFO] Plotting results..
[2024-08-23 10:19:44,755 cli_analyse INFO] Summary archive summary.tgz created!
[2024-08-23 10:19:44,755 cli_analyse INFO] Analysing step 04_caprieval
[2024-08-23 10:19:44,756 cli_analyse INFO] step 04_caprieval is caprieval, files should be already available
[2024-08-23 10:19:44,756 cli_analyse INFO] CAPRI files identified
[2024-08-23 10:19:44,758 cli_analyse INFO] Plotting results..
[2024-08-23 10:19:45,960 cli_analyse INFO] Summary archive summary.tgz created!
[2024-08-23 10:19:45,960 cli_analyse INFO] Analysing step 06_caprieval
[2024-08-23 10:19:45,960 cli_analyse INFO] step 06_caprieval is caprieval, files should be already available
[2024-08-23 10:19:45,960 cli_analyse INFO] CAPRI files identified
[2024-08-23 10:19:45,962 cli_analyse INFO] Plotting results..
[2024-08-23 10:19:47,154 cli_analyse INFO] Summary archive summary.tgz created!
[2024-08-23 10:19:47,154 cli_analyse INFO] Analysing step 08_caprieval
[2024-08-23 10:19:47,155 cli_analyse INFO] step 08_caprieval is caprieval, files should be already available
[2024-08-23 10:19:47,155 cli_analyse INFO] CAPRI files identified
[2024-08-23 10:19:47,156 cli_analyse INFO] Plotting results..
[2024-08-23 10:19:48,343 cli_analyse INFO] Summary archive summary.tgz created!
[2024-08-23 10:19:48,343 cli_analyse INFO] Analysing step 11_caprieval
[2024-08-23 10:19:48,344 cli_analyse INFO] step 11_caprieval is caprieval, files should be already available
[2024-08-23 10:19:48,344 cli_analyse INFO] CAPRI files identified
[2024-08-23 10:19:48,345 cli_analyse INFO] Plotting results..
[2024-08-23 10:19:48,348 cli_analyse WARNING] Could not execute the analysis for step 11_caprieval.
                The following error occurred 1
[2024-08-23 10:19:48,348 cli_analyse ERROR] 1
Traceback (most recent call last):
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/clis/cli_analyse.py", line 599, in main
    analyse_step(
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/clis/cli_analyse.py", line 478, in analyse_step
    scatters = scatter_plot_handler(
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/libs/libplots.py", line 630, in scatter_plot_handler
    fig = scatter_plot_plotly(
  File "/home/stefanv/git/ivresse/haddock3/src/haddock/libs/libplots.py", line 458, in scatter_plot_plotly
    cl_df = gb_cluster.get_group(cl_id)
  File "/home/stefanv/git/ivresse/haddock3/venv/lib/python3.10/site-packages/pandas/core/groupby/groupby.py", line 811, in get_group
    raise KeyError(name)
KeyError: 1
[2024-08-23 10:19:48,351 cli_analyse INFO] moving files to analysis folder
[2024-08-23 10:19:48,351 cli_analyse INFO] cancelling unsuccesful analysis folders
[2024-08-23 10:19:48,351 cli_analyse INFO] updating paths in analysis/02_caprieval_analysis/capri_ss.tsv
[2024-08-23 10:19:48,351 cli_analyse INFO] View the results in analysis/02_caprieval_analysis/report.html
[2024-08-23 10:19:48,351 cli_analyse INFO] To view structures or download the structure files, in a terminal run the command `python -m http.server --directory /home/stefanv/git/ivresse/haddock3/examples/docking-protein-protein/run1-full-like-webapp`. By default, http server runs on `http://0.0.0.0:8000/`. Open the link http://0.0.0.0:8000/analysis/02_caprieval_analysis/report.html in a web browser.
[2024-08-23 10:19:48,351 cli_analyse INFO] updating paths in analysis/04_caprieval_analysis/capri_ss.tsv
[2024-08-23 10:19:48,352 cli_analyse INFO] View the results in analysis/04_caprieval_analysis/report.html
[2024-08-23 10:19:48,352 cli_analyse INFO] To view structures or download the structure files, in a terminal run the command `python -m http.server --directory /home/stefanv/git/ivresse/haddock3/examples/docking-protein-protein/run1-full-like-webapp`. By default, http server runs on `http://0.0.0.0:8000/`. Open the link http://0.0.0.0:8000/analysis/04_caprieval_analysis/report.html in a web browser.
[2024-08-23 10:19:48,352 cli_analyse INFO] updating paths in analysis/06_caprieval_analysis/capri_ss.tsv
[2024-08-23 10:19:48,352 cli_analyse INFO] View the results in analysis/06_caprieval_analysis/report.html
[2024-08-23 10:19:48,352 cli_analyse INFO] To view structures or download the structure files, in a terminal run the command `python -m http.server --directory /home/stefanv/git/ivresse/haddock3/examples/docking-protein-protein/run1-full-like-webapp`. By default, http server runs on `http://0.0.0.0:8000/`. Open the link http://0.0.0.0:8000/analysis/06_caprieval_analysis/report.html in a web browser.
[2024-08-23 10:19:48,352 cli_analyse INFO] updating paths in analysis/08_caprieval_analysis/capri_ss.tsv
[2024-08-23 10:19:48,352 cli_analyse INFO] View the results in analysis/08_caprieval_analysis/report.html
[2024-08-23 10:19:48,352 cli_analyse INFO] To view structures or download the structure files, in a terminal run the command `python -m http.server --directory /home/stefanv/git/ivresse/haddock3/examples/docking-protein-protein/run1-full-like-webapp`. By default, http server runs on `http://0.0.0.0:8000/`. Open the link http://0.0.0.0:8000/analysis/08_caprieval_analysis/report.html in a web browser.
[2024-08-23 10:19:48,352 cli_traceback INFO] Running haddock3-traceback on ./
[2024-08-23 10:19:48,352 cli_traceback INFO] Reading input run directory
[2024-08-23 10:19:48,352 cli_traceback INFO] All_steps: 00_topoaa, 01_rigidbody, 02_caprieval, 03_seletop, 04_caprieval, 05_flexref, 06_caprieval, 07_emref, 08_caprieval, 09_clustfcc, 10_seletopclusts, 11_caprieval
[2024-08-23 10:19:48,354 cli_traceback INFO] Modules not to be analysed: 00_topoaa, 02_caprieval, 03_seletop, 04_caprieval, 06_caprieval, 08_caprieval, 09_clustfcc, 11_caprieval
[2024-08-23 10:19:48,354 cli_traceback INFO] Steps to trace back: 01_rigidbody, 05_flexref, 07_emref, 10_seletopclusts
[2024-08-23 10:19:48,355 cli_traceback INFO] Created directory: /home/stefanv/git/ivresse/haddock3/examples/docking-protein-protein/run1-full-like-webapp/traceback
[2024-08-23 10:19:48,355 cli_traceback INFO] Tracing back step 10_seletopclusts
[2024-08-23 10:19:48,370 cli_traceback INFO] Tracing back step 07_emref
[2024-08-23 10:19:48,396 cli_traceback INFO] Tracing back step 05_flexref
[2024-08-23 10:19:48,428 cli_traceback INFO] Tracing back step 01_rigidbody
[2024-08-23 10:19:48,536 cli_traceback INFO] Output dataframe traceback/traceback.tsv created with shape (1000, 10)
[2024-08-23 10:19:48,581 clean_steps INFO] Cleaning output for '00_topoaa' using 14 cores.
[2024-08-23 10:19:48,719 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:48,719 clean_steps INFO] Cleaning output for '01_rigidbody' using 14 cores.
[2024-08-23 10:19:51,085 libtimer INFO] cleaning output files took 2 seconds
[2024-08-23 10:19:51,085 clean_steps INFO] Cleaning output for '02_caprieval' using 14 cores.
[2024-08-23 10:19:51,103 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:51,103 clean_steps INFO] Cleaning output for '03_seletop' using 14 cores.
[2024-08-23 10:19:51,117 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:51,117 clean_steps INFO] Cleaning output for '04_caprieval' using 14 cores.
[2024-08-23 10:19:51,132 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:51,132 clean_steps INFO] Cleaning output for '05_flexref' using 14 cores.
[2024-08-23 10:19:51,685 libtimer INFO] cleaning output files took 1 seconds
[2024-08-23 10:19:51,685 clean_steps INFO] Cleaning output for '06_caprieval' using 14 cores.
[2024-08-23 10:19:51,701 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:51,701 clean_steps INFO] Cleaning output for '07_emref' using 14 cores.
[2024-08-23 10:19:52,262 libtimer INFO] cleaning output files took 1 seconds
[2024-08-23 10:19:52,262 clean_steps INFO] Cleaning output for '08_caprieval' using 14 cores.
[2024-08-23 10:19:52,277 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:52,277 clean_steps INFO] Cleaning output for '09_clustfcc' using 14 cores.
[2024-08-23 10:19:52,398 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:52,398 clean_steps INFO] Cleaning output for '10_seletopclusts' using 14 cores.
[2024-08-23 10:19:52,522 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:52,522 clean_steps INFO] Cleaning output for '11_caprieval' using 14 cores.
[2024-08-23 10:19:52,541 libtimer INFO] cleaning output files took 0 seconds
[2024-08-23 10:19:52,542 cli INFO] This HADDOCK3 run took: 26 minutes and 13 seconds
[2024-08-23 10:19:52,542 cli INFO] Finished at 23/08/2024 10:19:52. Au revoir! 再见! Agur!

sverhoeven added the bug (Something isn't working) label Aug 23, 2024
mgiulini self-assigned this Aug 23, 2024
VGPReys self-assigned this Aug 23, 2024
@VGPReys (Contributor) commented Aug 23, 2024

⚠️ docking-protein-protein-test does not reproduce the error! ⚠️

@VGPReys (Contributor) commented Aug 23, 2024

in capri_ss.tsv, cluster id == - for all structures !

@sverhoeven (Contributor, Author)

in capri_ss.tsv, cluster id == - for all structures !

While capri_clt.tsv does have clusters.
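
For reference, this combination (only "-" cluster ids in capri_ss.tsv while a numeric cluster id is requested) is exactly what makes gb_cluster.get_group(cl_id) fail. A small standalone reproduction, with column names mirroring capri_ss.tsv and made-up scores:

    import pandas as pd

    # capri_ss.tsv in this run: every model has cluster_id == "-"
    df = pd.DataFrame({
        "model": ["emref_11.pdb", "emref_4.pdb"],
        "cluster_id": ["-", "-"],
        "score": [-120.0, -118.5],
    })

    gb_cluster = df.groupby("cluster_id")
    print(gb_cluster.get_group("-"))  # works: the only group that exists
    gb_cluster.get_group(1)           # raises KeyError: 1, as in scatter_plot_plotly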

@mgiulini (Contributor)

Nothing is wrong in haddock3-analyse itself (except the log message, which could indeed be improved); rather, the problem comes from here:

"cluster_id": None,
"cluster_ranking": None,
"model-cluster_ranking": None,
(introduced by PR #928).
Having less_io = false and less_io = true follow two completely different execution schemes is quite error-prone, not to mention the maintenance cost.

This was already discussed in #862.

@rvhonorato (Member)

Indeed, all the logic should be moved away from the modules - see #970, I can handle this one @VGPReys @mgiulini

rvhonorato self-assigned this and unassigned mgiulini and VGPReys Aug 23, 2024
@mgiulini (Contributor)

Indeed, all the logic should be moved away from the modules - see #970, I can handle this one @VGPReys @mgiulini

How is that PR related to the code duplication that gave rise to this bug?

@rvhonorato (Member)

This should not be two paths but only one, using extract_data_from_capri_class; then we can remove a lot of the code that is splitting tasks, merging files, etc.:

if less_io and isinstance(engine, Scheduler):
    jobs = engine.results
    extract_data_from_capri_class(
        capri_objects=jobs,
        output_fname=Path(".", "capri_ss.tsv"),
        sort_key=self.params["sortby"],
        sort_ascending=self.params["sort_ascending"],
    )
else:
    jobs = merge_data(jobs)
    # Each job created one .tsv, unify them:
    rearrange_ss_capri_output(
        output_name="capri_ss.tsv",
        output_count=len(jobs),
        sort_key=self.params["sortby"],
        sort_ascending=self.params["sort_ascending"],
    )
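
A rough sketch of the single-path version being suggested, assuming every engine exposes .results with the CAPRI objects (hypothetical illustration, not the actual change in #978):

    # Hypothetical unified version of the branch above: always hand the CAPRI
    # objects to extract_data_from_capri_class, so less_io no longer selects a
    # separate output scheme.
    capri_objects = engine.results  # assumption: available for all engine types
    extract_data_from_capri_class(
        capri_objects=capri_objects,
        output_fname=Path(".", "capri_ss.tsv"),
        sort_key=self.params["sortby"],
        sort_ascending=self.params["sort_ascending"],
    )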

Working on it at 977-haddock3-analyse-keyerror

rvhonorato linked a pull request Aug 23, 2024 that will close this issue

@rvhonorato (Member)

Thanks for reporting it @sverhoeven. A small oversight on my part: the caprieval module was taking information from previous modules directly from the PDBFile object instead of from the ontology object (as it should).

That analysis/11_caprieval_analysis/report.html exists.

Fix confirmed with #978:

$ ls -ltrh run1-full-like-webapp/analysis/11_caprieval_analysis/report.html
-rw-r--r-- 1 rodrigo users 335K Aug 23 13:59 run1-full-like-webapp/analysis/11_caprieval_analysis/report.html
