Bad performance when there is no RMSD information #66

Open
Dadiao-shuai opened this issue Dec 25, 2023 · 2 comments

@Dadiao-shuai

Dadiao-shuai commented Dec 25, 2023

I compared two experiments: normal training versus training without RMSD. I expected that, as long as the label and the affinity label are provided, training would not differ much. However, the RMSD-free training produced bizarre performance:
image

I used the same arguments and gninatypes files for both runs, training from crossdock_default2018.caffemodel with a modified default2018.model.
In the RMSD-free types file the RMSD column is removed, so the lines look like this:

0 3.906 pdb2019_refi_train_gninatypes/4u6w/4u6w_rec.gninatypes redock_default2018_pdbbind_v2019_docked_gninatypes/4u6w_docked_7.gninatypes
1 5.47 pdb2019_refi_train_gninatypes/1gi1/1gi1_rec.gninatypes pdb2019_refi_train_gninatypes/1gi1/1gi1_ligand.gninatypes
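
A minimal sketch of how such an RMSD-free types file could be produced, assuming the original lines have the layout `label affinity rmsd receptor ligand` (that layout, the helper name `strip_rmsd_column`, and the RMSD value in the usage example are assumptions for illustration, not taken from the post):

```python
def _is_float(text: str) -> bool:
    """Return True if text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def strip_rmsd_column(line: str) -> str:
    """Drop the third field of a types-file line when it looks like an RMSD.

    Assumption: lines are whitespace-separated and laid out as
    `label affinity rmsd receptor ligand`. A line whose third field is
    not numeric (for example, already a receptor path) is returned with
    its fields unchanged.
    """
    fields = line.split()
    if len(fields) >= 5 and _is_float(fields[2]):
        del fields[2]  # remove the assumed RMSD column
    return " ".join(fields)


# Hypothetical input line; the RMSD value 4.21 is made up for the example.
with_rmsd = "0 3.906 4.21 a/4u6w_rec.gninatypes b/4u6w_docked_7.gninatypes"
without_rmsd = strip_rmsd_column(with_rmsd)
print(without_rmsd)
```

Checking that the third field is numeric, rather than dropping it unconditionally, keeps the helper safe to run on a file that has already been stripped.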

This is the model's data layer. I commented out the top rmsd_true. In the TEST phase I set has_rmsd to false; in the TRAIN phase I set balanced to true, stratify_receptor to false, and has_rmsd to false:

layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  # top: "rmsd_true"
  include {
    phase: TEST
  }
  molgrid_data_param {
    source: "TESTFILE"
    batch_size: 50
    dimension: 23.5
    resolution: 0.500000
    shuffle: false
    ligmap: "completelig"
    recmap: "completerec"
    balanced: false
    has_affinity: true
    has_rmsd: false
    root_folder: "DATA_ROOT"
  }
}
  
layer {
  name: "data"
  type: "MolGridData"
  top: "data"
  top: "label"
  top: "affinity"
  # top: "rmsd_true"
  include {
    phase: TRAIN
  }
  molgrid_data_param {
    source: "TRAINFILE"
    batch_size: 50
    dimension: 23.5
    resolution: 0.500000
    shuffle: true
    balanced: true
    jitter: 0.000000
    ligmap: "completelig"
    recmap: "completerec"
    stratify_receptor: false
    stratify_affinity_min: 0
    stratify_affinity_max: 0
    stratify_affinity_step: 1.000000
    has_affinity: true
    has_rmsd: false
    random_rotation: true
    random_translate: 6
    root_folder: "DATA_ROOT"
  }
}

I also deleted the rmsd layer:

layer {
  name: "rmsd"
  type: "AffinityLoss"
  bottom: "affinity_output"
  bottom: "affinity"
  top: "rmsd"
...
@JonasLi-19

I have wondered about this as well: how can a gnina model be trained to predict only affinity, without the CNNscore binary label?

Maybe gnina/script/affinity could handle this task? I am not sure whether that is recommended, though.

@dkoes
Contributor

dkoes commented Jan 10, 2024

It really looks like your prediction is going through a sigmoid (values range from 0 to 1), which an affinity prediction should not do.
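
To illustrate why a 0-to-1 output range matters here, the sketch below applies a plain logistic sigmoid to a few raw values and compares the results with the affinity labels from the types lines quoted earlier (3.906 and 5.47). The raw values are arbitrary and the code is a standalone illustration, not part of gnina or of the model in question:

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function; its output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Affinity labels taken from the example types lines in the post.
targets = [3.906, 5.47]

# Arbitrary raw network outputs, chosen only for illustration.
raw_outputs = [-5.0, 0.0, 5.0, 50.0]
squashed = [sigmoid(x) for x in raw_outputs]

# However large the raw output, the squashed value never exceeds 1,
# so the gap to each label stays large.
smallest_gaps = [t - max(squashed) for t in targets]

print(squashed)
print(smallest_gaps)
```

If predictions cluster between 0 and 1 while the labels sit well above 1, one thing worth checking is whether the affinity output is being routed through a layer meant for the pose-score branch.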
