Unimol Docking V2 strange result on posebuster set #261

Open
simmed00 opened this issue Aug 28, 2024 · 5 comments

@simmed00

I followed the posebuster.ipynb notebook provided in your repo, and downloaded the weight file and the eval_set zip files. Since the eval_set does not contain the PDB files, I downloaded them from the PoseBusters repo and moved them into the eval_set folder.
Everything ran smoothly, and the notebook ended up giving me an RMSD < 2 Å passing rate of about 0.8.

However, when I export the predicted poses and run the PoseBusters quality checks on them, the passing rate drops significantly, to below 0.6, and the passing rate with PB-valid drops below 0.2. I checked some of the predictions: in some parts of the molecule the atoms clash with each other, which is quite surprising.

Is there any difference between the passing criteria in the PoseBusters repo and the passing criteria used in your notebook? And are there any details I missed when running the notebook provided in your GitHub repo?

@hypnopump
Collaborator

hypnopump commented Aug 28, 2024

Hi @simmed00 , thanks for your interest in UniMol Docking!
Could you provide examples of the clashing atoms you mention? And can you describe in more detail the process you follow to calculate the RMSD? It would be great if your issue could be reproduced, so that we can better understand the discrepancy.

@simmed00
Author

I attached one of the strange outputs: 5SAK_predict.zip
Basically, I followed the PoseBusters approach shown below, where true_file is the ground-truth pose, cond_file is the protein, and test_file is the prediction from UniMol:
from posebusters import PoseBusters

# root, subject, unidock_root and subject_short are defined elsewhere in my script
true_file = root + '/' + subject + '/' + subject + '_ligand.sdf'
cond_file = root + '/' + subject + '/' + subject + '_protein.pdb'
test_file = unidock_root + '/' + subject_short + '_predict.sdf'
buster = PoseBusters(config="redock")
try:
    # run all redocking checks on the predicted pose and keep the full report
    df = buster.bust([test_file], true_file, cond_file, full_report=True)
    print(df)
    df.to_csv(root + '/' + subject + '/unidock' + '.csv')
except Exception as e:
    print(subject, e)
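
For reference, this is roughly how the per-complex reports could be turned into the two rates I quote (RMSD < 2 Å, and RMSD < 2 Å plus PB-valid). It is only a sketch: `pass_rates` is a hypothetical helper, the check names (`rmsd_within_2A`, `sanitization`, `no_internal_clash`) and complex ids are made up for illustration, and it assumes each report has already been reduced to a dict of booleans. Complexes where the run raised an exception are stored as `None` and still count in the denominator.

```python
def pass_rates(reports, rmsd_key="rmsd_within_2A"):
    """Return (RMSD pass rate, RMSD + all-checks pass rate).

    reports maps a complex id to a dict of check name -> bool,
    or to None when the check run failed for that complex.
    The key names are hypothetical, not the real report columns.
    """
    total = len(reports)
    if total == 0:
        raise ValueError("no reports to summarize")
    rmsd_ok = 0
    rmsd_and_valid = 0
    for checks in reports.values():
        if not checks:
            # A failed run is counted as a failure, not dropped.
            continue
        if checks.get(rmsd_key, False):
            rmsd_ok += 1
            # "Valid" here means every recorded check passed.
            if all(checks.values()):
                rmsd_and_valid += 1
    return rmsd_ok / total, rmsd_and_valid / total


# Made-up reports for four complexes, for illustration only.
reports = {
    "complex_a": {"rmsd_within_2A": True, "sanitization": True, "no_internal_clash": True},
    "complex_b": {"rmsd_within_2A": True, "sanitization": True, "no_internal_clash": False},
    "complex_c": {"rmsd_within_2A": False, "sanitization": True, "no_internal_clash": True},
    "complex_d": None,  # the run raised an exception
}
rmsd_rate, valid_rate = pass_rates(reports)
print(rmsd_rate, valid_rate)
```

Keeping failed runs in the denominator matters here, because dropping them would make the two rates look better than they are.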

@simmed00
Author

Thank you for your prompt reply. I ran the posebuster_demo notebook, and the final results are:

results length: 428
RMSD < 0.5 : 0.08878504672897196
RMSD < 1.0 : 0.4696261682242991
RMSD < 1.5 : 0.6985981308411215
RMSD < 2.0 : 0.7780373831775701
RMSD < 3.0 : 0.866822429906542
RMSD < 5.0 : 0.9299065420560748
avg RMSD : 1.716739050074667


results length: 428
RMSD < 0.5 : 0.08878504672897196
RMSD < 1.0 : 0.4766355140186916
RMSD < 1.5 : 0.7032710280373832
RMSD < 2.0 : 0.7850467289719626
RMSD < 3.0 : 0.8738317757009346
RMSD < 5.0 : 0.9369158878504673
avg RMSD : 1.6454183516713314

So I assume I ran it correctly, since this is not far from the result in your paper.
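
To make clear what these numbers mean, here is a minimal sketch of how such a summary can be computed from a list of per-complex RMSD values. It is my own illustration, not the notebook's code: `summarize_rmsd` is a hypothetical helper, and the example values are made up rather than taken from the benchmark.

```python
from statistics import mean

THRESHOLDS = (0.5, 1.0, 1.5, 2.0, 3.0, 5.0)


def summarize_rmsd(rmsds, thresholds=THRESHOLDS):
    """Return the fraction of RMSD values below each threshold, plus the mean."""
    if not rmsds:
        raise ValueError("no RMSD values to summarize")
    n = len(rmsds)
    summary = {t: sum(r < t for r in rmsds) / n for t in thresholds}
    summary["avg"] = mean(rmsds)
    return summary


# Made-up RMSD values (in angstroms), for illustration only.
example = [0.4, 0.9, 1.2, 1.9, 2.5, 6.0]
stats = summarize_rmsd(example)
print("results length:", len(example))
for t in THRESHOLDS:
    print(f"RMSD < {t} : {stats[t]}")
print("avg RMSD :", stats["avg"])
```

Each fraction is taken over all entries in the list, so a complex that is missing from the list changes every rate.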

@hypnopump
Collaborator

hypnopump commented Aug 28, 2024

Hi again @simmed00 , your issue was reproduced. The output you shared did indeed have some steric clashes, whereas the expected output should not.

  • Current behaviour of posebuster_demo.ipynb notebook:
    [Image: screenshot, 2024-08-28 14:05:34]
  • Expected behaviour:
    [Image: screenshot, 2024-08-28 14:12:19]
  • Reason: It seems the -steric-clash-fix flag mentioned in the README.md was not passed by default in the notebook demo.

This has been fixed in the main branch (please update your code to use it!). Please feel free to close the issue.

@simmed00
Author

Thanks for the fix. I added the -steric-clash-fix flag back to the inference call. The notebook results changed somewhat and now look a bit lower than those in the paper, as shown below:


results length: 428
RMSD < 0.5 : 0.05841121495327103
RMSD < 1.0 : 0.28738317757009346
RMSD < 1.5 : 0.5560747663551402
RMSD < 2.0 : 0.7126168224299065
RMSD < 3.0 : 0.8387850467289719
RMSD < 5.0 : 0.9065420560747663
avg RMSD : 2.116286272244654


results length: 428
RMSD < 0.5 : 0.08878504672897196
RMSD < 1.0 : 0.3621495327102804
RMSD < 1.5 : 0.6261682242990654
RMSD < 2.0 : 0.7476635514018691
RMSD < 3.0 : 0.8457943925233645
RMSD < 5.0 : 0.9112149532710281
avg RMSD : 1.976056631357393

I then downloaded the predicted poses and checked them. The earlier 5SAK example is now fixed. I ran the same PoseBusters code on the new predictions. This time the warning "Can't kekulize mol" appears quite often (100+ times). The fraction passing the RMSD < 2 Å criterion is 194/428 = 45%, and the fraction passing both RMSD < 2 Å and PB-valid is only 160/428 = 37%. There is still a gap relative to the reported results. Are there any other details I missed when running the docking?
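
One way to narrow down the gap between the notebook's RMSD < 2 Å rate and the PoseBusters one would be to list the complexes on which the two disagree. The sketch below assumes the notebook RMSDs and the PoseBusters RMSD pass flags have each been collected into a dict keyed by complex id. `disagreements` is a hypothetical helper and the ids and values are made up.

```python
def disagreements(notebook_rmsd, pb_pass, threshold=2.0):
    """Compare two RMSD pass/fail sources keyed by complex id.

    Returns three sorted lists: ids passing only in the notebook,
    ids passing only in the PoseBusters run, and ids with no
    PoseBusters result at all (e.g. the run raised an exception).
    """
    only_notebook, only_pb, missing = [], [], []
    for cid, rmsd in notebook_rmsd.items():
        if cid not in pb_pass:
            missing.append(cid)
            continue
        nb_ok = rmsd < threshold
        if nb_ok and not pb_pass[cid]:
            only_notebook.append(cid)
        elif pb_pass[cid] and not nb_ok:
            only_pb.append(cid)
    return sorted(only_notebook), sorted(only_pb), sorted(missing)


# Made-up inputs, for illustration only.
notebook_rmsd = {"c1": 0.8, "c2": 1.7, "c3": 3.4, "c4": 1.1}
pb_pass = {"c1": True, "c2": False, "c3": False}
only_nb, only_pb, missing = disagreements(notebook_rmsd, pb_pass)
print("pass only in notebook:", only_nb)
print("pass only in PoseBusters:", only_pb)
print("no PoseBusters result:", missing)
```

If most of the disagreeing ids also show the kekulization warning, that would suggest the gap comes from how the predicted molecules are read rather than from the poses themselves, but that is only a guess on my side.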
