Unimol Docking V2 strange result on posebuster set #261

simmed00 · 2024-08-28T05:28:51Z

I followed the posebuster.ipynb notebook provided in your repo, downloaded the weight file, and downloaded the eval_set zip files. Since the eval_set does not contain the PDB files, I downloaded them from the PoseBusters repo and moved them into the eval_set folder.
Everything ran smoothly, and the notebook ended up giving me an RMSD < 2 Å passing rate of about 0.8.

However, when I exported the predicted poses and ran the PoseBusters quality check on them, the passing rate dropped significantly to below 0.6, and the PB-valid passing rate dropped below 0.2. I inspected some of the predictions: in some parts of the molecule, atoms clash with each other, which is quite surprising.

Is there any difference between the passing criteria in the PoseBusters repo and those in your notebook? And are there any details I missed when running the notebook provided in your GitHub repo?

hypnopump · 2024-08-28T05:56:43Z

Hi @simmed00 , thanks for your interest in UniMol Docking!
Could you provide examples of the clashing atoms you mention? And could you describe more precisely the process you followed to calculate the RMSD? It would be great if your issue could be reproduced so we can better understand the discrepancy.

simmed00 · 2024-08-28T09:24:16Z

I attached one of the strange outputs: 5SAK_predict.zip
Basically, I followed the PoseBusters procedure shown below, where true_file is the ground-truth pose, cond_file is the protein, and test_file is the prediction from UniMol:
from posebusters import PoseBusters

true_file = root + '/' + subject + '/' + subject + '_ligand.sdf'
cond_file = root + '/' + subject + '/' + subject + '_protein.pdb'
test_file = unidock_root + '/' + subject_short + '_predict.sdf'
buster = PoseBusters(config="redock")
try:
    df = buster.bust([test_file], true_file, cond_file, full_report=True)
    print(df)
    df.to_csv(root + '/' + subject + '/unidock' + '.csv')
except Exception as e:
    print(subject, e)
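
For anyone reproducing the comparison, here is a minimal sketch of how the per-complex CSV reports written by the snippet above could be aggregated into the two rates discussed in this thread (RMSD-only, and RMSD plus all other checks). It uses only the standard library. The function name `summarize_reports` is hypothetical, and the default column name `rmsd_≤_2å` is an assumption about the PoseBusters report layout, so check it against the header of your own CSV files.

```python
import csv


def summarize_reports(csv_paths, rmsd_col="rmsd_≤_2å"):
    """Aggregate per-complex report CSVs into pass rates.

    rmsd_col is an ASSUMED column name for the RMSD <= 2 A check;
    adjust it to whatever your report header actually contains.
    """
    n_total = n_rmsd = n_valid = 0
    for path in csv_paths:
        with open(path, newline="", encoding="utf-8") as fh:
            for row in csv.DictReader(fh):
                n_total += 1
                # Every column whose value is "True"/"False" is treated as a check;
                # numeric detail columns from the full report are ignored.
                checks = {k: v == "True" for k, v in row.items()
                          if v in ("True", "False")}
                rmsd_ok = checks.get(rmsd_col, False)
                n_rmsd += rmsd_ok
                # Counted as PB-valid success only if the RMSD check
                # and every other boolean check pass.
                n_valid += rmsd_ok and all(checks.values())
    if n_total == 0:
        raise ValueError("no report rows found")
    return {
        "n": n_total,
        "rmsd_pass": n_rmsd / n_total,
        "rmsd_and_pb_valid": n_valid / n_total,
    }
```

Note that complexes for which `bust` raised an exception never produce a CSV, so they are silently missing from the denominator here. When comparing against a rate reported over the full set, dividing by the total number of complexes instead would be the stricter choice.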

simmed00 · 2024-08-28T09:28:21Z

Thank you for your prompt reply. I ran the posebuster_demo notebook, and the final result is:

results length: 428
RMSD < 0.5 : 0.08878504672897196
RMSD < 1.0 : 0.4696261682242991
RMSD < 1.5 : 0.6985981308411215
RMSD < 2.0 : 0.7780373831775701
RMSD < 3.0 : 0.866822429906542
RMSD < 5.0 : 0.9299065420560748
avg RMSD : 1.716739050074667

results length: 428
RMSD < 0.5 : 0.08878504672897196
RMSD < 1.0 : 0.4766355140186916
RMSD < 1.5 : 0.7032710280373832
RMSD < 2.0 : 0.7850467289719626
RMSD < 3.0 : 0.8738317757009346
RMSD < 5.0 : 0.9369158878504673
avg RMSD : 1.6454183516713314

So I assume I ran it correctly, since this is not very far from the result in your paper.
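
For reference, each "RMSD < t" line above is the fraction of the 428 complexes whose RMSD falls below the cutoff t. A minimal sketch of such a summary follows. This is not the notebook's actual code, and the function name `rmsd_summary` is hypothetical.

```python
def rmsd_summary(rmsds, thresholds=(0.5, 1.0, 1.5, 2.0, 3.0, 5.0)):
    """Return the fraction of RMSD values below each threshold, plus the mean."""
    if not rmsds:
        raise ValueError("rmsds must not be empty")
    n = len(rmsds)
    # Strict "<" comparison, matching the "RMSD < t" labels in the printout.
    summary = {t: sum(r < t for r in rmsds) / n for t in thresholds}
    summary["avg"] = sum(rmsds) / n
    return summary
```

Multiplying a reported fraction by the set size recovers the count: 0.7780373831775701 × 428 = 333 complexes under 2 Å in the first block, and 0.7850467289719626 × 428 = 336 in the second.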

hypnopump · 2024-08-28T12:18:36Z

Hi again @simmed00 , your issue was reproduced. The output you shared did indeed have some steric clashes, whereas the expected output should not.

Current behaviour of posebuster_demo.ipynb notebook: [image not preserved]
Expected behaviour: [image not preserved]
Reason: It seems the -steric-clash-fix option mentioned in the README.md was not passed by default in the notebook demo.

This has been fixed in the main branch (please update your code to use it!). Please feel free to close the issue.

simmed00 · 2024-08-28T14:41:19Z

Thanks for the fix. I added the -steric-clash-fix option back to the inference call, and the notebook results changed a little. They look somewhat lower than those in the paper, as shown below:

results length: 428
RMSD < 0.5 : 0.05841121495327103
RMSD < 1.0 : 0.28738317757009346
RMSD < 1.5 : 0.5560747663551402
RMSD < 2.0 : 0.7126168224299065
RMSD < 3.0 : 0.8387850467289719
RMSD < 5.0 : 0.9065420560747663
avg RMSD : 2.116286272244654

results length: 428
RMSD < 0.5 : 0.08878504672897196
RMSD < 1.0 : 0.3621495327102804
RMSD < 1.5 : 0.6261682242990654
RMSD < 2.0 : 0.7476635514018691
RMSD < 3.0 : 0.8457943925233645
RMSD < 5.0 : 0.9112149532710281
avg RMSD : 1.976056631357393

I then downloaded the predicted poses and checked them. The earlier 5SAK example is now fixed. I ran the same PoseBusters code as before on the new predictions. This time the warning "Can't kekulize mol" appears quite often (100+ times). The fraction passing the RMSD < 2 Å criterion is 194/428 = 45%, and the fraction passing both RMSD < 2 Å and PB-valid is only 160/428 = 37%. There is still a gap relative to the reported results. Are there any other details I missed when running the docking?

hypnopump mentioned this issue on Aug 28, 2024: Notebook cmd missing arg #264 (merged)
