Question Regarding Data Preprocessing for train.lmdb and valid.lmdb from MOAD in docking_v2 Directory #259

iceissey · 2024-08-15T13:48:56Z

I would like to know how the train.lmdb and valid.lmdb files in the docking_v2/protein_ligand_binding_pose_prediction_v2 directory were processed from the MOAD dataset. I have checked the code and found only the data preprocessing during inference, which generates conformations for ligand molecules. Could you please clarify if the preprocessing during training is the same as during inference?

ZhouGengmo · 2024-08-28T08:56:35Z

The data preprocessing during training is consistent with that during inference, except that the conformation clustering is fixed during training, with M = 100 and N = 10.

Additionally, the original training data contains some duplicate entries, and we do not apply any special handling to them. We will release the processed LMDB files later.
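For readers trying to picture the clustering step mentioned above, here is a minimal sketch. It assumes that M is the number of conformers generated per ligand and N is the number of representatives kept after clustering; the thread does not spell this out. The conformer generator is a hypothetical stand-in that produces random coordinates, and the selection uses greedy farthest-point sampling on unaligned RMSD, which is a swapped-in technique for illustration and not necessarily what the repository does.

```python
import math
import random

M = 100  # assumed meaning: conformers generated per ligand
N = 10   # assumed meaning: cluster representatives kept


def rmsd(a, b):
    """Root-mean-square deviation between two conformers (no alignment)."""
    sq = sum((p - q) ** 2 for xa, xb in zip(a, b) for p, q in zip(xa, xb))
    return math.sqrt(sq / len(a))


def pick_representatives(conformers, n):
    """Greedy farthest-point selection of up to n conformer indices."""
    chosen = [0]
    # Distance from each conformer to its nearest chosen representative.
    nearest = [rmsd(c, conformers[0]) for c in conformers]
    while len(chosen) < min(n, len(conformers)):
        idx = max(range(len(conformers)), key=nearest.__getitem__)
        if nearest[idx] == 0.0:
            break  # every remaining conformer duplicates a chosen one
        chosen.append(idx)
        for i, c in enumerate(conformers):
            nearest[i] = min(nearest[i], rmsd(c, conformers[idx]))
    return chosen


def fake_conformers(m, n_atoms=12, seed=0):
    """Hypothetical stand-in for a real conformer generator."""
    rng = random.Random(seed)
    return [
        [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_atoms)]
        for _ in range(m)
    ]


conformers = fake_conformers(M)
representatives = pick_representatives(conformers, N)
```

In a real pipeline the random coordinates would be replaced by conformers from a cheminformatics toolkit, and the selection step by whatever clustering the training code actually uses.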

iceissey changed the title ~~Question Regarding Data Preprocessing for train.lmdb and test.lmdb from MOAD in docking_v2 Directory~~ Question Regarding Data Preprocessing for train.lmdb and valid.lmdb from MOAD in docking_v2 Directory Aug 15, 2024
