Integrating Uni-Mol2 with unimol_tools #285

Open
YikunHan42 opened this issue Oct 28, 2024 · 3 comments

@YikunHan42

Description:
I’m working to integrate Uni-Mol2 into the unimol_tools repository and have run into configuration compatibility issues. I modified the architecture and model configuration and replaced mol.dict.txt with mol.dict_new.txt, but I’m not getting the expected performance.

Here’s a summary of the integration steps I took:

  1. Modified Architecture in unimol.py

    • Substituted molecule_architecture parameters to match Uni-Mol2 specifications.
    • Adjustments include setting encoder_layers to 12 and encoder_embed_dim to 768, with the other parameters set as per the Uni-Mol2-84M specifications.
  2. Model Configuration

    • Updated MODEL_CONFIG to point to the Uni-Mol2 checkpoint (checkpoint-84M.pt).
    • Adjusted path in weights and confirmed the correct .pt file location.
  3. Dictionary Adjustment

    • Created a new dictionary file, mol.dict_new.txt, with 128 rows instead of the original 31, to match Uni-Mol2’s expected 128x128 dimension.
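
As a sanity check on the dictionary step above, here is a minimal sketch that counts the tokens in a dictionary file and flags duplicates. It assumes one token per line, which is not confirmed in this thread; `read_dictionary` and `check_dictionary` are hypothetical helper names, not part of unimol_tools.

```python
from collections import Counter
from pathlib import Path


def read_dictionary(path):
    """Return the non-empty lines of a dictionary file, in file order.

    Assumption: the file holds one token per line.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def check_dictionary(path, expected_size):
    """Return a list of problems found; an empty list means none were found."""
    tokens = read_dictionary(path)
    problems = []
    if len(tokens) != expected_size:
        problems.append(f"expected {expected_size} tokens, found {len(tokens)}")
    # A token that appears twice would map to two different embedding rows.
    dupes = sorted(tok for tok, count in Counter(tokens).items() if count > 1)
    if dupes:
        problems.append(f"duplicate tokens: {dupes}")
    return problems
```

For example, `check_dictionary("mol.dict_new.txt", 128)` would report a row-count mismatch or repeated tokens. Note that a matching row count only makes the shapes line up: each row index selects one embedding row, so the token order would also have to match the order used when the checkpoint was trained.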

Observed Issue:
The model is not performing as anticipated, and the integration does not seem fully compatible with Uni-Mol2’s configuration. Specifically:

  • The expected behavior and accuracy are not achieved despite matching configurations to Uni-Mol2 specs.
  • Potential misalignment in the dictionary file or model configuration could be causing the issue.
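
One way to narrow down a possible misalignment is to compare parameter names and shapes between the modified model and the checkpoint. The sketch below works on plain name-to-shape dictionaries so it does not depend on any library; `diff_shapes` is a hypothetical helper, and the parameter names in the example are made up for illustration rather than taken from Uni-Mol2. With PyTorch, such a mapping could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`; how the weights are nested inside checkpoint-84M.pt is not stated in this thread, so that part is left out.

```python
def diff_shapes(model_shapes, checkpoint_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns a dict with:
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not define
      mismatched - shared names whose shapes differ, as (model, checkpoint)
    """
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    mismatched = {}
    for name in model_shapes:
        if name in checkpoint_shapes:
            a = tuple(model_shapes[name])
            b = tuple(checkpoint_shapes[name])
            if a != b:
                mismatched[name] = (a, b)
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Illustrative input only: these parameter names are hypothetical.
example_model = {"embed_tokens.weight": (31, 512), "encoder.norm.weight": (512,)}
example_ckpt = {"embed_tokens.weight": (128, 768), "pair_head.weight": (768, 768)}
print(diff_shapes(example_model, example_ckpt))
```

If many names end up in `missing` or `unexpected`, the two architectures differ in structure rather than just in size, and matching config values alone would not make the checkpoint load into the model meaningfully.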

Questions:

  1. Could you identify any potential missteps or overlooked configuration requirements to ensure seamless integration?
  2. Are there plans to add official support for Uni-Mol2 within the unimol_tools repository? This would be highly beneficial to ensure compatibility and streamline the setup process.

Steps to Reproduce:

  1. Modify unimol.py as per the steps above.
  2. Adjust MODEL_CONFIG to point to the checkpoint-84M.pt.
  3. Replace mol.dict.txt with mol.dict_new.txt containing 128 rows.
  4. Run the model and observe deviations from expected performance.

Thank you for your assistance and for considering future Uni-Mol2 support in unimol_tools!

@pansanity666

I think the data preprocessing steps are different for Uni-Mol2, so modifying the config alone is not enough.

@Naplessss
Collaborator

Yes, Uni-Mol2 uses a different data preprocessing approach from Uni-Mol, and we plan to integrate Uni-Mol2 into unimol_tools in an upcoming version. However, I'm not certain whether this will take a few weeks or a few months.
You are also welcome to contribute this feature.

@dalessioluca

It would be amazing if Uni-Mol2 were integrated with unimol_tools.
