metadata
license: mit
title: Beta-lactam Generator
sdk: streamlit
emoji: 📚
colorFrom: blue
colorTo: gray
short_description: App to generate and view beta-lactam structures

Beta-Lactam Molecule Generator and Viewer

Overview

This application demonstrates a drug discovery pipeline that allows users to:

  • Generate novel beta-lactam molecules using a generative AI model fine-tuned on beta-lactam structures.
  • View the generated molecules with SMILES and SAFE strings.
  • Predict select ADMET properties for the generated molecules using ADMET-AI.

Features

  • Molecule Generation:
    • Generates up to 3 beta-lactam molecules at a time.
    • Users can adjust the creativity (temperature) of the generation process. Higher values lead to more diverse output.
    • Generated molecules are named 'Mol01' to 'Mol03'.
  • Molecule Viewing:
    • Displays molecule structures using Streamlit.
    • Option to view molecules as SMILES or SAFE encodings.
  • ADMET Property Prediction:
    • Integrates ADMET-AI to predict select properties.
    • Displays predicted properties of each molecule.
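
To illustrate why a higher creativity (temperature) value tends to give more diverse output, here is a minimal sketch of temperature-scaled sampling probabilities. It is not the app's code: the scores are made-up values, and `softmax_with_temperature` and `entropy` are hypothetical helper names. Dividing the scores by a larger temperature flattens the probability distribution, so less likely tokens are sampled more often.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; larger means a more spread-out distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical next-token scores, for illustration only.
example_logits = [2.0, 1.0, 0.5, -1.0]

low = softmax_with_temperature(example_logits, 0.5)
high = softmax_with_temperature(example_logits, 2.0)

print(f"T=0.5 -> top probability {max(low):.2f}, entropy {entropy(low):.2f}")
print(f"T=2.0 -> top probability {max(high):.2f}, entropy {entropy(high):.2f}")
```

The lower temperature concentrates probability on the highest-scoring token, while the higher temperature spreads it across the alternatives.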

How to Use the App

  1. Set Generation Parameters:
    • Use the sidebar to adjust the creativity (temperature) slider.
    • Select the number of molecules to generate (maximum of 3).
  2. Generate Molecules:
    • Click the 'Generate Molecules' button.
    • Generated molecules will appear with their structures, strings, and predicted ADMET properties.
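
The two steps above amount to collecting two settings and labelling the results. The sketch below shows one way that bookkeeping could look in plain Python. `MAX_MOLECULES`, `validate_settings`, and `label_molecules` are hypothetical names, the slider's bounds are not given in this README so only positivity is checked, and the SMILES strings are placeholders rather than model output.

```python
MAX_MOLECULES = 3  # the README states a maximum of 3 molecules per run

def validate_settings(temperature, num_molecules):
    """Check the two sidebar settings before generation starts."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 1 <= num_molecules <= MAX_MOLECULES:
        raise ValueError(f"num_molecules must be between 1 and {MAX_MOLECULES}")
    return {"temperature": float(temperature), "num_molecules": int(num_molecules)}

def label_molecules(smiles_list):
    """Pair each structure string with a name such as 'Mol01', 'Mol02', ..."""
    return {f"Mol{i:02d}": smi for i, smi in enumerate(smiles_list, start=1)}

settings = validate_settings(temperature=1.0, num_molecules=2)

# Placeholder strings standing in for generated molecules
# (azetidin-2-one, the bare beta-lactam ring, and a methylated variant).
placeholders = ["O=C1CCN1", "CC1CC(=O)N1"]

labelled = label_molecules(placeholders[: settings["num_molecules"]])
print(labelled)
```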

Technical Details

  • Generative Model: Uses 'seyonec/PubChem10M_SMILES_BPE_450k', fine-tuned on beta-lactam structures.
  • ADMET Predictions: Uses the ADMET-AI library to predict molecular properties.
  • Visualization: Employs RDKit and SAFE encoding for molecule rendering.
  • Frameworks and Libraries:
    • Streamlit for the web interface.
    • Transformers library for model loading and generation.
    • RDKit for cheminformatics.

The application is intended for demonstration purposes and may require adjustments for production use.

License

This project is licensed under the terms of the MIT license.

Attributions and Acknowledgments

ChEMBL Database:

This project utilizes data from the ChEMBL Database, licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Zdrazil B, Felix E, Hunter F, et al. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Research. 2024;52(D1). doi:10.1093/nar/gkad1004 https://www.ebi.ac.uk/chembl/

SAFE Encoding:

This project uses the SAFE Encoding framework, licensed under the Apache License 2.0. Noutahi E, Gabellini C, Craig M, Lim JS, Tossou P. Gotta be SAFE: A New Framework for Molecular Design. arXiv preprint arXiv:2310.10773, 2023. https://github.com/datamol-io/safe

ADMET-AI:

This project utilizes the ADMET-AI platform for predicting ADMET properties:

Swanson K, Walther P, Leitz J, et al. ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries. bioRxiv. 2023. doi:10.1101/2023.12.28.573531 https://admet.ai.greenstonebio.com/

RDKit:

This project uses RDKit:

RDKit: Open-source cheminformatics. https://www.rdkit.org