---
license: mit
title: Beta-lactam Generator
sdk: streamlit
emoji: 📚
colorFrom: blue
colorTo: gray
short_description: App to generate and view beta-lactam structures
---

# Beta-Lactam Molecule Generator and Viewer

## Overview

This application demonstrates a drug discovery pipeline that allows users to:

* Generate novel beta-lactam molecules using a generative AI model that was fine-tuned on beta-lactam structures.
* View the generated molecules with SMILES and SAFE strings.
* Predict select ADMET properties for the generated molecules using ADMET-AI.

## Features

* **Molecule Generation**:
  * Generates up to 3 beta-lactam molecules at a time.
  * Users can adjust the creativity (temperature) of the generation process. A higher value leads to more diverse output.
  * Generated molecules are named 'Mol01' to 'Mol03'.
* **Molecule Viewing**:
  * Displays molecule structures using Streamlit.
  * Shows each molecule as SMILES and SAFE encodings.
* **ADMET Property Prediction**:
  * Integrates ADMET-AI to predict select properties.
  * Displays the predicted properties of each molecule.

## How to Use the App

1. Set Generation Parameters:
   * Use the sidebar to adjust the creativity (temperature) slider.
   * Select the number of molecules to generate (maximum of 3).
2. Generate Molecules:
   * Click the 'Generate Molecules' button.
   * Generated molecules will appear with their structures, SMILES and SAFE strings, and predicted ADMET properties.

## Technical Details

* **Generative Model**: Uses the 'seyonec/PubChem10M_SMILES_BPE_450k' model, fine-tuned on beta-lactam structures.
* **ADMET Predictions**: Uses the ADMET-AI library to predict molecular properties.
* **Visualization**: Employs RDKit and SAFE encoding for molecule rendering.
* **Frameworks and Libraries**:
  * **Streamlit** for the web interface.
  * **Transformers** library for model loading and generation.
  * **RDKit** for cheminformatics.

### The application is intended for demonstration purposes only.
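
### Illustration: Temperature and Output Diversity

The Features section states that a higher creativity (temperature) value leads to more diverse output, and that generated molecules are named 'Mol01' to 'Mol03'. The sketch below illustrates both ideas in plain Python. It is not the app's code: the toy logits are invented, and `softmax_with_temperature`, `entropy`, and `molecule_names` are hypothetical helpers written only for this illustration. It assumes the usual convention that token scores are divided by the temperature before sampling.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities after dividing by temperature.

    Assumption: this mirrors the common sampling convention; it is not taken
    from the app's source code.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; larger values mean a flatter distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def molecule_names(count, max_count=3):
    """Build display names 'Mol01', 'Mol02', ... capped at max_count.

    Hypothetical helper: the cap of 3 comes from the README, the clamping
    behaviour is an assumption.
    """
    n = max(1, min(count, max_count))
    return [f"Mol{i:02d}" for i in range(1, n + 1)]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, 0.1]  # invented scores for four candidate tokens
    for temperature in (0.5, 1.0, 1.5):
        probs = softmax_with_temperature(toy_logits, temperature)
        rounded = [round(p, 3) for p in probs]
        print(f"T={temperature}: probs={rounded}, entropy={entropy(probs):.3f}")
    print(molecule_names(3))
```

At a low temperature most of the probability mass sits on the highest-scoring token, so repeated sampling tends to yield similar strings. Raising the temperature flattens the distribution (higher entropy), which is why the slider produces more varied molecules at higher settings.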

## License

This project is licensed under the terms of the MIT license.

## Attributions and Acknowledgments

### ChEMBL Database:

This project utilizes data from the ChEMBL Database, licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

Zdrazil B, Felix E, Hunter F, et al. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Research. 2024;52(D1). doi:10.1093/nar/gkad1004

https://www.ebi.ac.uk/chembl/

### SAFE Encoding:

This project uses the SAFE Encoding framework, licensed under the Apache License 2.0.

Noutahi E, Gabellini C, Craig M, Lim JS, Tossou P. Gotta be SAFE: A New Framework for Molecular Design. arXiv preprint arXiv:2310.10773, 2023.

https://github.com/datamol-io/safe

### This project utilizes the ADMET-AI platform for predicting ADMET properties:

Swanson K, Walther P, Leitz J, et al. ADMET-AI: A machine learning ADMET platform for evaluation of large-scale chemical libraries. bioRxiv. 2023. doi:10.1101/2023.12.28.573531

https://admet.ai.greenstonebio.com/

### This project uses RDKit:

RDKit: Open-source cheminformatics. https://www.rdkit.org