HUBioDataLab/DrugGEN

mgyigit committed on Mar 29

Commit 540e177 (verified) · 1 Parent(s): 21ae81c

Update app.py

Files changed (1)

app.py +13 -9

@@ -225,13 +225,17 @@ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
                     ## Model Variations
                     ### DrugGEN-AKT1
-                    This model is designed to generate molecules targeting the human AKT1 protein (UniProt ID: P31749).
                     ### DrugGEN-CDK2
-                    This model is designed to generate molecules targeting the human CDK2 protein (UniProt ID: P24941).
                     ### DrugGEN-NoTarget
-                    This is a general-purpose model that generates diverse drug-like molecules without targeting a specific protein.
                     - Useful for exploring chemical space, generating diverse scaffolds, and creating molecules with drug-like properties.
                     For more details, see our [paper on arXiv](https://arxiv.org/abs/2302.07868).
@@ -247,13 +251,13 @@ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
                     - **Runtime**: Time taken to generate or evaluate the molecules
                     ### Novelty Metrics
-                    - **Novelty (Train)**: Percentage of molecules not found in the training set
-                    - **Novelty (Inference)**: Percentage of molecules not found in the test set
                     - **Novelty (Real Inhibitors)**: Percentage of molecules not found in known inhibitors of the target protein
                     ### Structural Metrics
-                    - **Average Length**: Average component length in the generated molecules
-                    - **Mean Atom Type**: Average distribution of atom types
                     - **Internal Diversity**: Diversity within the generated set (higher is more diverse)
                     ### Drug-likeness Metrics
@@ -261,8 +265,8 @@ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
                     - **SA Score (Synthetic Accessibility)**: Score from 1-10 indicating ease of synthesis (lower is better)
                     ### Similarity Metrics
-                    - **SNN ChEMBL**: Similarity to ChEMBL molecules (higher means more similar to known drug-like compounds)
-                    - **SNN Real Inhibitors**: Similarity to known drugs (higher means more similar to approved drugs)
                 """)
             model_name = gr.Radio(

                     ## Model Variations
                     ### DrugGEN-AKT1
+                    This model is designed to generate molecules targeting the human AKT1 protein (UniProt ID: P31749). Trained with [2,607 bioactive compounds](https://drive.google.com/file/d/1B2OOim5wrUJalixeBTDKXLHY8BAIvNh-/view?usp=drive_link).
+                    Molecules larger than 45 heavy atoms were excluded.
                     ### DrugGEN-CDK2
+                    This model is designed to generate molecules targeting the human CDK2 protein (UniProt ID: P24941). Trained with [1,817 bioactive compounds](https://drive.google.com/file/d/1C0CGFKx0I2gdSfbIEgUO7q3K2S1P9ksT/view?usp=drive_link).
+                    Molecules larger than 38 heavy atoms were excluded.
                     ### DrugGEN-NoTarget
+                    This is a general-purpose model that generates diverse drug-like molecules without targeting a specific protein. Trained with a general [ChEMBL dataset](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link).
+                    Molecules larger than 45 heavy atoms were excluded.
                     - Useful for exploring chemical space, generating diverse scaffolds, and creating molecules with drug-like properties.
                     For more details, see our [paper on arXiv](https://arxiv.org/abs/2302.07868).
                     - **Runtime**: Time taken to generate or evaluate the molecules
                     ### Novelty Metrics
+                    - **Novelty (Train)**: Percentage of molecules not found in the [training set](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link)
+                    - **Novelty (Inference)**: Percentage of molecules not found in the [test set](https://drive.google.com/file/d/1vMGXqK1SQXB3Od3l80gMWvTEOjJ5MFXP/view?usp=share_link)
                     - **Novelty (Real Inhibitors)**: Percentage of molecules not found in known inhibitors of the target protein
                     ### Structural Metrics
+                    - **Average Length**: Average number of atoms in the generated molecules, normalized by the maximum atom count (e.g., 45 for AKT1/NoTarget, 38 for CDK2)
+                    - **Mean Atom Type**: Average number of distinct atom types in the generated molecules
                     - **Internal Diversity**: Diversity within the generated set (higher is more diverse)
                     ### Drug-likeness Metrics
                     - **SA Score (Synthetic Accessibility)**: Score from 1-10 indicating ease of synthesis (lower is better)
                     ### Similarity Metrics
+                    - **SNN ChEMBL**: Similarity to [ChEMBL molecules](https://drive.google.com/file/d/1oyybQ4oXpzrme_n0kbwc0-CFjvTFSlBG/view?usp=drive_link) (higher means more similar to known drug-like compounds)
+                    - **SNN Real Inhibitors**: Similarity to the real inhibitors of the selected target (higher means more similar to the real inhibitors)
                 """)
             model_name = gr.Radio(
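
The metric descriptions added in this commit can be illustrated with a minimal Python sketch. This is not the code the Space runs: the function names are hypothetical, molecules are compared as plain strings (a real pipeline would likely canonicalize SMILES first), and fingerprints are modeled as sets of integer bit indices so that Tanimoto similarity can be computed without a cheminformatics library.

```python
from itertools import combinations


def novelty(generated, reference):
    """Percentage of generated molecules that do not appear in `reference`."""
    if not generated:
        return 0.0
    ref = set(reference)
    return 100.0 * sum(1 for mol in generated if mol not in ref) / len(generated)


def average_length(atom_counts, max_atoms):
    """Mean atom count per molecule, divided by the maximum allowed atom count
    (e.g. 45 for AKT1/NoTarget, 38 for CDK2 according to the descriptions)."""
    return sum(atom_counts) / (len(atom_counts) * max_atoms)


def mean_atom_type(atom_symbols_per_molecule):
    """Average number of distinct atom types per molecule."""
    distinct = [len(set(symbols)) for symbols in atom_symbols_per_molecule]
    return sum(distinct) / len(distinct)


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def snn(generated_fps, reference_fps):
    """Average similarity of each generated molecule to its nearest reference
    neighbour (higher means closer to the reference set)."""
    nearest = [max(tanimoto(g, r) for r in reference_fps) for g in generated_fps]
    return sum(nearest) / len(nearest)


def internal_diversity(fps):
    """One minus the mean pairwise Tanimoto similarity within a set
    (higher means more diverse). Assumes at least two fingerprints."""
    pairs = list(combinations(fps, 2))
    return 1.0 - sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    print(novelty(["CCO", "CCN", "c1ccccc1"], ["CCO"]))
    print(average_length([18, 27], max_atoms=45))
    print(mean_atom_type([["C", "C", "O"], ["C", "N", "O", "C"]]))
    print(snn([{1, 2, 3}], [{2, 3, 4}, {1, 2, 3}]))
    print(internal_diversity([{1}, {2}]))
```

Under these assumptions, the same `novelty` function covers all three novelty metrics by swapping the reference set (training set, test set, or known inhibitors of the target), and `snn` covers both similarity metrics by swapping the reference fingerprints.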