# Model documentation & parameters
**Algorithm Version**: Which model version to use.
**Target binding energy**: The desired binding energy.
**Primer SMILES**: A SMILES string used to prime the generation.
**Maximal sequence length**: The maximal number of SMILES tokens in the generated molecule.
**Number of points**: Number of points to sample with the Gaussian Process.
**Number of steps**: Number of optimization steps in the Gaussian Process optimization.
**Number of samples**: How many samples should be generated (between 1 and 50).
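As a minimal sketch of how these parameters fit together, the snippet below collects them in one validated container. `GenerationParameters` and its field names are hypothetical, chosen to mirror the labels above; they are not the GT4SD API. Only the 1 to 50 bound on the number of samples comes from this documentation; the other checks and all example values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationParameters:
    """Hypothetical container for the parameters above (not a GT4SD class)."""

    algorithm_version: str
    target_binding_energy: float
    primer_smiles: str
    maximal_sequence_length: int
    number_of_points: int
    number_of_steps: int
    number_of_samples: int

    def __post_init__(self) -> None:
        # The 1-50 range is the only bound stated in this documentation.
        if not 1 <= self.number_of_samples <= 50:
            raise ValueError("number_of_samples must be between 1 and 50")
        # Assumption: lengths and counts must be positive integers.
        for name in ("maximal_sequence_length", "number_of_points", "number_of_steps"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")


# Placeholder values chosen for illustration only.
params = GenerationParameters(
    algorithm_version="v0",
    target_binding_energy=10.0,
    primer_smiles="C",
    maximal_sequence_length=100,
    number_of_points=10,
    number_of_steps=25,
    number_of_samples=5,
)
print(params)
```

Validating at construction time means an out-of-range request, such as 51 samples, fails before any generation is attempted.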
# Model card -- AdvancedManufacturing
**Model Details**: *AdvancedManufacturing* is a sequence-based molecular generator tuned to generate catalysts. The model relies on a recurrent Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.
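To illustrate the idea of a Gaussian Process steering a search toward a target property, here is a toy sketch. It is not the authors' implementation: the latent space is one-dimensional, `toy_predictor` is a made-up function standing in for the trained binding-energy predictor, and the lower-confidence-bound proposal rule is an assumed choice. Treating `number_of_points` as the count of initial samples and `number_of_steps` as the count of optimization steps is likewise an assumption.

```python
import math
import random


def toy_predictor(z: float) -> float:
    """Made-up stand-in for a latent-space binding-energy predictor."""
    return 10.0 * math.sin(z) + 2.0 * z


def rbf(a: float, b: float, length_scale: float = 1.0) -> float:
    """Squared-exponential kernel between two scalar latent points."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        partial = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - partial) / aug[i][i]
    return x


def propose(xs, ys, candidates, noise=1e-4, kappa=2.0):
    """Pick the candidate with the lowest GP lower confidence bound."""
    offset = sum(ys) / len(ys)  # centre the observations for a zero-mean GP
    gram = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
            for i, a in enumerate(xs)]
    alpha = solve(gram, [y - offset for y in ys])

    def lower_bound(z):
        k_star = [rbf(x, z) for x in xs]
        mean = offset + sum(k * a for k, a in zip(k_star, alpha))
        weights = solve(gram, k_star)
        var = max(1.0 - sum(k * w for k, w in zip(k_star, weights)), 0.0)
        return mean - kappa * math.sqrt(var)

    return min(candidates, key=lower_bound)


def optimize_latent(target, number_of_points=5, number_of_steps=10, seed=0):
    """Search a 1-D latent interval for a point whose prediction is near target."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(number_of_points)]
    ys = [abs(toy_predictor(x) - target) for x in xs]
    grid = [-3.0 + 0.1 * i for i in range(61)]
    for _ in range(number_of_steps):
        # Skip grid points that were already evaluated.
        candidates = [z for z in grid if all(abs(z - x) > 1e-9 for x in xs)]
        z_next = propose(xs, ys, candidates)
        xs.append(z_next)
        ys.append(abs(toy_predictor(z_next) - target))
    best = min(range(len(xs)), key=ys.__getitem__)
    return xs[best], ys[best]


best_z, best_loss = optimize_latent(target=5.0)
print(f"best latent point: {best_z:.2f}, |predicted - target|: {best_loss:.3f}")
```

In the model described here, the selected latent code would then be decoded into a molecule by the Variational Autoencoder; that decoding step is omitted from the sketch.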
**Developers**: Oliver Schilter and colleagues from IBM Research.
**Distributors**: Original authors' code integrated into GT4SD.
**Model date**: Not yet published.
**Model version**: Different types of models trained on NCCR data using SMILES or SELFIES, potentially also with augmentation.
**Model type**: A sequence-based molecular generator tuned to generate catalysts. The model relies on a recurrent Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.
**Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**:
N.A.
**Paper or other resource for more information**:
TBD
**License**: MIT
**Where to send questions or comments about the model**: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core).
**Intended Use. Use cases that were envisioned during development**: Chemical research, in particular drug discovery.
**Primary intended uses/users**: Researchers and computational chemists using the model for model comparison or research exploration purposes.
**Out-of-scope use cases**: Production-level inference; generating molecules with harmful properties.
**Metrics**: N.A.
**Datasets**: Data provided through NCCR.
**Ethical Considerations**: Unclear; please consult the original authors with any questions.
**Caveats and Recommendations**: Unclear; please consult the original authors with any questions.
Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs)
## Citation
TBD. In the meantime, please cite:
```bib
@article{manica2022gt4sd,
title={GT4SD: Generative Toolkit for Scientific Discovery},
author={Manica, Matteo and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Born, Jannis and Clarke, Dean and Teukam, Yves Gaetan Nana and Hoffman, Samuel C and Buchan, Matthew and Chenthamarakshan, Vijil and others},
journal={arXiv preprint arXiv:2207.03928},
year={2022}
}
```