# Model documentation & parameters
**Language model**: Type of language model to be used.
**Text prompt**: The text prompt to condition the model.
**Maximal length**: The maximum number of tokens in the generated text.
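**Decoding temperature**: The temperature applied to the token probabilities during decoding. Lower values make the output more deterministic; higher values make it more diverse.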
**Prefix**: A text prompt that will be passed to the model **before** the prompt.
**Top-k**: The number of highest-probability tokens to keep at each decoding step.
**Decoding-p**: The cumulative probability threshold (top-p): only the most probable tokens whose probabilities sum up to this value are kept.
**Repetition penalty**: Penalty for repeating tokens. Leave at the default value, except for the CTRL model, where 1.2 should be used.
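
To make the decoding parameters more concrete, here is a minimal pure-Python sketch of how temperature, top-k, top-p, and a repetition penalty can reshape a next-token distribution. It is an illustration under simplifying assumptions, not the code used by GT4SD or `transformers`; the function names `apply_repetition_penalty` and `filter_next_token_probs` are hypothetical.

```python
import math


def apply_repetition_penalty(logits, generated_ids, penalty=1.0):
    """Make already generated tokens less likely (penalty > 1 discourages repeats)."""
    penalized = list(logits)
    for token_id in set(generated_ids):
        value = penalized[token_id]
        # Positive logits are divided, negative ones multiplied, so the
        # score of a repeated token always moves downwards.
        penalized[token_id] = value / penalty if value > 0 else value * penalty
    return penalized


def _normalize(probs):
    """Rescale a {token_id: weight} mapping so the values sum to one."""
    total = sum(probs.values())
    return {token_id: p / total for token_id, p in probs.items()}


def filter_next_token_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_id: probability} after temperature, top-k, and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: scale logits before the softmax (<1 sharpens, >1 flattens).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    probs = _normalize({i: math.exp(x - peak) for i, x in enumerate(scaled)})

    # Rank token ids from most to least probable.
    ranked = sorted(probs, key=probs.get, reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]
        probs = _normalize({i: probs[i] for i in ranked})

    # Top-p: keep the smallest set of top tokens whose cumulative
    # probability reaches the threshold.
    kept, cumulative = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break

    return _normalize({i: probs[i] for i in kept})
```

A token would then be sampled from the returned distribution, and the loop repeated until the maximal length is reached. In the `transformers` library, the corresponding `generate` arguments are `max_length`, `temperature`, `top_k`, `top_p`, and `repetition_penalty`.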
# Model card -- HuggingFace
**Model Details**: Various Transformer-based language models.
**Developers**: HuggingFace developers
**Distributors**: HuggingFace developers' code integrated into GT4SD.
**Model date**: Varies between models.
**Model type**: Different types of `transformers` language models:
- CTRL: `CTRLLMHeadModel`
- GPT2: `GPT2LMHeadModel`
- XLNet: `XLNetLMHeadModel`
- OpenAIGPT: `OpenAIGPTLMHeadModel`
- TransfoXL: `TransfoXLLMHeadModel`
- XLM: `XLMWithLMHeadModel`
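
As a rough sketch, the mapping above can be expressed as a lookup table that resolves a model type to its `transformers` class. The names `MODEL_CLASS_NAMES` and `resolve_model_class` are hypothetical and not part of GT4SD; whether every class is importable depends on the installed `transformers` version.

```python
import importlib

# Model types mapped to the class names listed in this model card.
MODEL_CLASS_NAMES = {
    "CTRL": "CTRLLMHeadModel",
    "GPT2": "GPT2LMHeadModel",
    "XLNet": "XLNetLMHeadModel",
    "OpenAIGPT": "OpenAIGPTLMHeadModel",
    "TransfoXL": "TransfoXLLMHeadModel",
    "XLM": "XLMWithLMHeadModel",
}


def resolve_model_class(model_type):
    """Return the `transformers` class for a model type (hypothetical helper)."""
    class_name = MODEL_CLASS_NAMES[model_type]  # raises KeyError for unknown types
    # Imported lazily so the table can be inspected without `transformers` installed.
    module = importlib.import_module("transformers")
    return getattr(module, class_name)
```

With `transformers` installed, `resolve_model_class("GPT2").from_pretrained("gpt2")` would download and load the pretrained GPT-2 weights.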
**Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**:
N.A.
**Paper or other resource for more information**:
All documentation is available in the [transformers documentation](https://huggingface.co/docs/transformers/).
**License**: MIT
**Where to send questions or comments about the model**: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core).
**Intended Use. Use cases that were envisioned during development**: N.A.
**Primary intended uses/users**: N.A.
**Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties.
**Metrics**: N.A.
**Datasets**: N.A.
**Ethical Considerations**: Unclear; please consult the original authors in case of questions.
**Caveats and Recommendations**: Unclear; please consult the original authors in case of questions.
Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs)
## Citation
```bib
@article{manica2022gt4sd,
title={GT4SD: Generative Toolkit for Scientific Discovery},
author={Manica, Matteo and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Born, Jannis and Clarke, Dean and Teukam, Yves Gaetan Nana and Hoffman, Samuel C and Buchan, Matthew and Chenthamarakshan, Vijil and others},
journal={arXiv preprint arXiv:2207.03928},
year={2022}
}
```