# Model documentation & parameters
**Language model**: Type of language model to be used.
**Text prompt**: The text prompt to condition the model.
**Maximal length**: The maximal number of SMILES tokens in the generated molecule.
**Decoding temperature**: The temperature used in the beam-search decoding.
**Prefix**: A text prompt that is passed to the model **before** the text prompt.
**Top-k**: Number of highest-probability tokens to keep for sampling.
**Decoding-p**: Nucleus-sampling threshold; only the smallest set of tokens whose cumulative probability reaches this value is kept.
**Repetition penalty**: Penalty applied to repeated tokens. Leave unchanged, except for the CTRL model, where 1.2 is recommended. The sketch below shows how these parameters map onto `transformers` generation arguments.
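
The following is a minimal sketch, not the app's actual implementation: it uses plain `transformers` with the stock `gpt2` checkpoint (a SMILES-trained checkpoint would be needed for molecule generation) to illustrate how the parameters above correspond to `generate()` keyword arguments.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Illustrative checkpoint only; substitute a SMILES-trained model for molecules.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "CC(=O)O"  # hypothetical text prompt
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_length=64,           # "Maximal length"
    do_sample=True,
    temperature=0.8,         # "Decoding temperature"
    top_k=50,                # "Top-k"
    top_p=0.95,              # "Decoding-p" (nucleus sampling)
    repetition_penalty=1.0,  # "Repetition penalty" (1.2 for CTRL)
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```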
# Model card -- HuggingFace
**Model Details**: Various Transformer-based language models.
**Developers**: HuggingFace developers
**Distributors**: HuggingFace developers' code integrated into GT4SD.
**Model date**: Varies between models.
**Model type**: Different types of `transformers` language models (a minimal loading sketch follows the list):
- CTRL: `CTRLLMHeadModel`
- GPT2: `GPT2LMHeadModel`
- XLNet: `XLNetLMHeadModel`
- OpenAIGPT: `OpenAIGPTLMHeadModel`
- TransfoXL: `TransfoXLLMHeadModel`
- XLM: `XLMWithLMHeadModel`
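
As a rough orientation, the head classes listed above can be instantiated directly with `transformers`. The snippet below is a sketch only: it assumes a `transformers` version that still ships all of these (older) architectures, and the checkpoint name is purely illustrative.

```python
from transformers import (
    CTRLLMHeadModel,
    GPT2LMHeadModel,
    OpenAIGPTLMHeadModel,
    TransfoXLLMHeadModel,
    XLMWithLMHeadModel,
    XLNetLMHeadModel,
)

# Map of the model types listed above to their `transformers` head classes.
MODEL_CLASSES = {
    "ctrl": CTRLLMHeadModel,
    "gpt2": GPT2LMHeadModel,
    "openai-gpt": OpenAIGPTLMHeadModel,
    "transfo-xl": TransfoXLLMHeadModel,
    "xlm": XLMWithLMHeadModel,
    "xlnet": XLNetLMHeadModel,
}

# Instantiate one of them from a pretrained checkpoint (name is illustrative).
model = MODEL_CLASSES["gpt2"].from_pretrained("gpt2")
```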
**Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**:
N.A.
**Paper or other resource for more information**:
Full documentation is available in the [transformers documentation](https://huggingface.co/docs/transformers/).
**License**: MIT
**Where to send questions or comments about the model**: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core).
**Intended Use. Use cases that were envisioned during development**: N.A.
**Primary intended uses/users**: N.A.
**Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties.
**Metrics**: N.A.
**Datasets**: N.A.
**Ethical Considerations**: Unclear, please consult with original authors in case of questions.
**Caveats and Recommendations**: Unclear, please consult with original authors in case of questions.
Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596)
## Citation
```bib
@article{manica2022gt4sd,
title={GT4SD: Generative Toolkit for Scientific Discovery},
author={Manica, Matteo and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Born, Jannis and Clarke, Dean and Teukam, Yves Gaetan Nana and Hoffman, Samuel C and Buchan, Matthew and Chenthamarakshan, Vijil and others},
journal={arXiv preprint arXiv:2207.03928},
year={2022}
}
```