---
license: bsd
tags:
- chemistry
- biology
- protein
- antibodies
- antibody
- light chain
- AbLang
- CDR
- OAS
---
# AbLang model for light chains
This is a Hugging Face version of AbLang, a language model for antibodies. It was introduced in
[this paper](https://doi.org/10.1101/2022.01.20.477061) and first released in
[this repository](https://github.com/oxpig/AbLang). The model was trained on uppercase amino acids and only accepts capital-letter sequences.
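Lowercase or mixed-case input can therefore be normalized before tokenization (a minimal sketch; the input value is illustrative):
```python
raw_sequence = "gseltqdpavsv"      # illustrative lowercase input
sequence = raw_sequence.upper()    # AbLang only accepts capital-letter amino acids
```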
# Intended uses & limitations
The model can be used for protein feature extraction or fine-tuned on downstream tasks (TBA).
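As a rough illustration of the fine-tuning route, a small task-specific head can be trained on top of the frozen backbone. This is a hypothetical sketch: the binder/non-binder task, the 768 hidden size, and all hyperparameters are assumptions, not part of this release.
```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('qilowoq/AbLang_light')
backbone = AutoModel.from_pretrained('qilowoq/AbLang_light', trust_remote_code=True)

# Freeze the language model; only the head below is trained (illustrative choice)
for param in backbone.parameters():
    param.requires_grad = False

# Hypothetical binary task (e.g. binder vs. non-binder); 768 is the assumed
# hidden size of the AbLang encoder
head = nn.Linear(768, 2)

sequence = ' '.join("GSELTQDPAVSVALGQ")  # illustrative truncated light chain
label = torch.tensor([0])

encoded_input = tokenizer(sequence, return_tensors='pt')
with torch.no_grad():
    embedding = backbone(**encoded_input).last_hidden_state[:, 0, :]

loss = nn.functional.cross_entropy(head(embedding), label)
loss.backward()  # updates only the classification head
```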
### How to use
Since this is a custom model, you need to install additional dependencies:
```bash
pip install ablang
```
Here is how to use this model to get the features of a given antibody sequence in PyTorch:
```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('qilowoq/AbLang_light')
model = AutoModel.from_pretrained('qilowoq/AbLang_light', trust_remote_code=True)

# The tokenizer expects uppercase amino acids separated by spaces
sequence = ' '.join("GSELTQDPAVSVALGQTVRITCQGDSLRNYYASWYQQKPRQAPVLVFYGKNNRPSGIPDRFSGSSSGNTASLTISGAQAEDEADYYCNSRDSSSNHLVFGGGTKLTVLSQ")
encoded_input = tokenizer(sequence, return_tensors='pt')
model_output = model(**encoded_input)
```
Sequence embeddings can be produced as follows:
```python
# Use the first-token embedding as a pooled, sequence-level representation
seq_embs = model_output.last_hidden_state[:, 0, :]
```
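Averaging the per-residue hidden states is a common alternative pooling strategy. The sketch below is an assumption, not part of the official API; special tokens are not excluded from the average, which may need adjusting for this tokenizer.
```python
import torch

# Mean-pool residue embeddings, ignoring padding via the attention mask
mask = encoded_input['attention_mask'].unsqueeze(-1).float()
summed = (model_output.last_hidden_state * mask).sum(dim=1)
seq_embs_mean = summed / mask.sum(dim=1).clamp(min=1e-9)
```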
### Citation
```
@article{Olsen2022,
  title={AbLang: An antibody language model for completing antibody sequences},
  author={Tobias H. Olsen and Iain H. Moal and Charlotte M. Deane},
  journal={bioRxiv},
  doi={10.1101/2022.01.20.477061},
  year={2022}
}
``` |