---
license: apache-2.0
license_link: https://huggingface.co/skt/A.X-3.1/blob/main/LICENSE
language:
- en
- ko
pipeline_tag: text-classification
library_name: transformers
model_id: skt/A.X-Encoder-base
developers: SKT AI Model Lab
model-index:
- name: A.X-Encoder-base
  results:
  - task:
      type: text-classification
      name: kobest
    metrics:
    - type: KoBEST
      value: 85.50
  - task:
      type: text-classification
      name: klue
    metrics:
    - type: KLUE
      value: 86.10
---
# A.X Encoder
<div align="center">
<img src="./assets/A.X_from_scratch_logo_ko_4x3.png" alt="A.X Logo" width="300"/>
</div>
## A.X Encoder Highlights
**A.X Encoder** (pronounced "A dot X") is SKT's document-understanding model, optimized for Korean and for enterprise deployment.
This lightweight encoder was developed entirely in-house by SKT: model architecture, data curation, and training were all carried out on SKT's proprietary supercomputing infrastructure, TITAN.
This model utilizes the ModernBERT architecture, which supports flash attention and long-context processing.
- **Longer Context**: A.X Encoder supports long-context processing of up to **16,384** tokens.
- **Faster Inference**: A.X Encoder achieves up to 3x faster inference speed than earlier models.
- **Superior Korean Language Understanding**: A.X Encoder achieves superior performance on diverse Korean NLU tasks.
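Documents that exceed even the 16,384-token window can be split into overlapping windows before encoding. A minimal sketch of such a chunker in plain Python (the helper, window size, and stride are illustrative, not part of the A.X Encoder API):

```python
def chunk_token_ids(token_ids, max_len=16384, stride=2048):
    """Split a token-id sequence into overlapping windows of at most max_len.

    Hypothetical helper for illustration: consecutive windows share `stride`
    tokens of overlap so no span of context is cut at a hard boundary.
    """
    if len(token_ids) <= max_len:
        return [token_ids]
    chunks = []
    step = max_len - stride  # advance by less than max_len to create overlap
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break
    return chunks

# Example: a 40,000-token document becomes three windows of <= 16,384 tokens,
# each overlapping its neighbor by 2,048 tokens.
chunks = chunk_token_ids(list(range(40_000)))
print([len(c) for c in chunks])  # [16384, 16384, 11328]
```

Window outputs can then be pooled (e.g. averaged) downstream; the right aggregation depends on the task.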
## Core Technologies
A.X Encoder is **an efficient long-document understanding model** for processing large-scale corpora, developed end-to-end by SKT.
This model plays a key role in **data curation for A.X LLM** by serving as a versatile document classifier, identifying features such as educational value, domain category, and difficulty level.
## Benchmark Results
### Model Inference Speed (Run on an A100 GPU)
<div align="center">
<img src="./assets/speed.png" alt="inference" width="500"/>
</div>
### Model Performance
<div align="center">
<img src="./assets/performance.png" alt="performance" width="500"/>
</div>
| Method | BoolQ (f1) | COPA (f1) | Sentineg (f1) | WiC (f1) | **Avg. (KoBEST)** |
| ----------------------------- | ---------- | --------- | ------------- | -------- | ----------------- |
| **klue/roberta-base** | 72.04 | 65.14 | 90.39 | 78.19 | 76.44 |
| **kakaobank/kf-deberta-base** | 81.30 | 76.50 | 94.70 | 80.50 | 83.25 |
| **skt/A.X-Encoder-base** | 84.50 | 78.70 | 96.00 | 80.80 | **85.50** |
| Method | NLI (acc) | STS (f1) | YNAT (acc) | **Avg. (KLUE)** |
| ----------------------------- | --------- | -------- | ---------- | --------------- |
| **klue/roberta-base** | 84.53 | 84.57 | 86.48 | 85.19 |
| **kakaobank/kf-deberta-base** | 86.10 | 84.30 | 87.00 | 85.80 |
| **skt/A.X-Encoder-base** | 87.00 | 84.80 | 86.50 | **86.10** |
## 🚀 Quickstart
### with HuggingFace Transformers
- `transformers>=4.51.0` (or the latest version) is required to use `skt/A.X-Encoder-base`
```bash
pip install "transformers>=4.51.0"
```
⚠️ If your GPU supports it, we recommend using A.X Encoder with Flash Attention 2 to reach the highest efficiency. To do so, install Flash Attention as follows, then use the model as normal:
```bash
pip install flash-attn --no-build-isolation
```
#### Example Usage
Using AutoModelForMaskedLM:
```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM
model_id = "skt/A.X-Encoder-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16)
text = "한국의 수도는 <mask>."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
# To get predictions for the mask:
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs.logits[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)
print("Predicted token:", predicted_token)
# Predicted token: 서울
```
Using a pipeline:
```python
import torch
from transformers import pipeline
from pprint import pprint
pipe = pipeline(
"fill-mask",
model="skt/A.X-Encoder-base",
torch_dtype=torch.bfloat16,
)
input_text = "한국의 수도는 <mask>."
results = pipe(input_text)
pprint(results)
# [{'score': 0.07568359375,
#   'sequence': '한국의 수도는 서울.',
#   'token': 31430,
#   'token_str': '서울'}, ...
```
## License
The `A.X Encoder` model is licensed under `Apache License 2.0`.
## Citation
```
@article{SKTAdotXEncoder-base,
title={A.X Encoder-base},
author={SKT AI Model Lab},
year={2025},
url={https://huggingface.co/skt/A.X-Encoder-base}
}
```
## Contact
- Business & Partnership Contact: [[email protected]]([email protected])