Update README.md
README.md
CHANGED
@@ -14,4 +14,18 @@ pipeline_tag: text-generation
 
 # GuardReasoner 3B
 
-This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) via R-SFT and HS-DPO, as described in [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://huggingface.co/papers/2501.18492).
+This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) via R-SFT and HS-DPO, as described in [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://huggingface.co/papers/2501.18492).
+
+
+The training data of R-SFT can be found in [GuardReasonerTrain](https://huggingface.co/datasets/yueliu1999/GuardReasonerTrain).
+
+
+```
+@article{GuardReasoner,
+title={GuardReasoner: Towards Reasoning-based LLM Safeguards},
+author={Liu, Yue and Gao, Hongcheng and Zhai, Shengfang and Jun, Xia and Wu, Tianyi and Xue, Zhiwei and Chen, Yulin and Kawaguchi, Kenji and Zhang, Jiaheng and Hooi, Bryan},
+journal={arXiv preprint arXiv:2501.18492},
+year={2025}
+}
+```
+