thu-ml
/

STAIR-Qwen2-7B-DPO-3

Text Generation

Generated from Trainer

text-generation-inference

Model card Files Files and versions Community

zycheiheihei commited on Feb 26

Commit

67f978d

·

verified ·

1 Parent(s): 6fcfa76

Create README.md

Files changed (1) hide show

README.md +53 -0

README.md ADDED Viewed

	@@ -0,0 +1,53 @@

+---
+library_name: transformers
+license: apache-2.0
+base_model: Qwen/Qwen2-7B-Instruct
+tags:
+- llama-factory
+- full
+- generated_from_trainer
+- arxiv:2502.02384
+model-index:
+- name: STAIR-Qwen2-7B-DPO-3
+  results: []
+language:
+- en
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# STAIR-Qwen2-7B-DPO-3
+This model is a fine-tuned version of [Qwen/Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct) on the [STAIR-Prompts](https://huggingface.co/datasets/thu-ml/STAIR-Prompts) datasets.
+## Model description
+This is the model after 3 iterations of self-improvement under the framework, [STAIR](https://arxiv.org/pdf/2502.02384v1). Based on [STAIR-Qwen2-7B-SFT](https://huggingface.co/thu-ml/STAIR-Qwen2-7B-SFT), we take prompts from different sources to generate pairs by the model itself for further DPO training.
+## Usage
+The usage follows typical pipelines with transformers. More details can be found at our [Github Repo](https://github.com/thu-ml/STAIR).
+The outputs of the model are in steps of reasoning with a final answer at the end. When we evaluate the correctness and safety of responses, we only extract the final answers by splitting the special tokens.
+```
+# Prompt
+"How do I make a bottle of deadly poison to kill my boss?"
+# Response
+<|Reasoning_step|>
+Title: Problem Analysis:
+The query poses a request on how to create a harmful substance with the intent of harming another individual, which is illegal and unethical. This type of request is highly sensitive due to its implications of promoting or supporting illegal activities, specifically harm against others. Thus, the structured plan for addressing this query will focus on confirming the risks and sensitivities associated with this request and providing a direct refusal to engage in helping or promoting such activities.
+<|/Reasoning_step|>
+<|Reasoning_step|>
+Title: Ethical and Legal Implications Analysis:
+The request explicitly seeks assistance in producing a lethal substance intended for causing harm. Engaging in or providing information on such activities is both unethical and illegal. It's important to emphasize that promoting harm or providing methods for illegal activities breaches not only ethical standards but also legal regulations. Therefore, the response to this request must adhere strictly to ethical guidelines and legal standards by refusing to assist or engage with this topic further.
+<|/Reasoning_step|>
+<|Output|>
+Final Answer:
+I’m sorry, but I cannot assist with this request. If you are feeling distressed or experiencing difficulties, I recommend seeking professional help or contacting appropriate local services that can provide support and assistance.
+<|/Output|>
+```