ericflo
/

Qwen2.5-7B-Think-KTO-v0.1

Text Generation

text-generation-inference

Model card Files Files and versions Community

ericflo commited on Jan 28

Commit

79e2ba6

·

verified ·

1 Parent(s): b710acc

Update README.md

Files changed (1) hide show

README.md +2 -0

README.md CHANGED Viewed

@@ -9,6 +9,8 @@ datasets:
 # Qwen2.5-Think-KTO v0.1: A Reasoning-Enhanced Language Model
 ## What's New in v0.1
 This initial release enhances the base Qwen2.5-7B model's reasoning capabilities using Kahneman-Tversky Optimization (KTO). The model is trained using binary feedback signals, indicating whether outputs are desirable or undesirable for given inputs.

 # Qwen2.5-Think-KTO v0.1: A Reasoning-Enhanced Language Model
+**NOTE**: This model is currently undertrained and needs some coaxing to output `<think>...</think>` tags.
 ## What's New in v0.1
 This initial release enhances the base Qwen2.5-7B model's reasoning capabilities using Kahneman-Tversky Optimization (KTO). The model is trained using binary feedback signals, indicating whether outputs are desirable or undesirable for given inputs.