ericflo commited on
Commit
79e2ba6
·
verified ·
1 Parent(s): b710acc

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +2 -0
README.md CHANGED
@@ -9,6 +9,8 @@ datasets:
9
 
10
  # Qwen2.5-Think-KTO v0.1: A Reasoning-Enhanced Language Model
11
 
 
 
12
  ## What's New in v0.1
13
  This initial release enhances the base Qwen2.5-7B model's reasoning capabilities using Kahneman-Tversky Optimization (KTO). The model is trained using binary feedback signals, indicating whether outputs are desirable or undesirable for given inputs.
14
 
 
9
 
10
  # Qwen2.5-Think-KTO v0.1: A Reasoning-Enhanced Language Model
11
 
12
+ **NOTE**: This model is currently undertrained and needs some coaxing to output `<think>...</think>` tags.
13
+
14
  ## What's New in v0.1
15
  This initial release enhances the base Qwen2.5-7B model's reasoning capabilities using Kahneman-Tversky Optimization (KTO). The model is trained using binary feedback signals, indicating whether outputs are desirable or undesirable for given inputs.
16