beyoru committed on
Commit 8963dab · verified · 1 Parent(s): fa508a8

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -12,6 +12,11 @@ library_name: transformers
 ## Model details
 This model is fine-tuned from Qwen3-4B-Instruct using a custom reinforcement learning (RL) framework that rewards the model for producing solutions passing automated test cases — similar to the process of programming task evaluation on LeetCode.
 
+<p align="center">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/65905af887944e494e37e09a/s4drmYGEYWZyt2ZUkxIpI.png" width="300">
+</p>
+
+
 Instead of relying on labeled ground truth answers, the model learns through test-case-based rewards, promoting generalization and reasoning ability in algorithmic problem-solving.
 
 > This is an experimental model
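
The README describes rewards based on passing automated test cases but does not show how such a reward is computed. The following is a minimal sketch of one plausible form, not the author's framework: it scores a candidate solution by the fraction of test cases it passes. The names `reward_from_tests`, `Case`, and the `two_sum` example are hypothetical, and running the candidate with `exec` in-process (no sandbox, no timeout) is an assumption made only to keep the sketch short.

```python
from typing import Any, Sequence, Tuple

# Hypothetical test-case shape: (positional arguments, expected return value).
Case = Tuple[tuple, Any]


def reward_from_tests(solution_src: str, func_name: str, cases: Sequence[Case]) -> float:
    """Return the fraction of test cases the candidate passes, in [0.0, 1.0].

    Assumption: the candidate is Python source defining a function `func_name`.
    A real RL setup would run this in a sandbox with time and memory limits.
    """
    if not cases:
        return 0.0

    namespace: dict = {}
    try:
        # Unsandboxed execution: acceptable for a sketch, unsafe for untrusted code.
        exec(solution_src, namespace)
    except Exception:
        return 0.0  # Source that fails to compile or load earns no reward.

    candidate = namespace.get(func_name)
    if not callable(candidate):
        return 0.0

    passed = 0
    for args, expected in cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            pass  # A runtime error counts as a failed test case.
    return passed / len(cases)


# Example: a LeetCode-style "two sum" task with two test cases.
example_cases = [(([2, 7, 11, 15], 9), [0, 1]), (([3, 2, 4], 6), [1, 2])]
example_solution = (
    "def two_sum(nums, target):\n"
    "    seen = {}\n"
    "    for i, x in enumerate(nums):\n"
    "        if target - x in seen:\n"
    "            return [seen[target - x], i]\n"
    "        seen[x] = i\n"
)
example_reward = reward_from_tests(example_solution, "two_sum", example_cases)
```

A fractional score gives partial credit to solutions that pass only some tests; an all-or-nothing reward (1.0 only when every test passes) is an equally plausible design, and the README does not say which one the model was trained with.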