yongchao chen committed on
Commit
6417eac
·
1 Parent(s): 886972e
Files changed (1)
  1. README.md +116 -1
README.md CHANGED
@@ -1,4 +1,119 @@
1
  ---
2
  license: mit
3
  ---
4
- CodeSteerLLM finetuned on Llama-3.1-8B with multi-turn SFT and DPO.
1
  ---
2
  license: mit
3
  ---
4
+ # CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance
5
+ <img src="./Figures/Tag.png" width="650px" alt="s" />
6
+
7
+ [Huggingface🤗](https://huggingface.co/yongchao98/CodeSteer-v1)
8
+ [Model Weights](https://drive.google.com/drive/folders/1qb_rec6f8rMYtFKm0eQpad0L0uHCwgpL?usp=share_link)
9
+ [Finetune Datasets](https://drive.google.com/drive/folders/1Byn-99gFd5ckRkPMJ8-zagzW7XDfO8ie?usp=share_link)
10
+ [SymBench Datasets](https://github.com/yongchao98/CodeSteer-v1.0/tree/main/dataset_gather)
11
+ [SymBench Synthesis Scripts](https://github.com/yongchao98/CodeSteer-v1.0/tree/main/benchmark)
12
+
13
+ ## Contents
14
+
15
+ - [Framework](#Framework)
16
+ - [Inspirations](#Inspirations)
17
+ - [Performance](#Performance)
18
+ - [Environment_Setup](#Environment_Setup)
19
+ - [LLM_API_Key_Setup](#LLM_API_Key_Setup)
20
+ - [Train_and_Test_Models](#Train_and_Test_Models)
21
+ - [Assistance](#Assistance)
22
+ - [Citation](#Citation)
23
+
24
+ ## Framework
25
+ <img src="./Figures/CodeSteer-intro.png" width="800px" alt="s" />
26
+
27
+ <p align="center" style="font-size: 16px;">
28
+ Figure: CodeSteer guides LLM code/text generation to integrate symbolic computing. In each round of interaction with the TaskLLM, it reviews the current and previous answers, then provides guidance for the next round.
29
+ </p>
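The multi-round interaction in the figure can be sketched as a simple loop. This is only an illustrative sketch, not the repository's implementation: `steer`, `task_llm`, `steer_llm`, and `max_rounds` are hypothetical names, and the two models are stand-in callables.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the two models (assumed signatures, not the repo's API).
TaskLLM = Callable[[str, str], str]  # (question, guidance) -> answer
SteerLLM = Callable[[str, List[str]], Tuple[str, bool]]  # (question, answers) -> (guidance, finalize)


def steer(question: str, task_llm: TaskLLM, steer_llm: SteerLLM, max_rounds: int = 5) -> str:
    """Alternate between guidance and answering until the steering model finalizes."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    answers: List[str] = []
    # Initial guidance is produced before any answer exists.
    guidance, _ = steer_llm(question, answers)
    for _ in range(max_rounds):
        answers.append(task_llm(question, guidance))
        # The steering model reviews the current and all previous answers.
        guidance, done = steer_llm(question, answers)
        if done:
            break
    return answers[-1]
```

Passing the full answer history to the steering model on every round mirrors the "reviews current and previous answers" step described in the caption.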
30
+
31
+ ## Inspirations
32
+ <img src="./Figures/LLM-makes-simple-mistakes-gather.png" width="800px" alt="s" />
33
+ <p align="center" style="font-size: 16px;">
34
+ Figure: Cases where GPT-4o makes simple mistakes with direct textual reasoning but reliably solves the problem when prompted to use code.
35
+ </p>
36
+
37
+
38
+ ## Performance
39
+ We compare GPT-4o + CodeSteer with OpenAI o1 and DeepSeek R1 on SymBench, which comprises 28 seen tasks and 9 unseen tasks. GPT-4o + CodeSteer surpasses o1 (82.7), R1 (76.8), and o1-preview (74.8), highlighting the importance of integrating symbolic computing into LLMs.
40
+
41
+ <img src="./Figures/Table-results.png" width="800px" alt="s" />
42
+
43
+ The token cost and runtime of each method are shown below. GPT-4o + CodeSteer uses fewer tokens and less runtime than o1 and R1.
44
+ <img src="./Figures/Cost-token-runtime.png" width="800px" alt="s" />
45
+
46
+ ## Environment_Setup
47
+ Fine-tuning and inference of CodeSteerLLM are built on [Llama-factory](https://github.com/hiyouga/LLaMA-Factory), with some modules modified by us.
48
+ ```
49
+ git clone https://github.com/yongchao98/CodeSteer-v1.0.git
50
+ cd CodeSteer-v1.0
51
+
52
+ conda create -n CodeSteer python=3.10
53
+ conda activate CodeSteer
54
+ pip install -r requirements.txt
55
+ ```
56
+
57
+ ## LLM_API_Key_Setup
58
+ To use API-based LLMs as the TaskLLM or CodeSteerLLM, you need to set up the corresponding API keys.
59
+
60
+ 1. First, create a .env file in your project root:
61
+ ```
62
+ OPENAI_API_KEY='your_key_here'
63
+ CLAUDE_API_KEY='your_key_here'
64
+ MIXTRAL_API_KEY='your_key_here'
65
+ DEEPSEEK_API_KEY='your_key_here'
66
+ ```
67
+ 2. Add this .env file to your .gitignore to prevent accidentally committing it:
68
+ ```
69
+ echo ".env" >> .gitignore
70
+ ```
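As a rough illustration of how keys in such a `.env` file could be read at runtime, here is a minimal standard-library sketch. The `load_env` helper is hypothetical and is not taken from the repository, which may load the keys differently (for example through a dotenv package).

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> dict:
    """Parse KEY='value' lines from a .env file and export them to os.environ."""
    loaded = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, value = line.split("=", 1)
        key = key.strip()
        value = value.strip().strip("'\"")  # drop surrounding quotes
        loaded[key] = value
        os.environ.setdefault(key, value)  # keep any value already set in the shell
    return loaded
```

Calling `load_env()` once at startup would then make, for example, `os.environ["OPENAI_API_KEY"]` available to the rest of the program. Using `setdefault` means a key exported in the shell takes precedence over the file.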
71
+
72
+ ## Train_and_Test_Models
73
+
74
+ ### Create_test_samples
75
+ The synthesized test samples for the 37 tasks of SymBench are in the [dataset_gather](https://github.com/yongchao98/CodeSteer-v1.0/tree/main/dataset_gather) directory. You can also synthesize samples yourself, with tunable complexity, using the scripts in [create_dataset](https://github.com/yongchao98/CodeSteer-v1.0/tree/main/create_dataset).
76
+
77
+ ### Run inference without GPU, test closed-source LLM as CodeSteerLLM
78
+ An un-finetuned model such as GPT-4o can be used directly as CodeSteerLLM. In this case, run:
79
+ ```
80
+ python benchmark_test_baseline.py
81
+ ```
82
+
83
+ ### Run inference with GPU, test finetuned CodeSteerLLM
84
+ To run inference with the finetuned Llama-3.1-8B on your own GPUs, use infer_CodeSteer.sh (the default settings use 4×H100 on the Harvard cluster; modify them freely for your own cluster). You can also download the [Model Weights](https://drive.google.com/drive/folders/1qb_rec6f8rMYtFKm0eQpad0L0uHCwgpL?usp=share_link) to a local directory and change the model path in llama3_8B_CodeSteer.yaml.
85
+
86
+ ```bash
87
+ bash infer_CodeSteer.sh
88
+ # default config file is ./llama3_8B_CodeSteer.yaml using the model uploaded on Huggingface.
89
+ ```
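If you switch to locally downloaded weights, the path change in the config could be scripted roughly as follows. This sketch assumes the YAML file uses the usual Llama-factory key `model_name_or_path` at the top level; that key name, the `set_model_path` helper, and the example path are assumptions rather than details confirmed by this README.

```python
import re


def set_model_path(config_text: str, local_path: str) -> str:
    """Return config_text with the top-level model_name_or_path value replaced."""
    # Assumed key name: model_name_or_path (the common Llama-factory convention).
    pattern = re.compile(r"^(model_name_or_path:[ \t]*).*$", flags=re.MULTILINE)
    # A function replacement avoids backslash-escape surprises in local_path.
    new_text, count = pattern.subn(lambda m: m.group(1) + local_path, config_text)
    if count == 0:
        raise KeyError("model_name_or_path not found in config")
    return new_text
```

For example, reading `llama3_8B_CodeSteer.yaml` as text, passing it through `set_model_path(text, "/path/to/local/weights")`, and writing the result back would leave every other setting untouched. Editing the file by hand works just as well.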
90
+
91
+ ### Finetuning CodeSteerLLM with synthesized data
92
+ Our synthesized datasets for both SFT and DPO finetuning are in [Finetune Datasets](https://drive.google.com/drive/folders/1Byn-99gFd5ckRkPMJ8-zagzW7XDfO8ie?usp=share_link).
93
+ We use Llama-factory and DeepSpeed for finetuning. First install Llama-factory with:
94
+ ```
95
+ git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
96
+ cd LLaMA-Factory
97
+ pip install -e ".[torch,metrics]"
98
+ cd ..
99
+ ```
100
+
101
+ Then run the training script (the default settings in train_llama3-8B-CodeSteer.sh use 4×H100 on the Harvard cluster; modify them freely for your own cluster):
102
+ ```
103
+ bash train_llama3-8B-CodeSteer.sh
104
+ ```
105
+
106
+ ## Assistance
107
+
108
+ We appreciate all feedback! Feel free to raise an issue for bugs, questions, or suggestions. You can also contact [Yongchao Chen](https://yongchao98.github.io/YongchaoChen/) and [Chuchu Fan](https://chuchu.mit.edu) with any questions or for discussion.
109
+
110
+ ## Citation
111
+ ```bibtex
112
+ @article{chen2024steering,
113
+ title={Steering Large Language Models between Code Execution and Textual Reasoning},
114
+ author={Chen, Yongchao and Jhamtani, Harsh and Sharma, Srinagesh and Fan, Chuchu and Wang, Chi},
115
+ journal={arXiv preprint arXiv:2410.03524},
116
+ year={2024}
117
+ }
118
+ ```
119
+