Leon-Leee committed on
Commit 7adce78 · verified · 1 Parent(s): cbafc7a

Update README.md

Files changed (1)
  1. README.md +12 -13
README.md CHANGED
@@ -5,6 +5,7 @@ datasets:
  - m-a-p/Code-Feedback
  - openbmb/UltraInteract_sft
  - ise-uiuc/Magicoder-Evol-Instruct-110K
+ - flytech/python-codes-25k
  language:
  - en
  metrics:
@@ -16,11 +17,8 @@ tags:
  ## AIGCodeGeek-DS-6.7B
 
  ### Introduction
- AIGCodeGeek-DS-6.7B is the first version of our Code-LLM family with competitive performance on benchmarks such as HumanEval(+) and MBPP(+).
-
- It gains a lot of insights from the open-source community and we deeply appreciate all of these great works.
-
- We are preparing for the tech report, so stay tuned for more details:)
+ AIGCodeGeek-DS-6.7B is the first released version of our Code-LLM family with competitive performance on benchmarks such as HumanEval(+) and MBPP(+).
+ We are preparing for a tech report; stay tuned for more details:)
 
  ### Model Details
  #### Model Description
@@ -29,11 +27,10 @@ We are preparing for the tech report, so stay tuned for more details:)
  - Fine-tuned from [deepseek-ai/deepseek-coder-6.7b-base](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base) with full parameters
 
  ### Training data
- A mixture of both
- - samples from several high-quality open-source datasets (read *Acknowledgements*),
- - our private datasets (already decontaminated with benchmarks).
+ A mixture of samples from high-quality open-source datasets (read *Acknowledgements*).
 
  ### Evaluation
+
  To check out our evaluation results: [EvalPlus](https://evalplus.github.io/leaderboard.html)
 
  ### Requirements
@@ -59,8 +56,10 @@ TBD
 
 
  ### Acknowledgements
- - [WizardCoder](https://github.com/nlpxucan/WizardLM/tree/main/WizardCoder): WizardLM-Evol-Instruct V2 datasets
- - We used a back-up([Leon-Leee/wizardlm_evol_instruct_v2_196K_backuped](https://huggingface.co/datasets/Leon-Leee/wizardlm_evol_instruct_v2_196K_backuped)) since this dataset has been deleted.
- - [Magicoder](https://github.com/ise-uiuc/magicoder/): [Magicoder-Evol-Instruct-110K](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K) from theblackcat102/evol-codealpaca-v1(https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1)
- - [Eurus](https://github.com/OpenBMB/Eurus): reasoning enhancement dataset of [openbmb/UltraInteract_sft](https://huggingface.co/datasets/openbmb/UltraInteract_sft)
- - [OpenCoderInterpreter](https://opencodeinterpreter.github.io/): [m-a-p/Code-Feedback](https://huggingface.co/datasets/m-a-p/Code-Feedback)
+ We gain a lot of knowledge and resources from the open-source community:
+ - [DeepSeekCoder](https://huggingface.co/deepseek-ai): impressive performance and insightful tech reports
+ - [WizardCoder](https://github.com/nlpxucan/WizardLM/tree/main/WizardCoder): Evol Instruct method and datasets
+ - We used a ([Leon-Leee/wizardlm_evol_instruct_v2_196K_backuped](https://huggingface.co/datasets/Leon-Leee/wizardlm_evol_instruct_v2_196K_backuped)) since this original has been deleted.
+ - [Magicoder](https://github.com/ise-uiuc/magicoder/): OSS-Instruct method and datasets, [Magicoder-Evol-Instruct-110K](https://huggingface.co/datasets/ise-uiuc/Magicoder-Evol-Instruct-110K) from theblackcat102/evol-codealpaca-v1(https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1)
+ - [Eurus](https://github.com/OpenBMB/Eurus): creative methods and datasets for reasoning, [openbmb/UltraInteract_sft](https://huggingface.co/datasets/openbmb/UltraInteract_sft)
+ - [OpenCoderInterpreter](https://opencodeinterpreter.github.io/): well-designed experiments and [m-a-p/Code-Feedback](https://huggingface.co/datasets/m-a-p/Code-Feedback)