Commit 51a5732 (verified) by Novaciano
1 Parent(s): 30791fe

Update README.md

Files changed (1)
  1. README.md +111 -0
README.md CHANGED
@@ -48,9 +48,120 @@ tags:
  language:
  - es
  - en
+ model-index:
+ - name: HarmfulProject-3.2-1B
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 38.74
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Novaciano/HarmfulProject-3.2-1B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 6.51
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Novaciano/HarmfulProject-3.2-1B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 4.76
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Novaciano/HarmfulProject-3.2-1B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 2.24
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Novaciano/HarmfulProject-3.2-1B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 2.73
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Novaciano/HarmfulProject-3.2-1B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 9.14
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Novaciano/HarmfulProject-3.2-1B
+       name: Open LLM Leaderboard
  ---
  <center> <h4><b>HARMFUL PROJECT</b></h4>
  <img src="https://i.ibb.co/3yqnMb7z/AQMEx-J7-A5c-F5r-SWsn8-CVc-Qms-Fa-RKi6y-Zsnp7-L5ca-Afcws-OKi-WDQLs-Mm0-YH6i-DEke-V6-HHIf-P0-XVBEbrb.gif" alt="AQMEx-J7-A5c-F5r-SWsn8-CVc-Qms-Fa-RKi6y-Zsnp7-L5ca-Afcws-OKi-WDQLs-Mm0-YH6i-DEke-V6-HHIf-P0-XVBEbrb-" border="0"></a> </center>
+ dtype: bfloat16
+ parameters:
+   t: [0, 0.5, 1, 0.5, 0]
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Novaciano__HarmfulProject-3.2-1B-details)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |10.69|
+ |IFEval (0-Shot)    |38.74|
+ |BBH (3-Shot)       | 6.51|
+ |MATH Lvl 5 (4-Shot)| 4.76|
+ |GPQA (0-shot)      | 2.24|
+ |MuSR (0-shot)      | 2.73|
+ |MMLU-PRO (5-shot)  | 9.14|

  # CORRECTED VERSION OF HARMFUL PROJECT 3.2 1B
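
The `Avg.` row in the results table added by this commit appears to be the unweighted mean of the six benchmark scores. The commit does not state the aggregation rule, so that is an assumption. A minimal Python sketch that checks the arithmetic (the `leaderboard_average` helper and the `scores` dictionary are hypothetical names used only for illustration):

```python
from statistics import mean

# Scores copied from the results table added in this commit.
scores = {
    "IFEval (0-Shot)": 38.74,
    "BBH (3-Shot)": 6.51,
    "MATH Lvl 5 (4-Shot)": 4.76,
    "GPQA (0-shot)": 2.24,
    "MuSR (0-shot)": 2.73,
    "MMLU-PRO (5-shot)": 9.14,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed aggregation rule)."""
    return round(mean(values), 2)


if __name__ == "__main__":
    # 64.12 / 6 = 10.6866..., which rounds to the reported Avg. value.
    print(leaderboard_average(list(scores.values())))  # prints 10.69
```

If a future README update changes any single score, rerunning this check is a quick way to confirm the `Avg.` row was updated with it.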