Commit ea6ec6b
1 parent: 12c37bc
Update README.md
README.md CHANGED
@@ -15,9 +15,6 @@ language:
|
|
15 |
- fr
|
16 |
widget:
|
17 |
- text: "Assurés de disputer l'Euro 2024 en Allemagne l'été prochain (du 14 juin au 14 juillet) depuis leur victoire aux Pays-Bas, les Bleus ont fait le nécessaire pour avoir des certitudes. Avec six victoires en six matchs officiels et un seul but encaissé, Didier Deschamps a consolidé les acquis de la dernière Coupe du monde. Les joueurs clés sont connus : Kylian Mbappé, Aurélien Tchouameni, Antoine Griezmann, Ibrahima Konaté ou encore Mike Maignan."
|
18 |
-
inference:
|
19 |
-
parameters:
|
20 |
-
aggregation_strategy: "max"
|
21 |
library_name: transformers
|
22 |
pipeline_tag: token-classification
|
23 |
co2_eq_emissions: 20
|
@@ -108,40 +105,30 @@ The evaluation was carried out using the [**evaluate**](https://pypi.org/project
|
|
108 |
<tr>
|
109 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
110 |
<td><br>Precision</td>
|
111 |
-
<td><br>
|
112 |
-
<td><br>
|
113 |
-
<td><br>
|
114 |
-
<td><br>
|
115 |
-
<td><br>
|
116 |
-
<td><br>
|
117 |
</tr>
|
118 |
<tr>
|
119 |
<td><br>Recall</td>
|
120 |
-
<td><br>
|
121 |
-
<td><br>
|
122 |
-
<td><br>
|
123 |
-
<td><br>
|
124 |
-
<td><br>
|
125 |
-
<td><br>
|
126 |
</tr>
|
127 |
<tr>
|
128 |
<td>F1</td>
|
129 |
-
<td><br>
|
130 |
-
<td><br>
|
131 |
-
<td><br>
|
132 |
-
<td><br>
|
133 |
-
<td><br>
|
134 |
-
<td><br>
|
135 |
-
</tr>
|
136 |
-
<tr>
|
137 |
-
<td></td>
|
138 |
-
<td><br>Number</td>
|
139 |
-
<td><br>A</td>
|
140 |
-
<td><br>B</td>
|
141 |
-
<td><br>C</td>
|
142 |
-
<td><br>D</td>
|
143 |
-
<td><br>E</td>
|
144 |
-
<td><br>F</td>
|
145 |
</tr>
|
146 |
</tbody>
|
147 |
</table>
|
@@ -168,40 +155,30 @@ In detail:
|
|
168 |
<tr>
|
169 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
170 |
<td><br>Precision</td>
|
171 |
-
<td><br>
|
172 |
-
<td><br>
|
173 |
-
<td><br>
|
174 |
-
<td><br>
|
175 |
-
<td><br>
|
176 |
-
<td><br>
|
177 |
</tr>
|
178 |
<tr>
|
179 |
<td><br>Recall</td>
|
180 |
-
<td><br>
|
181 |
-
<td><br>
|
182 |
-
<td><br>
|
183 |
-
<td><br>
|
184 |
-
<td><br>
|
185 |
-
<td><br>
|
186 |
</tr>
|
187 |
<tr>
|
188 |
<td>F1</td>
|
189 |
-
<td><br>
|
190 |
-
<td><br>
|
191 |
-
<td><br>
|
192 |
-
<td><br>
|
193 |
-
<td><br>
|
194 |
-
<td><br>
|
195 |
-
</tr>
|
196 |
-
<tr>
|
197 |
-
<td></td>
|
198 |
-
<td><br>Number</td>
|
199 |
-
<td><br>A</td>
|
200 |
-
<td><br>B</td>
|
201 |
-
<td><br>C</td>
|
202 |
-
<td><br>D</td>
|
203 |
-
<td><br>E</td>
|
204 |
-
<td><br>F</td>
|
205 |
</tr>
|
206 |
</tbody>
|
207 |
</table>
|
@@ -225,40 +202,30 @@ In detail:
|
|
225 |
<tr>
|
226 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
227 |
<td><br>Precision</td>
|
228 |
-
<td><br>
|
229 |
-
<td><br>
|
230 |
-
<td><br>
|
231 |
-
<td><br>
|
232 |
-
<td><br>
|
233 |
-
<td><br>
|
234 |
</tr>
|
235 |
<tr>
|
236 |
<td><br>Recall</td>
|
237 |
-
<td><br>
|
238 |
-
<td><br>
|
239 |
-
<td><br>
|
240 |
-
<td><br>
|
241 |
-
<td><br>
|
242 |
-
<td><br>
|
243 |
</tr>
|
244 |
<tr>
|
245 |
<td>F1</td>
|
246 |
-
<td><br>
|
247 |
-
<td><br>
|
248 |
-
<td><br>
|
249 |
-
<td><br>
|
250 |
-
<td><br>
|
251 |
-
<td><br>
|
252 |
-
</tr>
|
253 |
-
<tr>
|
254 |
-
<td></td>
|
255 |
-
<td><br>Number</td>
|
256 |
-
<td><br>A</td>
|
257 |
-
<td><br>B</td>
|
258 |
-
<td><br>C</td>
|
259 |
-
<td><br>D</td>
|
260 |
-
<td><br>E</td>
|
261 |
-
<td><br>F</td>
|
262 |
</tr>
|
263 |
</tbody>
|
264 |
</table>
|
@@ -283,40 +250,30 @@ In detail:
|
|
283 |
<tr>
|
284 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
285 |
<td><br>Precision</td>
|
286 |
-
<td><br>
|
287 |
-
<td><br>
|
288 |
-
<td><br>
|
289 |
-
<td><br>
|
290 |
-
<td><br>
|
291 |
-
<td><br>
|
292 |
</tr>
|
293 |
<tr>
|
294 |
<td><br>Recall</td>
|
295 |
-
<td><br>
|
296 |
-
<td><br>
|
297 |
-
<td><br>
|
298 |
-
<td><br>
|
299 |
-
<td><br>
|
300 |
-
<td><br>
|
301 |
</tr>
|
302 |
<tr>
|
303 |
<td>F1</td>
|
304 |
-
<td><br>
|
305 |
-
<td><br>
|
306 |
-
<td><br>
|
307 |
-
<td><br>
|
308 |
-
<td><br>
|
309 |
-
<td><br>
|
310 |
-
</tr>
|
311 |
-
<tr>
|
312 |
-
<td></td>
|
313 |
-
<td><br>Number</td>
|
314 |
-
<td><br>A</td>
|
315 |
-
<td><br>B</td>
|
316 |
-
<td><br>C</td>
|
317 |
-
<td><br>D</td>
|
318 |
-
<td><br>E</td>
|
319 |
-
<td><br>F</td>
|
320 |
</tr>
|
321 |
</tbody>
|
322 |
</table>
|
@@ -328,16 +285,89 @@ In detail:
|
|
328 |
```python
|
329 |
from transformers import pipeline
|
330 |
|
331 |
-
ner = pipeline('question-answering', model='CATIE-AQ/Camembert-base-frenchNER_4entities', tokenizer='CATIE-AQ/Camembert-base-frenchNER_4entities',
|
332 |
|
333 |
-
|
334 |
"Assurés de disputer l'Euro 2024 en Allemagne l'été prochain (du 14 juin au 14 juillet) depuis leur victoire aux Pays-Bas, les Bleus ont fait le nécessaire pour avoir des certitudes. Avec six victoires en six matchs officiels et un seul but encaissé, Didier Deschamps a consolidé les acquis de la dernière Coupe du monde. Les joueurs clés sont connus : Kylian Mbappé, Aurélien Tchouameni, Antoine Griezmann, Ibrahima Konaté ou encore Mike Maignan."
|
335 |
)
|
336 |
|
337 |
print(result)
|
338 |
-
```
|
339 |
```python
|
340 |
-
|
341 |
```
|
342 |
|
343 |
### Try it through Space
|
|
|
15 |
- fr
|
16 |
widget:
|
17 |
- text: "Assurés de disputer l'Euro 2024 en Allemagne l'été prochain (du 14 juin au 14 juillet) depuis leur victoire aux Pays-Bas, les Bleus ont fait le nécessaire pour avoir des certitudes. Avec six victoires en six matchs officiels et un seul but encaissé, Didier Deschamps a consolidé les acquis de la dernière Coupe du monde. Les joueurs clés sont connus : Kylian Mbappé, Aurélien Tchouameni, Antoine Griezmann, Ibrahima Konaté ou encore Mike Maignan."
|
18 |
library_name: transformers
|
19 |
pipeline_tag: token-classification
|
20 |
co2_eq_emissions: 20
|
|
|
105 |
<tr>
|
106 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
107 |
<td><br>Precision</td>
|
108 |
+
<td><br>0.973</td>
|
109 |
+
<td><br>0.951</td>
|
110 |
+
<td><br>0.8877</td>
|
111 |
+
<td><br>0.850</td>
|
112 |
+
<td><br>0.993</td>
|
113 |
+
<td><br>0.984</td>
|
114 |
</tr>
|
115 |
<tr>
|
116 |
<td><br>Recall</td>
|
117 |
+
<td><br>0.983</td>
|
118 |
+
<td><br>0.964</td>
|
119 |
+
<td><br>0.918</td>
|
120 |
+
<td><br>0.781</td>
|
121 |
+
<td><br>0.993</td>
|
122 |
+
<td><br>0.984</td>
|
123 |
</tr>
|
124 |
<tr>
|
125 |
<td>F1</td>
|
126 |
+
<td><br>0.978</td>
|
127 |
+
<td><br>0.958</td>
|
128 |
+
<td><br>0.903</td>
|
129 |
+
<td><br>0.814</td>
|
130 |
+
<td><br>0.993</td>
|
131 |
+
<td><br>0.984</td>
|
132 |
</tr>
|
133 |
</tbody>
|
134 |
</table>
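As a quick sanity check on the filled-in scores, the sketch below recomputes an F1 cell from the precision and recall cells above it. It assumes each F1 value is the usual harmonic mean of the two; the helper name `f1_score` is hypothetical and is not part of the README or of the evaluate library.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# First column of the table above: precision 0.973, recall 0.983.
first_column_f1 = round(f1_score(0.973, 0.983), 3)
print(first_column_f1)
```

The same check can be repeated per column to spot cells that were mistyped or swapped when the table was filled in.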
|
|
|
155 |
<tr>
|
156 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
157 |
<td><br>Precision</td>
|
158 |
+
<td><br>0.954</td>
|
159 |
+
<td><br>0.893</td>
|
160 |
+
<td><br>0.851</td>
|
161 |
+
<td><br>0.849</td>
|
162 |
+
<td><br>0.979</td>
|
163 |
+
<td><br>0.954</td>
|
164 |
</tr>
|
165 |
<tr>
|
166 |
<td><br>Recall</td>
|
167 |
+
<td><br>0.967</td>
|
168 |
+
<td><br>0.887</td>
|
169 |
+
<td><br>0.883</td>
|
170 |
+
<td><br>0.855</td>
|
171 |
+
<td><br>0.974</td>
|
172 |
+
<td><br>0.954</td>
|
173 |
</tr>
|
174 |
<tr>
|
175 |
<td>F1</td>
|
176 |
+
<td><br>0.960</td>
|
177 |
+
<td><br>0.890</td>
|
178 |
+
<td><br>0.867</td>
|
179 |
+
<td><br>0.852</td>
|
180 |
+
<td><br>0.977</td>
|
181 |
+
<td><br>0.954</td>
|
182 |
</tr>
|
183 |
</tbody>
|
184 |
</table>
|
|
|
202 |
<tr>
|
203 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
204 |
<td><br>Precision</td>
|
205 |
+
<td><br>0.976</td>
|
206 |
+
<td><br>0.961</td>
|
207 |
+
<td><br>0.91</td>
|
208 |
+
<td><br>0.829</td>
|
209 |
+
<td><br>0.991</td>
|
210 |
+
<td><br>0.983</td>
|
211 |
</tr>
|
212 |
<tr>
|
213 |
<td><br>Recall</td>
|
214 |
+
<td><br>0.993</td>
|
215 |
+
<td><br>0.985</td>
|
216 |
+
<td><br>0.967</td>
|
217 |
+
<td><br>0.719</td>
|
218 |
+
<td><br>0.993</td>
|
219 |
+
<td><br>0.983</td>
|
220 |
</tr>
|
221 |
<tr>
|
222 |
<td>F1</td>
|
223 |
+
<td><br>0.985</td>
|
224 |
+
<td><br>0.973</td>
|
225 |
+
<td><br>0.938</td>
|
226 |
+
<td><br>0.770</td>
|
227 |
+
<td><br>0.992</td>
|
228 |
+
<td><br>0.983</td>
|
229 |
</tr>
|
230 |
</tbody>
|
231 |
</table>
|
|
|
250 |
<tr>
|
251 |
<td rowspan="3"><br>Camembert-base-frenchNER_4entities</td>
|
252 |
<td><br>Precision</td>
|
253 |
+
<td><br>0.970</td>
|
254 |
+
<td><br>0.944</td>
|
255 |
+
<td><br>0.872</td>
|
256 |
+
<td><br>0.878</td>
|
257 |
+
<td><br>0.996</td>
|
258 |
+
<td><br>0.986</td>
|
259 |
</tr>
|
260 |
<tr>
|
261 |
<td><br>Recall</td>
|
262 |
+
<td><br>0.969</td>
|
263 |
+
<td><br>0.947</td>
|
264 |
+
<td><br>0.880</td>
|
265 |
+
<td><br>0.866</td>
|
266 |
+
<td><br>0.996</td>
|
267 |
+
<td><br>0.986</td>
|
268 |
</tr>
|
269 |
<tr>
|
270 |
<td>F1</td>
|
271 |
+
<td><br>0.970</td>
|
272 |
+
<td><br>0.945</td>
|
273 |
+
<td><br>0.876</td>
|
274 |
+
<td><br>0.872</td>
|
275 |
+
<td><br>0.996</td>
|
276 |
+
<td><br>0.986</td>
|
277 |
</tr>
|
278 |
</tbody>
|
279 |
</table>
|
|
|
285 |
```python
|
286 |
from transformers import pipeline
|
287 |
|
288 |
+
ner = pipeline('token-classification', model='CATIE-AQ/Camembert-base-frenchNER_4entities', tokenizer='CATIE-AQ/Camembert-base-frenchNER_4entities', aggregation_strategy="simple")
|
289 |
|
290 |
+
results = ner(
|
291 |
"Assurés de disputer l'Euro 2024 en Allemagne l'été prochain (du 14 juin au 14 juillet) depuis leur victoire aux Pays-Bas, les Bleus ont fait le nécessaire pour avoir des certitudes. Avec six victoires en six matchs officiels et un seul but encaissé, Didier Deschamps a consolidé les acquis de la dernière Coupe du monde. Les joueurs clés sont connus : Kylian Mbappé, Aurélien Tchouameni, Antoine Griezmann, Ibrahima Konaté ou encore Mike Maignan."
|
292 |
)
|
293 |
|
294 |
+
|
295 |
+
# Note: the aggregation_strategy parameter does not return the results as expected, so some post-processing is needed
|
296 |
+
dict_to_del = []
|
297 |
+
for idx in range(len(results)-1):
|
298 |
+
    if results[idx]["end"] == results[idx+1]["start"]:
|
299 |
+
        results[idx+1]["word"] = results[idx]["word"]+results[idx+1]["word"]
|
300 |
+
        results[idx+1]["score"] = (results[idx]["score"]+results[idx+1]["score"])/2
|
301 |
+
        results[idx+1]["start"] = results[idx]["start"]
|
302 |
+
        dict_to_del.append(idx)
|
303 |
+
results = [j for i, j in enumerate(results) if i not in dict_to_del]
|
304 |
+
|
305 |
+
dict_to_del = []
|
306 |
+
for i in range(len(results)-1):
|
307 |
+
    if results[i]["end"] == results[i+1]["start"]-1:
|
308 |
+
        results[i+1]["word"] = results[i]["word"]+" "+results[i+1]["word"]
|
309 |
+
        results[i+1]["score"] = (results[i]["score"]+results[i+1]["score"])/2
|
310 |
+
        results[i+1]["start"] = results[i]["start"]
|
311 |
+
        dict_to_del.append(i)
|
312 |
+
results = [j for i, j in enumerate(results) if i not in dict_to_del]
|
313 |
+
|
314 |
print(results)
|
|
|
315 |
```python
|
316 |
+
[{'entity_group': 'MISC',
|
317 |
+
'score': 0.9404951632022858,
|
318 |
+
'word': 'Euro 2024',
|
319 |
+
'start': 22,
|
320 |
+
'end': 31},
|
321 |
+
{'entity_group': 'LOC',
|
322 |
+
'score': 0.96980727,
|
323 |
+
'word': 'Allemagne',
|
324 |
+
'start': 35,
|
325 |
+
'end': 44},
|
326 |
+
{'entity_group': 'LOC',
|
327 |
+
'score': 0.8612850904464722,
|
328 |
+
'word': 'Pays-Bas',
|
329 |
+
'start': 112,
|
330 |
+
'end': 120},
|
331 |
+
{'entity_group': 'ORG',
|
332 |
+
'score': 0.8148028254508972,
|
333 |
+
'word': 'les Bleus',
|
334 |
+
'start': 122,
|
335 |
+
'end': 131},
|
336 |
+
{'entity_group': 'PER',
|
337 |
+
'score': 0.9994482398033142,
|
338 |
+
'word': 'Didier Deschamps',
|
339 |
+
'start': 250,
|
340 |
+
'end': 266},
|
341 |
+
{'entity_group': 'MISC',
|
342 |
+
'score': 0.84807388484478,
|
343 |
+
'word': 'dernière Coupe du monde',
|
344 |
+
'start': 296,
|
345 |
+
'end': 319},
|
346 |
+
{'entity_group': 'PER',
|
347 |
+
'score': 0.9996860176324844,
|
348 |
+
'word': 'Kylian Mbappé',
|
349 |
+
'start': 352,
|
350 |
+
'end': 365},
|
351 |
+
{'entity_group': 'PER',
|
352 |
+
'score': 0.9996881932020187,
|
353 |
+
'word': 'Aurélien Tchouameni',
|
354 |
+
'start': 367,
|
355 |
+
'end': 386},
|
356 |
+
{'entity_group': 'PER',
|
357 |
+
'score': 0.9996924996376038,
|
358 |
+
'word': 'Antoine Griezmann',
|
359 |
+
'start': 388,
|
360 |
+
'end': 405},
|
361 |
+
{'entity_group': 'PER',
|
362 |
+
'score': 0.9996860027313232,
|
363 |
+
'word': 'Ibrahima Konaté',
|
364 |
+
'start': 407,
|
365 |
+
'end': 422},
|
366 |
+
{'entity_group': 'PER',
|
367 |
+
'score': 0.9996623992919922,
|
368 |
+
'word': 'Mike Maignan',
|
369 |
+
'start': 433,
|
370 |
+
'end': 445}]
|
371 |
```
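The post-processing added in this commit merges entity spans that touch or are separated by a single character. A minimal standalone sketch of that idea follows, assuming each entity is a dict with `word`, `score`, `start` and `end` keys as in the output above. The function name `merge_adjacent_entities` and the `max_gap` parameter are hypothetical, and the two loops of the README are folded into one pass, so averaged scores can differ slightly from the README's result when more than two pieces are merged.

```python
def merge_adjacent_entities(entities, max_gap=1):
    """Merge consecutive entity dicts whose character spans are at most
    `max_gap` characters apart. Input dicts are copied, not mutated."""
    merged = []
    for entity in entities:
        entity = dict(entity)  # work on a copy
        if merged and 0 <= entity["start"] - merged[-1]["end"] <= max_gap:
            previous = merged[-1]
            # Touching spans are joined directly, a one-character gap becomes a space.
            separator = " " * (entity["start"] - previous["end"])
            previous["word"] = previous["word"] + separator + entity["word"]
            # Same pairwise averaging as the README snippet.
            previous["score"] = (previous["score"] + entity["score"]) / 2
            previous["end"] = entity["end"]
        else:
            merged.append(entity)
    return merged


# Made-up scores and offsets, for illustration only.
example = [
    {"entity_group": "PER", "score": 1.0, "word": "Didier", "start": 250, "end": 256},
    {"entity_group": "PER", "score": 0.5, "word": "Deschamps", "start": 257, "end": 266},
    {"entity_group": "PER", "score": 1.0, "word": "Tchou", "start": 376, "end": 381},
    {"entity_group": "PER", "score": 1.0, "word": "ameni", "start": 381, "end": 386},
]
merged_example = merge_adjacent_entities(example)
print([entity["word"] for entity in merged_example])
```

Like the README code, this sketch does not compare `entity_group` values before merging; adding that check would be a reasonable extension if neighbouring spans of different types should stay separate.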
|
372 |
|
373 |
### Try it through Space
|