add per bucket std description
README.md
CHANGED
@@ -358,13 +358,14 @@ The [notebook](https://huggingface.co/spaces/HuggingFaceM4/m4-bias-eval/blob/mai
 
 Besides, we also computed the classification accuracy on FairFace for both the base and instructed models:
 
-| Model | Shots | <nobr>FairFaceGender<br>acc. (std)</nobr> | <nobr>FairFaceRace<br>acc. (std)</nobr> | <nobr>FairFaceAge<br>acc. (std)</nobr> |
+| Model | Shots | <nobr>FairFaceGender<br>acc. (std*)</nobr> | <nobr>FairFaceRace<br>acc. (std*)</nobr> | <nobr>FairFaceAge<br>acc. (std*)</nobr> |
 | :--------------------- | --------: | ----------------------------: | --------------------------: | -------------------------: |
 | IDEFICS 80B | 0 | 95.8 (1.0) | 64.1 (16.1) | 51.0 (2.9) |
 | IDEFICS 9B | 0 | 94.4 (2.2) | 55.3 (13.0) | 45.1 (2.9) |
 | IDEFICS 80B Instruct | 0 | 95.7 (2.4) | 63.4 (25.6) | 47.1 (2.9) |
 | IDEFICS 9B Instruct | 0 | 92.7 (6.3) | 59.6 (22.2) | 43.9 (3.9) |
 
+*Per bucket standard deviation. Each bucket represents a combination of race and gender from the [FairFace](https://huggingface.co/datasets/HuggingFaceM4/FairFace) dataset.
 ## Other limitations
 
 - The model currently will offer medical diagnosis when prompted to do so. For example, the prompt `Does this X-ray show any medical problems?` along with an image of a chest X-ray returns `Yes, the X-ray shows a medical problem, which appears to be a collapsed lung.`
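The footnote added in this commit describes the `std*` column as a per bucket standard deviation, with one bucket per race and gender combination. A minimal sketch of how such a figure could be computed follows. It is not taken from the evaluation notebook: the function name `accuracy_with_bucket_std`, the `(race, gender, is_correct)` record layout, the use of the population standard deviation, and the choice to report accuracy over all samples are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def accuracy_with_bucket_std(records):
    """Return (overall accuracy %, std of per-bucket accuracy %).

    `records` is an iterable of (race, gender, is_correct) tuples.
    This layout is hypothetical, not the notebook's actual data format.
    """
    # Group the 0/1 outcomes by (race, gender) bucket.
    buckets = defaultdict(list)
    for race, gender, is_correct in records:
        buckets[(race, gender)].append(1.0 if is_correct else 0.0)

    # Accuracy inside each bucket, as a percentage.
    per_bucket_acc = [100.0 * mean(hits) for hits in buckets.values()]

    # Overall accuracy over all samples (assumption: the table reports
    # this rather than the mean of the per-bucket accuracies).
    all_hits = [h for hits in buckets.values() for h in hits]
    overall_acc = 100.0 * mean(all_hits)

    # Spread of accuracy across buckets (assumption: population std).
    return overall_acc, pstdev(per_bucket_acc)


# Toy data: two buckets with accuracies of 100% and 50%.
toy = [
    ("A", "Female", True),
    ("A", "Female", True),
    ("B", "Male", True),
    ("B", "Male", False),
]
overall, bucket_std = accuracy_with_bucket_std(toy)
print(f"{overall:.1f} ({bucket_std:.1f})")
```

Under this reading, a large value in parentheses (as in the FairFaceRace column) would indicate that accuracy varies considerably between race and gender groups, even when the headline accuracy looks similar across models.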