nielsr (HF staff) committed
Commit 6378f75 · verified · 1 parent: 24e29d0

Add pipeline tag and license to model card, link to code


This PR adds the missing `pipeline_tag` and `license` to the model card metadata. The `pipeline_tag` is set to `image-text-to-text`, based on the model's functionality as described in the abstract and usage example.
It also adds a link to the GitHub repository so the model is easier to find and use.
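The metadata change described here amounts to two new keys in the README's YAML front matter. As a rough illustration (the `parse_front_matter` helper below is hypothetical, not part of the model card or any Hub tooling, and only handles flat `key: value` lines), the resulting keys can be checked like this:

```python
def parse_front_matter(readme_text: str) -> dict:
    """Return the flat key/value pairs between the leading `---` fences.

    Minimal sketch: a real model card should be parsed with a YAML library
    or huggingface_hub's ModelCard helpers, not this hand-rolled scanner.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing fence ends the metadata block
            break
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

# Front matter as it looks after this PR's change.
card = """---
library_name: transformers
tags: []
pipeline_tag: image-text-to-text
license: mit
---

# Fine-Grained Visual Classification on HAM10000
"""

print(parse_front_matter(card)["pipeline_tag"])  # image-text-to-text
```

With both keys present, the Hub can index the model under the `image-text-to-text` pipeline and display its license.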

Files changed (1): README.md (+17 −10)

README.md CHANGED

@@ -1,6 +1,8 @@
 ---
 library_name: transformers
 tags: []
+pipeline_tag: image-text-to-text
+license: mit
 ---
 
 # Fine-Grained Visual Classification on HAM10000
@@ -9,16 +11,15 @@ Project Page: [SelfSynthX](https://github.com/sycny/SelfSynthX).
 
 Paper on arXiv: [Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data](https://arxiv.org/abs/2502.14044)
 
-
-This model is a fine-tuned multimodal foundation model developed on the [LLaVA-1.5-7B-hf](https://huggingface.co/llava-hf/llava-1.5-7b-hf) base, optimized for fine-grained skin lesion classification and explainability using the HAM10000 dataset.
+This model is a fine-tuned multimodal foundation model developed on the [LLaVA-1.5-7B-hf](https://huggingface.co/llava-hf/llava-1.5-7B-hf) base, optimized for fine-grained skin lesion classification and explainability using the HAM10000 dataset.
 
 ## Key Details
 
-- **Base Model:** LLaVA-1.5-7B
+- **Base Model:** LLaVA-1.5-7B
 - **Dataset:** HAM10000
-- **Innovation:**
-  - **Self-Synthesized Data:** Generates interpretable explanations by extracting lesion-specific visual concepts using the Information Bottleneck principle.
-  - **Iterative Fine-Tuning:** Uses reward model-free rejection sampling to progressively improve classification accuracy and explanation quality.
+- **Innovation:**
+  - **Self-Synthesized Data:** Generates interpretable explanations by extracting lesion-specific visual concepts using the Information Bottleneck principle.
+  - **Iterative Fine-Tuning:** Uses reward model-free rejection sampling to progressively improve classification accuracy and explanation quality.
 - **Intended Use:** Skin lesion classification with human-verifiable explanations for dermatological analysis.
 
 ## How to Use
@@ -31,8 +32,8 @@ from transformers import AutoProcessor, LlavaForConditionalGeneration
 
 model_id = "YuchengShi/LLaVA-v1.5-7B-HAM10000"
 model = LlavaForConditionalGeneration.from_pretrained(
-    model_id,
-    torch_dtype=torch.float16,
+    model_id,
+    torch_dtype=torch.float16,
     low_cpu_mem_usage=True,
 ).to("cuda")
 processor = AutoProcessor.from_pretrained(model_id)
@@ -57,7 +58,7 @@ print(processor.decode(output[0][2:], skip_special_tokens=True))
 
 ## Training & Evaluation
 
-- **Training:** Fine-tuned using LoRA on HAM10000 with iterative rejection sampling.
+- **Training:** Fine-tuned using LoRA on HAM10000 with iterative rejection sampling.
 - **Evaluation:** Demonstrates higher accuracy and robust, interpretable explanations compared to baseline models.
 
 ## Citation
@@ -73,4 +74,10 @@ If you use this model, please cite:
   year={2025},
   url={https://openreview.net/forum?id=lHbLpwbEyt}
 }
-```
+```
+
+## Contact
+
+For any questions, suggestions, or issues, please open an issue on GitHub or contact us at [[email protected]](mailto:[email protected]).
+
+Github repository: https://github.com/sycny/SelfSynthX