ritvik77 committed on
Commit
3a5eca9
·
verified ·
1 Parent(s): 582ef09

Update README.md

Files changed (1)
  1. README.md +45 -25
README.md CHANGED
@@ -19,14 +19,12 @@ tags:
 ---
 # Model Card for Model ID
 
- <!-- 🩺 Medical Diagnosis AI Model - Powered by Mistral-7B & LoRA 🚀
 🔹 Model Overview:
-
 Base Model: Mistral-7B (7.7 billion parameters)
 Fine-Tuning Method: LoRA (Low-Rank Adaptation)
 Quantization: bnb_4bit (reduces memory footprint while retaining performance)
 🔹 Parameter Details:
-
 Original Mistral-7B Parameters: 7.7 billion
 LoRA Fine-Tuned Parameters: ~4.48% of total model parameters (~340 million)
 Final Merged Model Size (bnb_4bit Quantized): ~4.5GB
@@ -37,14 +35,13 @@ Final Merged Model Size (bnb_4bit Quantized): ~4.5GB
 
 🔹 Use Case:
 Designed to assist healthcare professionals by offering clear, evidence-backed insights for improved clinical decision-making.
-
- 📝 Note: While this model offers valuable insights, it's intended to support — not replace — professional medical judgment. -->
 
 This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
 
 ## Model Details
 Original Mistral-7B Parameters: 7.7 billion
- LoRA Fine-Tuned Parameters: ~4.48% of total model parameters (~340 million)
 Final Merged Model Size (bnb_4bit Quantized): ~4.5GB
 🔹 Key Features:
 ✅ Accurate Diagnoses for symptoms like chest pain, dizziness, and breathlessness
@@ -52,10 +49,10 @@ Final Merged Model Size (bnb_4bit Quantized): ~4.5GB
 ✅ Efficient Inference with reduced VRAM usage (ideal for GPUs with limited memory)
 
 ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
-
 
 - **Developed by:** [Ritvik Gaur]
 - **Funded by [optional]:** [More Information Needed]
@@ -74,18 +71,6 @@ Final Merged Model Size (bnb_4bit Quantized): ~4.5GB
 - **Demo [optional]:** [More Information Needed]
 
 ## Uses
- Python code for usage:
-
- from transformers import AutoModelForCausalLM, AutoTokenizer
- # ✅ Load the uploaded model
- model = AutoModelForCausalLM.from_pretrained("ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct")
- tokenizer = AutoTokenizer.from_pretrained("ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct")
- # ✅ Sample inference
- prompt = "Patient reports chest pain and dizziness. What's the likely diagnosis?"
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
-
- outputs = model.generate(**inputs, max_new_tokens=300)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 ### Direct Use
@@ -106,8 +91,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 [More Information Needed]
 
 ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
 [More Information Needed]
 
@@ -121,6 +105,18 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 
 Use the code below to get started with the model.
 
 [More Information Needed]
 
 ## Training Details
@@ -142,7 +138,31 @@ Use the code below to get started with the model.
 
 #### Training Hyperparameters
 
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 #### Speeds, Sizes, Times [optional]
 
 ---
 # Model Card for Model ID
 
+ 🩺 Medical Diagnosis AI Model - Powered by Mistral-7B & LoRA 🚀
 🔹 Model Overview:
 Base Model: Mistral-7B (7.7 billion parameters)
 Fine-Tuning Method: LoRA (Low-Rank Adaptation)
 Quantization: bnb_4bit (reduces memory footprint while retaining performance)
 🔹 Parameter Details:
 Original Mistral-7B Parameters: 7.7 billion
 LoRA Fine-Tuned Parameters: ~4.48% of total model parameters (~340 million)
 Final Merged Model Size (bnb_4bit Quantized): ~4.5GB
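
As a quick illustration of the bnb_4bit setup described above, here is a minimal loading sketch, assuming `transformers`, `bitsandbytes`, and a CUDA GPU are available. The exact quantization settings behind the published ~4.5GB checkpoint are not stated in this card, so the `BitsAndBytesConfig` values below are common defaults rather than confirmed ones.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit config: common bitsandbytes defaults,
# not the card's confirmed settings.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```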
 
 
 🔹 Use Case:
 Designed to assist healthcare professionals by offering clear, evidence-backed insights for improved clinical decision-making.
+ 📝 Note: While this model offers valuable insights, it's intended to support, not replace, professional medical judgment.
 
 
 This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
 
 ## Model Details
 Original Mistral-7B Parameters: 7.7 billion
+ LoRA Fine-Tuned Parameters: ~4.48% of total model parameters (~340 million)
 Final Merged Model Size (bnb_4bit Quantized): ~4.5GB
 🔹 Key Features:
 ✅ Accurate Diagnoses for symptoms like chest pain, dizziness, and breathlessness
 
 ✅ Efficient Inference with reduced VRAM usage (ideal for GPUs with limited memory)
 
 ### Model Description
+ This model builds on Mistral-7B, known for its strong reasoning capabilities and deep language understanding. Through LoRA fine-tuning, it is specialized for medical tasks such as:
+ ✅ Diagnosing conditions from symptoms such as chest pain, dizziness, and shortness of breath
+ ✅ Providing detailed, step-by-step medical reasoning using Chain-of-Thought (CoT) prompting, as illustrated in the sketch after this list
+ ✅ Generating confident, evidence-backed answers with improved precision
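
As a sketch of the CoT-style prompting mentioned in the list above: Mistral-Instruct checkpoints conventionally use the `[INST] ... [/INST]` chat format, but the exact template used during this fine-tune is not documented in the card, so treat the phrasing below as an assumption.

```python
# Hypothetical CoT-style prompt in the standard Mistral-Instruct format;
# the card does not document the template used during fine-tuning.
prompt = (
    "[INST] A patient reports chest pain, dizziness, and shortness of breath. "
    "Reason step by step through the differential diagnosis, then state the "
    "most likely diagnosis. [/INST]"
)
```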
 
 - **Developed by:** [Ritvik Gaur]
 - **Funded by [optional]:** [More Information Needed]
 
 - **Demo [optional]:** [More Information Needed]
 
 ## Uses
 
 ### Direct Use
 
 
 [More Information Needed]
 
 ## Bias, Risks, and Limitations
+ Please do not rely on this model alone for real-life illness: it is intended only to support verified healthcare applications that require an LLM.
 
 [More Information Needed]
 
 
 
 Use the code below to get started with the model.
 
+ Python code for usage:
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ # ✅ Load the uploaded model
+ model = AutoModelForCausalLM.from_pretrained("ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct")
+ tokenizer = AutoTokenizer.from_pretrained("ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct")
+ # ✅ Sample inference
+ prompt = "Patient reports chest pain and dizziness. What's the likely diagnosis?"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=300)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
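
Continuing from the snippet above (with `model` and `tokenizer` already loaded), a hedged variant: if the checkpoint ships a chat template, which is typical for Mistral-Instruct derivatives but not confirmed in this card, building the prompt with `apply_chat_template` avoids hand-formatting the `[INST]` tags.

```python
# Variant of the getting-started snippet using the tokenizer's chat
# template; assumes the checkpoint ships one (not confirmed in this card).
messages = [
    {"role": "user", "content": "Patient reports chest pain and dizziness. What's the likely diagnosis?"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```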
 [More Information Needed]
 
 ## Training Details
 
 #### Training Hyperparameters
 
+ - **Training regime:** bf16 (see the table below)
+ 
+ | Parameter | Value | Description |
+ |---|---|---|
+ | Base Model | mistralai/Mistral-7B-Instruct | Chosen for its strong reasoning capabilities |
+ | Fine-Tuning Framework | LoRA (Low-Rank Adaptation) | Efficiently fine-tuned only ~4.48% of total parameters |
+ | Quantization | bnb_4bit | Enabled for reduced VRAM consumption |
+ | Train Batch Size | 12 | Optimized to balance GPU utilization and convergence |
+ | Eval Batch Size | 12 | Matches training batch size to ensure stable evaluation |
+ | Gradient Accumulation Steps | 3 | Effective batch size = 36 for improved stability |
+ | Learning Rate | 3e-5 | Lowered to ensure smoother convergence |
+ | Warmup Ratio | 0.2 | Gradual learning-rate ramp-up for improved stability |
+ | Scheduler Type | Cosine | Ensures smooth and controlled learning-rate decay |
+ | Number of Epochs | 5 | Balanced to ensure convergence without overfitting |
+ | Max Gradient Norm | 0.5 | Prevents exploding gradients |
+ | Weight Decay | 0.08 | Regularization for improved generalization |
+ | bf16 Precision | True | Maximizes GPU utilization and precision |
+ | Gradient Checkpointing | Enabled | Reduces memory usage during training |
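
For readers who want to reproduce a similar run, the table maps directly onto `transformers.TrainingArguments`. The sketch below is a reconstruction from the table, not the author's actual training script (which is not included in this card); the output path is hypothetical.

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameter table above; illustrative only.
training_args = TrainingArguments(
    output_dir="./medical-mistral-lora",  # hypothetical path
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    gradient_accumulation_steps=3,  # effective batch size 12 * 3 = 36
    learning_rate=3e-5,
    warmup_ratio=0.2,
    lr_scheduler_type="cosine",
    num_train_epochs=5,
    max_grad_norm=0.5,
    weight_decay=0.08,
    bf16=True,
    gradient_checkpointing=True,
)
```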
+ 
+ 🔎 LoRA Configuration
+ 
+ | Parameter | Value | Description |
+ |---|---|---|
+ | Rank Dimension | 128 | Balanced for strong expressiveness without excessive memory overhead |
+ | LoRA Alpha | 128 | Ensures stable gradient updates |
+ | LoRA Dropout | 0.1 | Helps prevent overfitting |
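
These values correspond to a PEFT `LoraConfig` along the lines of the sketch below. The target modules are not listed in this card, so the ones shown are a common choice for Mistral-style models, not confirmed; setting alpha equal to the rank keeps the effective LoRA scaling (alpha / r) at 1.0.

```python
from peft import LoraConfig

# Sketch from the LoRA table above; target_modules are an assumption.
lora_config = LoraConfig(
    r=128,            # rank dimension
    lora_alpha=128,   # alpha == r keeps the LoRA scaling (alpha / r) at 1.0
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # common for Mistral (assumption)
)
```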
 
 #### Speeds, Sizes, Times [optional]