Update README.md
README.md
---
# Model Card for Model ID

🩺 Medical Diagnosis AI Model - Powered by Mistral-7B & LoRA 🚀

🔹 Model Overview:

- Base Model: Mistral-7B (7.7 billion parameters)
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- Quantization: bnb_4bit (reduces memory footprint while retaining performance)

🔹 Parameter Details:

- Original Mistral-7B Parameters: 7.7 billion
- LoRA Fine-Tuned Parameters: ~4.48% of total model parameters (~340 million)
- Final Merged Model Size (bnb_4bit Quantized): ~4.5 GB
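
Because the merged checkpoint is distributed in bnb_4bit form, it can be loaded in 4-bit roughly as sketched below. This is a minimal sketch assuming `bitsandbytes` is installed; the nf4 quant type and bfloat16 compute dtype are typical choices, not values confirmed by this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct"

# Assumed 4-bit settings; the exact quantization config is not stated in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers automatically across available devices
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```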

🔹 Use Case:
Designed to assist healthcare professionals by offering clear, evidence-backed insights for improved clinical decision-making.

📌 Note: While this model offers valuable insights, it is intended to support, not replace, professional medical judgment.

This model card aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

## Model Details

- Original Mistral-7B Parameters: 7.7 billion
- LoRA Fine-Tuned Parameters: ~4.48% of total model parameters (~340 million)
- Final Merged Model Size (bnb_4bit Quantized): ~4.5 GB

🔹 Key Features:

- ✅ Accurate diagnoses for symptoms like chest pain, dizziness, and breathlessness
- ✅ Efficient inference with reduced VRAM usage (ideal for GPUs with limited memory)

### Model Description

This model leverages the Mistral-7B language model, known for its strong reasoning capabilities and deep language understanding. Through LoRA fine-tuning, it now excels at medical-specific tasks such as:

- ✅ Diagnosing conditions from symptoms such as chest pain, dizziness, and shortness of breath
- ✅ Providing detailed, step-by-step medical reasoning using Chain-of-Thought (CoT) prompting, as in the sketch below
- ✅ Generating confident, evidence-backed answers with improved precision
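
For instance, step-by-step reasoning can be elicited with a CoT-style prompt like the hypothetical one below; the exact prompt format used during fine-tuning is not documented in this card:

```python
# Hypothetical CoT-style prompt; adjust to the format the model was trained on.
prompt = (
    "You are a careful medical assistant. Reason step by step before answering.\n"
    "Patient: chest pain radiating to the left arm, dizziness, shortness of breath.\n"
    "Question: What is the most likely diagnosis?\n"
    "Let's think step by step:"
)
```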

- **Developed by:** Ritvik Gaur
- **Funded by [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses
### Direct Use
[More Information Needed]
## Bias, Risks, and Limitations

Please do not rely on this model on its own for real-life medical conditions; it is intended only to support verified healthcare applications that require an LLM.
[More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

Python code for usage:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# ✅ Load the uploaded model
model = AutoModelForCausalLM.from_pretrained("ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("ritvik77/Medical_Doctor_AI_LoRA-Mistral-7B-Instruct")

# ✅ Sample inference
prompt = "Patient reports chest pain and dizziness. What's the likely diagnosis?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
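
Since the base checkpoint is the Instruct variant, wrapping the prompt in the tokenizer's chat template may yield better-formatted answers. A minimal sketch, assuming the repository ships a chat template (not confirmed by this card):

```python
# Optional: format the prompt with the model's chat template before generating.
messages = [{"role": "user", "content": prompt}]
chat_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(chat_inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```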
[More Information Needed]
## Training Details
#### Training Hyperparameters

- **Training regime:** bf16 mixed precision (see the table below)

| Parameter | Value | Description |
| --- | --- | --- |
| Base Model | mistralai/Mistral-7B-Instruct | Chosen for its strong reasoning capabilities |
| Fine-Tuning Framework | LoRA (Low-Rank Adaptation) | Efficiently fine-tuned only ~4.48% of total parameters |
| Quantization | bnb_4bit | Enabled for reduced VRAM consumption |
| Train Batch Size | 12 | Optimized to balance GPU utilization and convergence |
| Eval Batch Size | 12 | Matches training batch size to ensure stable evaluation |
| Gradient Accumulation Steps | 3 | Effective batch size = 36 for improved stability |
| Learning Rate | 3e-5 | Lowered to ensure smoother convergence |
| Warmup Ratio | 0.2 | Gradual learning-rate ramp-up for improved stability |
| Scheduler Type | Cosine | Ensures smooth and controlled learning-rate decay |
| Number of Epochs | 5 | Balanced to ensure convergence without overfitting |
| Max Gradient Norm | 0.5 | Prevents exploding gradients |
| Weight Decay | 0.08 | Regularization for improved generalization |
| bf16 Precision | True | Maximizes GPU utilization and precision |
| Gradient Checkpointing | Enabled | Reduces memory usage during training |
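
For reference, the table above maps onto a Hugging Face `TrainingArguments` object roughly as follows. This is a sketch reconstructed from the listed values, not the original training script; the `output_dir` path is hypothetical:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./mistral7b-medical-lora",  # hypothetical path
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    gradient_accumulation_steps=3,  # effective batch size = 12 * 3 = 36
    learning_rate=3e-5,
    warmup_ratio=0.2,
    lr_scheduler_type="cosine",
    num_train_epochs=5,
    max_grad_norm=0.5,
    weight_decay=0.08,
    bf16=True,
    gradient_checkpointing=True,
)
```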

🔹 LoRA Configuration

| Parameter | Value | Description |
| --- | --- | --- |
| Rank Dimension | 128 | Balanced for strong expressiveness without excessive memory overhead |
| LoRA Alpha | 128 | Ensures stable gradient updates |
| LoRA Dropout | 0.1 | Helps prevent overfitting |
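
In `peft` terms, these settings correspond roughly to the `LoraConfig` below. The `target_modules` list is an assumption (a common choice for Mistral-style attention layers); the card does not state which modules were adapted:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,            # rank dimension
    lora_alpha=128,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not confirmed
    task_type="CAUSAL_LM",
)
```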
#### Speeds, Sizes, Times [optional]