prithivMLmods/deepfake-detector-model-v1

@@ -7,135 +7,91 @@ tags:
 - ViT
 - detection
 - Image
-- transformers-4.49.0.dev0
 base_model:
 - google/vit-base-patch16-224-in21k
 ---
-![df[ViT].gif](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Xbuv-x40-l3QjzWu5Yj2F.gif)
-# **Deep-Fake-Detector-Model**
-# **Overview**
-The **Deep-Fake-Detector-Model** is a state-of-the-art deep learning model designed to detect deepfake images. It leverages the **Vision Transformer (ViT)** architecture, specifically the `google/vit-base-patch16-224-in21k` model, fine-tuned on a dataset of real and deepfake images. The model is trained to classify images as either "Real" or "Fake" with high accuracy, making it a powerful tool for detecting manipulated media.
-**<span style="color:red;">Update :</span>** The previous model checkpoint was obtained using a smaller classification dataset. Although it performed well in evaluation scores, its real-time performance was average due to limited variations in the training set. The new update includes a larger dataset to improve the detection of fake images.
-| Repository | Link |
-|------------|------|
-| Deep Fake Detector Model | [GitHub Repository](https://github.com/PRITHIVSAKTHIUR/Deep-Fake-Detector-Model) |
-# **Key Features**
-- **Architecture**: Vision Transformer (ViT) - `google/vit-base-patch16-224-in21k`.
-- **Input**: RGB images resized to 224x224 pixels.
-- **Output**: Binary classification ("Real" or "Fake").
-- **Training Dataset**: A curated dataset of real and deepfake images.
-- **Fine-Tuning**: The model is fine-tuned using Hugging Face's `Trainer` API with advanced data augmentation techniques.
-- **Performance**: Achieves high accuracy and F1 score on validation and test datasets.
-# **Model Architecture**
-The model is based on the **Vision Transformer (ViT)**, which treats images as sequences of patches and applies a transformer encoder to learn spatial relationships. Key components include:
-- **Patch Embedding**: Divides the input image into fixed-size patches (16x16 pixels).
-- **Transformer Encoder**: Processes patch embeddings using multi-head self-attention mechanisms.
-- **Classification Head**: A fully connected layer for binary classification.
-# **Training Details**
-- **Optimizer**: AdamW with a learning rate of `1e-6`.
-- **Batch Size**: 32 for training, 8 for evaluation.
-- **Epochs**: 2.
-- **Data Augmentation**:
-  - Random rotation (±90 degrees).
-  - Random sharpness adjustment.
-  - Random resizing and cropping.
-- **Loss Function**: Cross-Entropy Loss.
-- **Evaluation Metrics**: Accuracy, F1 Score, and Confusion Matrix.
-# **Inference with Hugging Face Pipeline**
-```python
-from transformers import pipeline
-# Load the model
-pipe = pipeline('image-classification', model="prithivMLmods/Deep-Fake-Detector-Model", device=0)
-# Predict on an image
-result = pipe("path_to_image.jpg")
-print(result)
-```
-# **Inference with PyTorch**
 ```python
-from transformers import ViTForImageClassification, ViTImageProcessor
 from PIL import Image
 import torch
-# Load the model and processor
-model = ViTForImageClassification.from_pretrained("prithivMLmods/Deep-Fake-Detector-Model")
-processor = ViTImageProcessor.from_pretrained("prithivMLmods/Deep-Fake-Detector-Model")
-# Load and preprocess the image
-image = Image.open("path_to_image.jpg").convert("RGB")
-inputs = processor(images=image, return_tensors="pt")
-# Perform inference
-with torch.no_grad():
-    outputs = model(**inputs)
-    logits = outputs.logits
-    predicted_class = torch.argmax(logits, dim=1).item()
-# Map class index to label
-label = model.config.id2label[predicted_class]
-print(f"Predicted Label: {label}")
-```
-# **Performance Metrics**
-```
-Classification report:
-              precision    recall  f1-score   support
-        Real     0.6276    0.9823    0.7659     38054
-        Fake     0.9594    0.4176    0.5819     38080
-    accuracy                         0.6999     76134
-   macro avg     0.7935    0.7000    0.6739     76134
-weighted avg     0.7936    0.6999    0.6739     76134
-```
-![Untitled.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/MoxwukbZZZuVpvXHstxsw.png)
-- **Confusion Matrix**:
-  ```
-  [[True Positives, False Negatives],
-   [False Positives, True Negatives]]
-  ```
-# **Dataset**
-The model is fine-tuned on the dataset, which contains:
-- **Real Images**: Authentic images of human faces.
-- **Fake Images**: Deepfake images generated using advanced AI techniques.
-# **Limitations**
-The model is trained on a specific dataset and may not generalize well to other deepfake datasets or domains.
-- Performance may degrade on low-resolution or heavily compressed images.
-- The model is designed for image classification and does not detect deepfake videos directly.
-# **Ethical Considerations**
-**Misuse**: This model should not be used for malicious purposes, such as creating or spreading deepfakes.
-**Bias**: The model may inherit biases from the training dataset. Care should be taken to ensure fairness and inclusivity.
-**Transparency**: Users should be informed when deepfake detection tools are used to analyze their content.
-# **Future Work**
-- Extend the model to detect deepfake videos.
-- Improve generalization by training on larger and more diverse datasets.
-- Incorporate explainability techniques to provide insights into model predictions.
-# **Citation**
-```bibtex
-@misc{Deep-Fake-Detector-Model,
-  author = {prithivMLmods},
-  title = {Deep-Fake-Detector-Model},
-  initial = {21 Mar 2024},
-  last_updated = {31 Jan 2025}
-}

 - ViT
 - detection
 - Image
 base_model:
 - google/vit-base-patch16-224-in21k
+datasets:
+- prithivMLmods/OpenDeepfake-Preview
+language:
+- en
 ---
+# deepfake-detector-model
+> deepfake-detector-model is a Vision Transformer (ViT) model fine-tuned from `google/vit-base-patch16-224-in21k` for binary image classification. It is trained on the *OpenDeepfake-Preview* dataset to classify an image as fake or real, and it uses the `ViTForImageClassification` architecture.
+---
+## Label Space: 2 Classes
+The model classifies an image as either:
+```
+Class 0: fake
+Class 1: real
+```
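As a minimal sketch of how this label space maps onto model outputs, the snippet below turns a pair of raw logits into per-label probabilities. The `ID2LABEL` dictionary and the `logits_to_prediction` helper are illustrative names, not part of the model's code, and the example logits are made up. Only the standard library is used, so the sketch runs without `torch`.

```python
import math

# Assumed mapping, copied from the label space listed above.
ID2LABEL = {0: "fake", 1: "real"}


def logits_to_prediction(logits):
    """Convert raw logits (one per class) into a {label: probability} dict."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {ID2LABEL[i]: e / total for i, e in enumerate(exps)}


# Hypothetical logits for one image: index 0 is "fake", index 1 is "real".
prediction = logits_to_prediction([2.0, 0.0])
top_label = max(prediction, key=prediction.get)
print(top_label, round(prediction[top_label], 3))
```

With the real model, the same index order applies to `outputs.logits`, provided the checkpoint's label configuration matches the two classes listed here.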
+---
+## Install Dependencies
+```bash
+pip install -q transformers torch pillow gradio hf_xet
+```
+---
+## Inference Code
 ```python
+import gradio as gr
+from transformers import ViTImageProcessor, ViTForImageClassification
 from PIL import Image
 import torch
+# Load model and processor
+model_name = "prithivMLmods/deepfake-detector-model-v1"
+model = ViTForImageClassification.from_pretrained(model_name)
+processor = ViTImageProcessor.from_pretrained(model_name)
+# Label mapping: index 0 = fake, index 1 = real
+labels_list = ['fake', 'real']
+def classify_image(image):
+    image = Image.fromarray(image).convert("RGB")
+    inputs = processor(images=image, return_tensors="pt")
+    with torch.no_grad():
+        outputs = model(**inputs)
+        logits = outputs.logits
+        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+    prediction = {
+        labels_list[i]: round(probs[i], 3) for i in range(len(probs))
+    }
+    return prediction
+# Gradio Interface
+iface = gr.Interface(
+    fn=classify_image,
+    inputs=gr.Image(type="numpy"),
+    outputs=gr.Label(num_top_classes=2, label="Deepfake Detection"),
+    title="deepfake-detector-model",
+    description="Upload an image to check whether it is AI-generated (fake) or a real photograph (real). The model was fine-tuned on the OpenDeepfake-Preview dataset."
+)
+if __name__ == "__main__":
+    iface.launch()
+```
+---
+## Intended Use
+`deepfake-detector-model` is designed for:
+* **Deepfake Detection** – Identify AI-generated or manipulated images.
+* **Content Moderation** – Flag synthetic or fake visual content.
+* **Dataset Curation** – Remove synthetic samples from mixed datasets.
+* **Visual Authenticity Verification** – Check the integrity of visual media.
+* **Digital Forensics** – Support image source verification and traceability.
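For the dataset curation use case, one possible approach is to threshold the "real" probability and keep only the samples that pass. The sketch below assumes prediction dictionaries of the same shape that `classify_image` returns; the `keep_real_samples` helper, the file names, the scores, and the 0.8 threshold are all hypothetical and would need tuning on real data.

```python
def keep_real_samples(predictions, min_real_prob=0.5):
    """Return the paths whose 'real' probability meets the threshold.

    `predictions` maps an image path to a {"fake": p, "real": q} dict,
    the same shape that `classify_image` returns.
    """
    return [
        path
        for path, probs in predictions.items()
        if probs["real"] >= min_real_prob
    ]


# Made-up scores for three hypothetical files.
scores = {
    "a.jpg": {"fake": 0.91, "real": 0.09},
    "b.jpg": {"fake": 0.12, "real": 0.88},
    "c.jpg": {"fake": 0.45, "real": 0.55},
}
kept = keep_real_samples(scores, min_real_prob=0.8)
print(kept)
```

A stricter threshold discards more borderline images, which trades dataset size for a lower chance of keeping synthetic samples.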