prithivMLmods committed on
Commit 7b35fb0 · verified · 1 Parent(s): 19c1260

Update README.md

Files changed (1)
  1. README.md +58 -102
README.md CHANGED
@@ -7,135 +7,91 @@ tags:
7
  - ViT
8
  - detection
9
  - Image
10
- - transformers-4.49.0.dev0
11
  base_model:
12
  - google/vit-base-patch16-224-in21k
 
 
 
 
13
  ---
14
 
15
- ![df[ViT].gif](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Xbuv-x40-l3QjzWu5Yj2F.gif)
16
 
17
- # **Deep-Fake-Detector-Model**
18
 
19
- # **Overview**
20
-
21
- The **Deep-Fake-Detector-Model** is a state-of-the-art deep learning model designed to detect deepfake images. It leverages the **Vision Transformer (ViT)** architecture, specifically the `google/vit-base-patch16-224-in21k` model, fine-tuned on a dataset of real and deepfake images. The model is trained to classify images as either "Real" or "Fake" with high accuracy, making it a powerful tool for detecting manipulated media.
22
 
23
- **<span style="color:red;">Update :</span>** The previous model checkpoint was obtained using a smaller classification dataset. Although it performed well in evaluation scores, its real-time performance was average due to limited variations in the training set. The new update includes a larger dataset to improve the detection of fake images.
24
 
25
- | Repository | Link |
26
- |------------|------|
27
- | Deep Fake Detector Model | [GitHub Repository](https://github.com/PRITHIVSAKTHIUR/Deep-Fake-Detector-Model) |
28
 
29
- # **Key Features**
30
- - **Architecture**: Vision Transformer (ViT) - `google/vit-base-patch16-224-in21k`.
31
- - **Input**: RGB images resized to 224x224 pixels.
32
- - **Output**: Binary classification ("Real" or "Fake").
33
- - **Training Dataset**: A curated dataset of real and deepfake images.
34
- - **Fine-Tuning**: The model is fine-tuned using Hugging Face's `Trainer` API with advanced data augmentation techniques.
35
- - **Performance**: Achieves high accuracy and F1 score on validation and test datasets.
36
 
37
- # **Model Architecture**
38
- The model is based on the **Vision Transformer (ViT)**, which treats images as sequences of patches and applies a transformer encoder to learn spatial relationships. Key components include:
39
- - **Patch Embedding**: Divides the input image into fixed-size patches (16x16 pixels).
40
- - **Transformer Encoder**: Processes patch embeddings using multi-head self-attention mechanisms.
41
- - **Classification Head**: A fully connected layer for binary classification.
42
 
43
- # **Training Details**
44
- - **Optimizer**: AdamW with a learning rate of `1e-6`.
45
- - **Batch Size**: 32 for training, 8 for evaluation.
46
- - **Epochs**: 2.
47
- - **Data Augmentation**:
48
-   - Random rotation (±90 degrees).
49
-   - Random sharpness adjustment.
50
-   - Random resizing and cropping.
51
- - **Loss Function**: Cross-Entropy Loss.
52
- - **Evaluation Metrics**: Accuracy, F1 Score, and Confusion Matrix.
53
 
54
- # **Inference with Hugging Face Pipeline**
55
- ```python
56
- from transformers import pipeline
57
 
58
- # Load the model
59
- pipe = pipeline('image-classification', model="prithivMLmods/Deep-Fake-Detector-Model", device=0)
60
 
61
- # Predict on an image
62
- result = pipe("path_to_image.jpg")
63
- print(result)
64
- ```
65
 
66
- # **Inference with PyTorch**
67
  ```python
68
- from transformers import ViTForImageClassification, ViTImageProcessor
 
69
  from PIL import Image
70
  import torch
71
 
72
- # Load the model and processor
73
- model = ViTForImageClassification.from_pretrained("prithivMLmods/Deep-Fake-Detector-Model")
74
- processor = ViTImageProcessor.from_pretrained("prithivMLmods/Deep-Fake-Detector-Model")
 
75
 
76
- # Load and preprocess the image
77
- image = Image.open("path_to_image.jpg").convert("RGB")
78
- inputs = processor(images=image, return_tensors="pt")
79
 
80
- # Perform inference
81
- with torch.no_grad():
82
-     outputs = model(**inputs)
83
-     logits = outputs.logits
84
-     predicted_class = torch.argmax(logits, dim=1).item()
85
 
86
- # Map class index to label
87
- label = model.config.id2label[predicted_class]
88
- print(f"Predicted Label: {label}")
89
- ```
90
- # **Performance Metrics**
91
- ```
92
- Classification report:
93
-
94
-               precision    recall  f1-score   support
95
 
96
-         Real     0.6276    0.9823    0.7659     38054
97
-         Fake     0.9594    0.4176    0.5819     38080
98
-
99
-     accuracy                         0.6999     76134
100
-    macro avg     0.7935    0.7000    0.6739     76134
101
- weighted avg     0.7936    0.6999    0.6739     76134
102
- ```
103
 
104
- ![Untitled.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/MoxwukbZZZuVpvXHstxsw.png)
105
 
106
- - **Confusion Matrix**:
107
- ```
108
- [[True Positives, False Negatives],
109
- [False Positives, True Negatives]]
110
- ```
 
 
 
111
 
112
- # **Dataset**
113
- The model is fine-tuned on the dataset, which contains:
114
- - **Real Images**: Authentic images of human faces.
115
- - **Fake Images**: Deepfake images generated using advanced AI techniques.
116
-
117
- # **Limitations**
118
- The model is trained on a specific dataset and may not generalize well to other deepfake datasets or domains.
119
- - Performance may degrade on low-resolution or heavily compressed images.
120
- - The model is designed for image classification and does not detect deepfake videos directly.
121
-
122
- # **Ethical Considerations**
123
 
124
- **Misuse**: This model should not be used for malicious purposes, such as creating or spreading deepfakes.
125
- **Bias**: The model may inherit biases from the training dataset. Care should be taken to ensure fairness and inclusivity.
126
- **Transparency**: Users should be informed when deepfake detection tools are used to analyze their content.
127
 
128
- # **Future Work**
129
- - Extend the model to detect deepfake videos.
130
- - Improve generalization by training on larger and more diverse datasets.
131
- - Incorporate explainability techniques to provide insights into model predictions.
132
 
133
- # **Citation**
134
 
135
- ```bibtex
136
- @misc{Deep-Fake-Detector-Model,
137
- author = {prithivMLmods},
138
- title = {Deep-Fake-Detector-Model},
139
- initial = {21 Mar 2024},
140
- last_updated = {31 Jan 2025}
141
- }
 
7
  - ViT
8
  - detection
9
  - Image
 
10
  base_model:
11
  - google/vit-base-patch16-224-in21k
12
+ datasets:
13
+ - prithivMLmods/OpenDeepfake-Preview
14
+ language:
15
+ - en
16
  ---
17
 
18
+ # deepfake-detector-model
19
 
20
+ > deepfake-detector-model is a vision model fine-tuned from `google/vit-base-patch16-224-in21k` for binary image classification. It is trained on the *OpenDeepfake-Preview* dataset to classify an image as fake or real, and it uses the `ViTForImageClassification` architecture.
21
 
22
+ ---
 
 
23
 
24
+ ## Label Space: 2 Classes
25
 
26
+ The model classifies an image as either:
 
 
27
 
28
+ ```
29
+ Class 0: fake
30
+ Class 1: real
31
+ ```
 
 
 
32
 
33
+ ---
 
 
 
 
34
 
35
+ ## Install Dependencies
 
 
 
 
 
 
 
 
 
36
 
37
+ ```bash
38
+ pip install -q transformers torch pillow gradio hf_xet
39
+ ```
40
 
41
+ ---
 
42
 
43
+ ## Inference Code
 
 
 
44
 
 
45
  ```python
46
+ import gradio as gr
47
+ from transformers import ViTImageProcessor, ViTForImageClassification
48
  from PIL import Image
49
  import torch
50
 
51
+ # Load model and processor
52
+ model_name = "your-username/deepfake-detector-model"  # placeholder: replace with the actual Hub repository ID
53
+ model = ViTForImageClassification.from_pretrained(model_name)
54
+ processor = ViTImageProcessor.from_pretrained(model_name)
55
 
56
+ # Updated label mapping
57
+ labels_list = ['fake', 'real']
 
58
 
59
+ def classify_image(image):
60
+     image = Image.fromarray(image).convert("RGB")
61
+     inputs = processor(images=image, return_tensors="pt")
62
 
63
+     with torch.no_grad():
64
+         outputs = model(**inputs)
65
+         logits = outputs.logits
66
+         probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
67
 
68
+     prediction = {
69
+         labels_list[i]: round(probs[i], 3) for i in range(len(probs))
70
+     }
71
 
72
+     return prediction
73
 
74
+ # Gradio Interface
75
+ iface = gr.Interface(
76
+     fn=classify_image,
77
+     inputs=gr.Image(type="numpy"),
78
+     outputs=gr.Label(num_top_classes=2, label="Deepfake Detection"),
79
+     title="deepfake-detector-model",
80
+     description="Upload an image to detect whether it is AI-generated (fake) or a real photograph (real), using a model trained on the OpenDeepfake-Preview dataset."
81
+ )
82
 
83
+ if __name__ == "__main__":
84
+     iface.launch()
85
+ ```
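The logit-to-label mapping inside `classify_image` can be checked without downloading the model. The following is a minimal sketch, assuming the logits arrive as a plain list of two floats in the order given by the label space (index 0 is `fake`, index 1 is `real`). The helper names `softmax` and `to_prediction` are hypothetical and do not appear in the README.

```python
import math

# Index order taken from the README's label space: 0 -> fake, 1 -> real.
LABELS = ["fake", "real"]


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits, labels=LABELS):
    """Map raw logits to {label: probability rounded to 3 places}."""
    probs = softmax(logits)
    return {label: round(p, 3) for label, p in zip(labels, probs)}


# Hypothetical logits, used only to illustrate the mapping.
prediction = to_prediction([2.0, 0.0])
print(prediction)
```

The returned dictionary has the same `{label: probability}` shape that `gr.Label` receives from `classify_image`, so the label with the larger value is the predicted class.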
 
 
 
 
 
 
 
 
86
 
87
+ ---
 
 
88
 
89
+ ## Intended Use
 
 
 
90
 
91
+ `deepfake-detector-model` is designed for:
92
 
93
+ * **Deepfake Detection** – Identify AI-generated or manipulated images.
94
+ * **Content Moderation** – Flag synthetic or fake visual content.
95
+ * **Dataset Curation** – Remove synthetic samples from mixed datasets.
96
+ * **Visual Authenticity Verification** – Check the integrity of visual media.
97
+ * **Digital Forensics** – Support image source verification and traceability.
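The dataset-curation use case listed above could be approached roughly as follows. This is a sketch under stated assumptions, not part of the README: it assumes each image has already been scored into the `{label: probability}` dictionary that `classify_image` returns, and the function name `split_by_fake_score`, the file names, and the 0.5 threshold are all hypothetical.

```python
def split_by_fake_score(predictions, threshold=0.5):
    """Partition image names by their 'fake' probability.

    predictions: dict mapping an image name to {"fake": p, "real": 1 - p}.
    threshold:   assumed cut-off; images at or above it are flagged.
    """
    kept, flagged = [], []
    for name, scores in predictions.items():
        if scores["fake"] >= threshold:
            flagged.append(name)
        else:
            kept.append(name)
    return kept, flagged


# Hypothetical scores, shaped like the output of classify_image.
example = {
    "a.jpg": {"fake": 0.912, "real": 0.088},
    "b.jpg": {"fake": 0.041, "real": 0.959},
}
kept, flagged = split_by_fake_score(example)
print("kept:", kept)
print("flagged:", flagged)
```

A stricter or looser threshold trades missed fakes against wrongly discarded real images, so the cut-off is best chosen on a held-out sample rather than fixed at the default shown here.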