---
datasets:
- ideepankarsharma2003/ImageClassificationStableDiffusion_small
- ideepankarsharma2003/Midjourney_v6_Classification_small_shuffled
- ideepankarsharma2003/AIGeneratedImages_Midjourney
tags:
- image-classification
- ai-gen-images
---


# Model Card for AI Image Classification - Midjourney V6 & SDXL  

## Model Details  

### Model Description  

This model is a **Swin Transformer-based classifier** designed to distinguish between **AI-generated** and **human-created** images, specifically focusing on outputs from **Midjourney V6** and **Stable Diffusion XL (SDXL)**. It has been trained on a curated dataset of AI-generated images.  

- **Developed by:** Deepankar Sharma  
- **Model type:** Image Classification (Swin Transformer)  
- **Finetuned from model:** `SwinForImageClassification` (Transformers architecture class; the base checkpoint is not specified)  

### Model Sources  

- **Repository:** [Hugging Face Model Repository](https://huggingface.co/ideepankarsharma2003/AI_ImageClassification_MidjourneyV6_SDXL)  

## Uses  

### Direct Use  

This model can be used for **detecting AI-generated images** from Midjourney V6 and SDXL. It is useful for content moderation, fact-checking, and detecting synthetic media.  

### Out-of-Scope Use  

- The model is **not designed** for detecting AI-generated images from all generative models.  
- It **may not perform well** on heavily edited AI-generated images or images mixed with human elements.  
- It is **not intended for forensic-level deepfake detection**.  

## Bias, Risks, and Limitations  

This model is trained specifically on **Midjourney V6** and **Stable Diffusion XL** datasets. It may not generalize well to images generated by other AI models. Additionally, biases in the dataset could lead to **false positives** (flagging real images as AI-generated) or **false negatives** (failing to detect AI-generated content).  

### Recommendations  

Users should verify results with additional tools and **not solely rely on this model** for high-stakes decisions. Model performance should be tested on domain-specific datasets before deployment.  
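
One way to act on this recommendation is to treat the model's output as a triage signal rather than a verdict. The sketch below assumes a probability for the `ai_gen` label has already been computed (for example, via a softmax over the logits). The function name `triage` and the thresholds 0.90 and 0.10 are hypothetical placeholders, not values from this model card, and would need tuning on a domain-specific validation set.

```python
def triage(p_ai_gen: float, high: float = 0.90, low: float = 0.10) -> str:
    """Map P(ai_gen) to a decision, deferring uncertain cases to a human.

    The default thresholds are illustrative assumptions, not tuned values.
    """
    if not 0.0 <= p_ai_gen <= 1.0:
        raise ValueError("p_ai_gen must be a probability in [0, 1]")
    if not low < high:
        raise ValueError("low must be smaller than high")
    if p_ai_gen >= high:
        return "ai_gen"
    if p_ai_gen <= low:
        return "human"
    # Anything between the two thresholds goes to manual review.
    return "needs_review"


for p in (0.97, 0.55, 0.03):
    print(p, "->", triage(p))
```

Widening the gap between `low` and `high` sends more images to manual review and should reduce both false positives and false negatives among the automated decisions, at the cost of more human effort.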

## How to Get Started with the Model  

You can use this model with the 🤗 Transformers library:  

```python
from transformers import AutoModelForImageClassification, AutoImageProcessor
from PIL import Image
import torch

# Load the model and its image processor
# (AutoImageProcessor replaces the deprecated AutoFeatureExtractor)
model_name = "ideepankarsharma2003/AI_ImageClassification_MidjourneyV6_SDXL"
model = AutoModelForImageClassification.from_pretrained(model_name)
image_processor = AutoImageProcessor.from_pretrained(model_name)

# Load the image and force 3-channel RGB (handles grayscale, RGBA, and palette images)
image = Image.open("path_to_image.jpg").convert("RGB")
inputs = image_processor(images=image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits
predicted_label = logits.argmax(-1).item()

# Label mapping
id2label = {0: "ai_gen", 1: "human"}
print("Predicted label:", id2label[predicted_label])
```
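
The snippet above reports only the arg-max label. If a confidence score is also wanted, the two logits can be passed through a softmax. The following is a minimal, dependency-free sketch of that step. The helper name `softmax` and the example logits are made up for illustration, and the label order follows the `id2label` mapping stated above. With PyTorch, `torch.softmax(logits, dim=-1)` performs the same computation directly on the tensor.

```python
import math

# Label order as given in the mapping above.
id2label = {0: "ai_gen", 1: "human"}


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one image; real values would come from the
# model's output, e.g. logits[0].tolist()
example_logits = [2.0, -1.0]
probs = softmax(example_logits)

for idx, p in enumerate(probs):
    print(f"{id2label[idx]}: {p:.3f}")
```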

## Training Details  

### Training Data  

The model was trained on the following datasets:  

- [ImageClassificationStableDiffusion_small](https://huggingface.co/datasets/ideepankarsharma2003/ImageClassificationStableDiffusion_small)  
- [Midjourney_v6_Classification_small_shuffled](https://huggingface.co/datasets/ideepankarsharma2003/Midjourney_v6_Classification_small_shuffled)  
- [AIGeneratedImages_Midjourney](https://huggingface.co/datasets/ideepankarsharma2003/AIGeneratedImages_Midjourney)  

### Training Procedure  

- **Image Size:** 224x224  
- **Patch Size:** 4  
- **Embedding Dimension:** 128  
- **Stages (layers):** 4  
- **Attention Heads per Stage:** [4, 8, 16, 32]  
- **Dropout Rates:**  
  - Attention: 0.0  
  - Hidden: 0.0  
  - Drop Path: 0.1  
- **Activation Function:** GELU  
- **Optimizer:** AdamW  
- **Learning Rate Scheduler:** Cosine Annealing  
- **Precision:** float32  
- **Training Steps:** 3414  
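
As a worked check of the architecture figures above, the sketch below derives the token count and channel width at each of the four stages implied by the four-entry head list. It assumes the standard Swin layout, which this card does not state explicitly: patch merging between stages halves the spatial resolution and doubles the channel width. The variable names are illustrative.

```python
image_size = 224             # from the list above
patch_size = 4               # from the list above
embed_dim = 128              # from the list above
num_heads = [4, 8, 16, 32]   # from the list above

# Assumption: standard Swin patch merging (resolution / 2, channels * 2 per stage).
grid = image_size // patch_size  # patches along one side of the image
stages = []
for stage, heads in enumerate(num_heads):
    side = grid // (2 ** stage)
    dim = embed_dim * (2 ** stage)
    stages.append(
        {"stage": stage + 1, "tokens": side * side, "dim": dim, "head_dim": dim // heads}
    )

for s in stages:
    print(s)
```

Under these assumptions the first stage sees a 56x56 grid of patch tokens, and every stage works out to 32 channels per attention head.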

## Evaluation  

### Testing Data, Factors & Metrics  

#### Testing Data  

The model was evaluated on a held-out validation split of the training datasets.  

#### Metrics  

- **Accuracy**  
- **Precision & Recall**  
- **F1 Score**  
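
For reference, the metrics listed above can be computed from paired true and predicted labels as sketched below. This is a generic illustration, not the author's evaluation script. Treating `ai_gen` as the positive class is an assumption, and the toy labels are invented solely to exercise the function.

```python
def binary_metrics(y_true, y_pred, positive="ai_gen"):
    """Accuracy, precision, recall and F1 with respect to one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels, invented for illustration only (not results from this model).
y_true = ["ai_gen", "ai_gen", "human", "human", "ai_gen", "human"]
y_pred = ["ai_gen", "human", "human", "ai_gen", "ai_gen", "human"]
print(binary_metrics(y_true, y_pred))
```

With `ai_gen` as the positive class, a false positive is a human-created image flagged as AI-generated, and a false negative is an AI-generated image that goes undetected.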

### Summary  

The model distinguishes between AI-generated and human-created images on its validation data, but its performance may degrade on out-of-distribution examples and may reflect biases in the training datasets.  

## Citation  

If you use this model, please cite:  

```bibtex
@misc{ai_image_classification,
  author = {Deepankar Sharma},
  title = {AI Image Classification - Midjourney V6 \& SDXL},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ideepankarsharma2003/AI_ImageClassification_MidjourneyV6_SDXL}}
}
```

## Model Card Authors  

- **Author:** Deepankar Sharma  

---