---
license: unknown
language:
- en
metrics:
- accuracy
- precision
- f1
- recall
tags:
- art
base_model: google/vit-base-patch16-224
datasets:
- DataScienceProject/Art_Images_Ai_And_Real_
pipeline_tag: image-classification
library_name: transformers
---
### Model Card
This model classifies images as either 'real' or 'fake (AI-generated)' using a Vision Transformer (ViT).
Our goal is to classify the source of an image with at least 85% accuracy and at least 80% recall.
### Model Description
This model leverages the Vision Transformer (ViT) architecture, which applies self-attention mechanisms to process images.
The model classifies images into two categories: 'real' and 'fake (AI-generated)'.
It captures intricate patterns and features that help distinguish between the two categories without the need for Convolutional Neural Networks (CNNs).
### Direct Use
This model can be used to classify images as 'real art' or 'fake art' based on visual features learned by the Vision Transformer.
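A minimal inference sketch using the transformers pipeline; the repository ID below is a placeholder, not the published name:

```python
from transformers import pipeline
from PIL import Image

# Placeholder repo ID; replace with the actual fine-tuned checkpoint.
classifier = pipeline("image-classification", model="your-username/art-real-vs-ai")

image = Image.open("example.jpg").convert("RGB")
print(classifier(image))
# e.g. [{'label': 'real', 'score': 0.97}, {'label': 'fake', 'score': 0.03}]
```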
### Out-of-Scope Use
The model may not perform well on images outside the scope of art or where the visual characteristics are drastically different from those in the training dataset.
### Recommendations
Run the training code on a PC with an NVIDIA GPU better than an RTX 3060 and at least a 6-core CPU, or use Google Colab.
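A quick sketch for picking the training device, falling back to the CPU when no NVIDIA GPU is present:

```python
import torch

# Use the GPU when available (e.g. RTX 3060 or better), otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Training on: {device}")
```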
## How to Get Started with the Model
Prepare data: organize your images into one folder per class and run the training code, as sketched below.
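One common layout is a sub-folder per class, which torchvision can read directly; the folder names below are assumptions, not the dataset's exact structure:

```python
from torchvision import datasets, transforms

# Assumed layout:
#   data/
#     real/  img_0001.jpg, img_0002.jpg, ...
#     fake/  img_0001.jpg, img_0002.jpg, ...
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # ViT-base/16 expects 224x224 inputs
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("data", transform=transform)
print(dataset.classes)  # e.g. ['fake', 'real']
```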
## Model Architecture
The model fine-tunes google/vit-base-patch16-224 with a two-class classification head.

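A sketch of loading the base checkpoint with a two-class head; the label names are assumptions based on the description above:

```python
from transformers import ViTForImageClassification, ViTImageProcessor

model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    num_labels=2,
    id2label={0: "real", 1: "fake"},
    label2id={"real": 0, "fake": 1},
    ignore_mismatched_sizes=True,  # the base checkpoint ships a 1000-class head
)
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
```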
## Training Details
- Dataset: DataScienceProject/Art_Images_Ai_And_Real_
- Preprocessing: images are resized, converted to RGB, transformed into tensors, and stored in a custom PyTorch dataset (see the sketch below).
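A sketch of such a dataset class; names and arguments are illustrative, not the project's exact code:

```python
from PIL import Image
import torch
from torch.utils.data import Dataset
from torchvision import transforms

class ArtImageDataset(Dataset):
    """Resizes each image, converts it to RGB, and returns a (tensor, label) pair."""

    def __init__(self, image_paths, labels, size=224):
        self.image_paths = image_paths
        self.labels = labels
        self.transform = transforms.Compose([
            transforms.Resize((size, size)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        return self.transform(image), torch.tensor(self.labels[idx])
```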
#### Training Hyperparameters
- Optimizer: Adam, learning rate 0.001 (`optim.Adam(model.parameters(), lr=0.001)`)
- Epochs: 10 (`num_epochs = 10`)
- Loss: cross-entropy (`nn.CrossEntropyLoss()`); see the sketch below
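A sketch of the training loop using these settings, assuming `model`, `train_loader`, and `device` are defined as in the earlier sketches:

```python
import torch.nn as nn
import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
num_epochs = 10

model.to(device)
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images).logits  # ViTForImageClassification returns an output object with .logits
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"epoch {epoch + 1}: loss = {running_loss / len(train_loader):.4f}")
```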
## Evaluation
The model takes 15-20 minutes to run on our dataset with the following PC hardware: CPU: Intel i9-13900, RAM: 32 GB, GPU: RTX 3080.
Your mileage may vary.
### Testing Data, Factors & Metrics
- precision
- recall
- f1
- confusion matrix
- accuracy (computation sketched below)
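A sketch of computing these metrics with scikit-learn, assuming `y_true` and `y_pred` hold the test labels and predictions:

```python
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    confusion_matrix,
)

# y_true / y_pred: 0/1 lists gathered from the test loader (assumed to exist).
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
```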
### Results
- test accuracy = 0.92
- precision = 0.893
- recall = 0.957
- f1 = 0.924

#### Summary
This model performed by far the best of the approaches we tried (CNN, ResNet, CNN + ELA).