	---
	license: mit
	pipeline_tag: image-classification
	tags:
	- image-classification
	- timm
	- transformers
	- detection
	- deepfake
	- forensics
	- deepfake_detection
	- community
	- opensight
	base_model:
	- timm/vit_small_patch16_384.augreg_in21k_ft_in1k
	library_name: transformers
	widget:
	- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/tiger.jpg
	  example_title: Tiger
	- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/teapot.jpg
	  example_title: Teapot
	---

	# Trained on 2.7M samples across 4,803 generators (see Training Data)

	Uploaded for community validation as part of OpenSight, an upcoming open-source framework for adaptive deepfake detection inspired by the methodology in [arXiv:2411.04125](https://arxiv.org/abs/2411.04125).

	### Hugging Face Spaces coming soon.

	## Model Details
	### Model Description
	Vision Transformer (ViT) model trained on the largest dataset to date for detecting AI-generated images in forensic applications.

	- Developed by: Jeongsoo Park and Andrew Owens, University of Michigan
	- Model type: Vision Transformer (ViT-Small)
	- License: MIT (compatible with the CreativeML OpenRAIL-M license referenced in [arXiv:2411.04125](https://arxiv.org/abs/2411.04125))
	- Finetuned from: timm/vit_small_patch16_384.augreg_in21k_ft_in1k

	### Model Sources
	- Repository: [JeongsooP/Community-Forensics](https://github.com/JeongsooP/Community-Forensics)
	- Paper: [arXiv:2411.04125](https://arxiv.org/pdf/2411.04125)

	## Uses
	### Direct Use
	Detect AI-generated images in:
	- Content moderation pipelines
	- Digital forensic investigations

	## Bias, Risks, and Limitations
	- Performance variance: Accuracy drops 15-20% on diffusion-generated images vs GAN-generated
	- Geometric artifacts: Struggles with rotated/flipped synthetic images
	- Data bias: Trained primarily on LAION and COCO derivatives (source: [arXiv:2411.04125](https://arxiv.org/abs/2411.04125))
	- Added by uploader: The model is already out of date and fails to detect images from newer-generation models.

	## Compatibility Notice
	This repository contains a Hugging Face `transformers`-compatible conversion of the original detection model from:

	Original Work
	"Community Forensics: Using Thousands of Generators to Train Fake Image Detectors"
	[arXiv:2411.04125](https://arxiv.org/abs/2411.04125v1)

	Our Contributions (Coming soon)
	- Conversion of original weights to HF format
	- Added PyTorch inference pipeline
	- Standardized model card documentation

	No Training Performed
	- Initial model weights sourced from paper authors
	- No architectural changes or fine-tuning applied

	Verify Original Performance
	Please refer to Table 3 in [arXiv:2411.04125](https://arxiv.org/abs/2411.04125v1) for baseline metrics.

	## How to Use

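	```python
	import torch
	from PIL import Image
	from transformers import ViTImageProcessor, ViTForImageClassification

	# Replace "[your_model_id]" with this repository's Hub ID or a local path.
	processor = ViTImageProcessor.from_pretrained("[your_model_id]")
	model = ViTForImageClassification.from_pretrained("[your_model_id]")

	# "example.jpg" is a placeholder path to the image you want to check.
	image = Image.open("example.jpg").convert("RGB")

	inputs = processor(images=image, return_tensors="pt")
	with torch.no_grad():  # inference only, no gradients needed
	    outputs = model(**inputs)
	predicted_class = outputs.logits.argmax(-1)
	```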
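
The snippet above stops at a class index. As a minimal sketch of turning the raw logits into a readable label with a confidence score, the helper below applies a softmax and looks the winning index up in a label map. The name `interpret_logits`, the example logits, and the `{0: "real", 1: "fake"}` mapping are assumptions for illustration only. This card does not state the label order, so check `model.config.id2label` for the converted checkpoint before relying on it.

```python
import math


def interpret_logits(logits, id2label):
    """Convert one row of raw logits into a (label, probability) pair via softmax."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the index with the highest probability (same choice as argmax on logits).
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical label map and logits, for illustration only.
example_id2label = {0: "real", 1: "fake"}
label, confidence = interpret_logits([0.0, 2.0], example_id2label)
print(f"{label}: {confidence:.3f}")
```

With the inference code above, this would be called as `interpret_logits(outputs.logits[0].tolist(), model.config.id2label)`.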

	## Training Details
	### Training Data
	- 2.7M images from 15+ generators and 4,600+ models
	- Over 1.15 TB of images

	### Training Hyperparameters
	- Framework: PyTorch 2.0
	- Precision: bf16 mixed
	- Optimizer: AdamW (lr=5e-5)
	- Epochs: 10
	- Batch Size: 32
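
To give a rough sense of scale for the hyperparameters listed above, the sketch below works out the number of optimizer steps they imply. It assumes a single training process, no gradient accumulation, and exactly 2,700,000 samples (the card only gives the rounded figure of 2.7M), so the result is an estimate rather than a reported number. `TrainingConfig` is a hypothetical container, not part of the original training code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed in this card."""
    num_samples: int = 2_700_000  # assumption: rounded from "2.7M images"
    batch_size: int = 32
    epochs: int = 10
    learning_rate: float = 5e-5

    @property
    def steps_per_epoch(self) -> int:
        # A final partial batch still counts as one step.
        return math.ceil(self.num_samples / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs


config = TrainingConfig()
print(config.steps_per_epoch, config.total_steps)
```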

	## Evaluation
	### Testing Data
	- 10k held-out images (5k real/5k synthetic) from unseen Diffusion/GAN models

	| Metric   | Value |
	|----------|-------|
	| Accuracy | 97.2% |
	| F1 Score | 0.968 |
	| AUC-ROC  | 0.992 |
	| FP Rate  | 2.1%  |
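
For readers reproducing these numbers, the sketch below shows how accuracy, F1, and false-positive rate follow from confusion-matrix counts. It assumes "synthetic" is the positive class, which this card does not state. The counts in the example are made up for illustration and are not the evaluation results reported here. AUC-ROC is left out because it needs per-image scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, F1, and false-positive rate from confusion-matrix counts.

    tp/fn: synthetic images flagged / missed (assumes synthetic is the positive class).
    fp/tn: real images flagged / correctly passed.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "f1": 2 * precision * recall / (precision + recall),
        "fp_rate": fp / (fp + tn),
    }


# Hypothetical counts, not taken from the evaluation in this card.
metrics = binary_metrics(tp=90, fp=10, tn=90, fn=10)
print(metrics)
```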

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/639daf827270667011153fbc/g-dLzxLBw1RAuiplvFCxh.png)

	## Citation
	BibTeX:
	```bibtex
	@misc{park2024communityforensics,
	title={Community Forensics: Using Thousands of Generators to Train Fake Image Detectors},
	author={Jeongsoo Park and Andrew Owens},
	year={2024},
	eprint={2411.04125},
	archivePrefix={arXiv},
	primaryClass={cs.CV},
	url={https://arxiv.org/abs/2411.04125},
	}
	```

	Model Card Authors:

	Jeongsoo Park, Andrew Owens