---
title: MCP Toolkit - Deepfake Detection & Forensics
description: MCP Server for Deepfake Detection & Digital Forensics Tools
emoji: 🚑
colorFrom: yellow
colorTo: yellow
sdk: gradio
sdk_version: 5.33.0
app_file: app_test.py
pinned: true
models:
- aiwithoutborders-xyz/OpenSight-CommunityForensics-Deepfake-ViT
- Heem2/AI-vs-Real-Image-Detection
- haywoodsloan/ai-image-detector-deploy
- cmckinle/sdxl-flux-detector
- Organika/sdxl-detector
license: mit
---
# Functions Available for LLM Calls via MCP

This document outlines the functions available for programmatic invocation by LLMs through the MCP (Model Context Protocol) server, as defined in `mcp-deepfake-forensics/app_mcp.py`.
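For quick, manual exploration of these endpoints outside an MCP client, the Space can also be queried through `gradio_client`. The Space ID in the sketch below is a placeholder, not the actual deployment name.

```python
# Hypothetical exploration of the Space's API surface via gradio_client.
# "your-org/mcp-deepfake-forensics" is a placeholder Space ID.
from gradio_client import Client

client = Client("your-org/mcp-deepfake-forensics")
client.view_api()  # prints the callable endpoints (e.g. /predict, /tool_ela) and their parameters
```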
## 1. `predict_with_ensemble`

### Description

This function processes an uploaded image to predict whether it is AI-generated or real, utilizing an ensemble of deepfake detection models and advanced forensic analysis techniques. It also incorporates intelligent agents for context inference, weight management, and anomaly detection.

### API Names

- `predict`
- `augment_then_predict` (triggers image augmentation before prediction)
### Parameters

- `img` (PIL Image): The input image to be analyzed. This can be uploaded by the user or captured via webcam.
- `confidence_threshold` (float): A value between 0.0 and 1.0 (default: 0.7) that determines the confidence level required for a model to label an image as "AI" or "REAL". If neither score meets this threshold, the label will be "UNCERTAIN".
- `augment_methods` (list of str): A list of augmentation methods to apply to the image before prediction. Possible values include "rotate", "add_noise", and "sharpen". If empty, no augmentation is applied.
- `rotate_degrees` (float): The maximum degree by which to rotate the image (default: 0), if "rotate" is included in `augment_methods`.
- `noise_level` (float): The level of noise to add to the image (default: 0), if "add_noise" is included in `augment_methods`.
- `sharpen_strength` (float): The strength of the sharpening effect to apply (default: 0), if "sharpen" is included in `augment_methods`.
### Returns

- `img_pil` (PIL Image): The processed image (original or augmented).
- `cleaned_forensics_images` (list of PIL Image): Images generated by various forensic analysis techniques (ELA, gradient, minmax, bit plane), including:
  - Original augmented image
  - ELA analysis (multiple passes)
  - Gradient processing (multiple variations)
  - MinMax processing (multiple variations)
  - Bit plane extraction
- `table_rows` (list of lists): The model predictions, suitable for display in a Gradio Dataframe. Each inner list contains: Model Name, Contributor, AI Score, Real Score, and Label.
- `json_results` (str): A JSON string containing the raw model prediction results for debugging purposes.
- `consensus_html` (str): An HTML string representing the final consensus label ("AI", "REAL", or "UNCERTAIN"), styled with color.
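As one way to invoke this endpoint programmatically (an MCP client would call the same underlying tool), a `gradio_client` call might look like the sketch below; the Space ID is a placeholder.

```python
# Hedged example call to the `predict` endpoint; the Space ID is a placeholder.
from gradio_client import Client, handle_file

client = Client("your-org/mcp-deepfake-forensics")
img_pil, forensics_gallery, table_rows, json_results, consensus_html = client.predict(
    img=handle_file("suspect_image.jpg"),
    confidence_threshold=0.7,
    augment_methods=[],      # empty list: no augmentation before prediction
    rotate_degrees=0,
    noise_level=0,
    sharpen_strength=0,
    api_name="/predict",
)
print(consensus_html)        # final consensus label ("AI", "REAL", or "UNCERTAIN")
```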
## 2. `wavelet_blocking_noise_estimation`

### Description

Analyzes image noise patterns using wavelet decomposition. This tool helps detect compression artifacts and artificial noise patterns that may indicate image manipulation. Higher noise levels in specific regions can reveal areas of potential tampering.

### API Name

`tool_waveletnoise`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `block_size` (int): The size of the blocks for wavelet analysis (default: 8, range: 1-32).

### Returns

- `output_image` (PIL Image): An image visualizing the noise patterns.
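The sketch below illustrates the general idea of block-wise noise estimation with a wavelet transform (here via PyWavelets and the standard MAD estimator); it is not the Space's exact implementation.

```python
# Simplified sketch of block-wise wavelet noise estimation (not the Space's exact code).
import numpy as np
import pywt
from PIL import Image

def estimate_noise_map(img: Image.Image, block_size: int = 8) -> np.ndarray:
    gray = np.asarray(img.convert("L"), dtype=np.float64)
    h, w = gray.shape
    noise_map = np.zeros((h // block_size, w // block_size))
    for i in range(noise_map.shape[0]):
        for j in range(noise_map.shape[1]):
            block = gray[i * block_size:(i + 1) * block_size,
                         j * block_size:(j + 1) * block_size]
            # Diagonal detail coefficients of a single-level 2D DWT
            _, (_, _, cD) = pywt.dwt2(block, "haar")
            # Robust noise sigma estimate: median absolute deviation / 0.6745
            noise_map[i, j] = np.median(np.abs(cD)) / 0.6745
    return noise_map
```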
## 3. `bit_plane_extractor`

### Description

Extracts and visualizes individual bit planes from different color channels. This forensic tool helps identify hidden patterns and artifacts in image data that may indicate manipulation. Different bit planes can reveal inconsistencies in image processing or editing.

### API Name

`tool_bitplane`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `channel` (str): The color channel to extract the bit plane from. Possible values: "Luminance", "Red", "Green", "Blue", "RGB Norm" (default: "Luminance").
- `bit_plane` (int): The bit plane index to extract (0-7, default: 0).
- `filter_type` (str): A filter to apply to the extracted bit plane. Possible values: "Disabled", "Median", "Gaussian" (default: "Disabled").

### Returns

- `output_image` (PIL Image): An image visualizing the extracted bit plane.
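As a minimal illustration of the underlying operation (luminance channel only, no post-filtering), a bit plane can be isolated with a shift and mask; this is not the Space's full implementation.

```python
# Minimal bit-plane extraction sketch (luminance channel only; no filtering).
import numpy as np
from PIL import Image

def extract_bit_plane(img: Image.Image, bit_plane: int = 0) -> Image.Image:
    lum = np.asarray(img.convert("L"), dtype=np.uint8)
    plane = (lum >> bit_plane) & 1          # isolate the requested bit
    return Image.fromarray((plane * 255).astype(np.uint8))
```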
## 4. `ELA`

### Description

Performs Error Level Analysis (ELA) to detect re-saved JPEG images, which can indicate tampering. ELA highlights areas of an image that have different compression levels.

### API Name

`tool_ela`

### Parameters

- `img` (PIL Image): The input image to analyze.
- `quality` (int): JPEG compression quality (1-100, default: 75).
- `scale` (int): Output multiplicative gain (1-100, default: 50).
- `contrast` (int): Output tonality compression (0-100, default: 20).
- `linear` (bool): Whether to use linear difference (default: False).
- `grayscale` (bool): Whether to output a grayscale image (default: False).

### Returns

- `processed_ela_image` (PIL Image): The processed ELA image.
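A minimal sketch of the core ELA idea is shown below: re-save the image as JPEG at a chosen quality and amplify the per-pixel difference. The `gain` parameter is a simplification and does not reproduce the tool's `scale`/`contrast`/`linear` options.

```python
# Simplified Error Level Analysis sketch (not the Space's exact implementation).
import io
from PIL import Image, ImageChops

def simple_ela(img: Image.Image, quality: int = 75, gain: int = 10) -> Image.Image:
    # Re-save as JPEG at the chosen quality, then reopen the compressed copy.
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    resaved = Image.open(buf)
    # Amplify the per-pixel difference so recompression artifacts become visible.
    diff = ImageChops.difference(img.convert("RGB"), resaved)
    return diff.point(lambda px: min(255, px * gain))
```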
## 5. `gradient_processing`

### Description

Applies gradient filters to an image to enhance edges and transitions, which can reveal inconsistencies due to manipulation.

### API Name

`tool_gradient_processing`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `intensity` (int): Intensity of the gradient effect (0-100, default: 90).
- `blue_mode` (str): Mode for the blue channel. Possible values: "Abs", "None", "Flat", "Norm" (default: "Abs").
- `invert` (bool): Whether to invert the gradients (default: False).
- `equalize` (bool): Whether to equalize the histogram (default: False).

### Returns

- `gradient_image` (PIL Image): The image with gradient processing applied.
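The sketch below shows a basic gradient-magnitude visualization; the actual tool's intensity, blue-channel mode, inversion, and equalization options are not reproduced here.

```python
# Rough gradient-magnitude sketch (not the Space's gradient_processing implementation).
import numpy as np
from PIL import Image

def gradient_magnitude(img: Image.Image) -> Image.Image:
    gray = np.asarray(img.convert("L"), dtype=np.float64)
    gy, gx = np.gradient(gray)               # per-axis first differences
    mag = np.hypot(gx, gy)
    mag = 255 * mag / (mag.max() + 1e-9)     # normalize to 0-255
    return Image.fromarray(mag.astype(np.uint8))
```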
## 6. `minmax_process`

### Description

Analyzes local pixel value deviations to detect subtle changes in image data, often indicative of digital forgeries.

### API Name

`tool_minmax_processing`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `channel` (int): The color channel to process. Possible values: 0 (Grayscale), 1 (Blue), 2 (Green), 3 (Red), 4 (RGB Norm) (default: 4).
- `radius` (int): The radius for local pixel analysis (0-10, default: 2).

### Returns

- `minmax_image` (PIL Image): The image with minmax processing applied.
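As a rough illustration of the idea (not the Space's `minmax_process` code), the sketch below marks pixels that coincide with a local minimum or maximum within a given radius, using SciPy's rank filters.

```python
# Simplified local min/max sketch (grayscale only; not the Space's implementation).
import numpy as np
from PIL import Image
from scipy.ndimage import maximum_filter, minimum_filter

def minmax_map(img: Image.Image, radius: int = 2) -> Image.Image:
    gray = np.asarray(img.convert("L"))
    size = 2 * radius + 1
    local_min = minimum_filter(gray, size=size)
    local_max = maximum_filter(gray, size=size)
    # Highlight pixels that sit exactly at a local extremum of their neighborhood.
    is_extremum = (gray == local_min) | (gray == local_max)
    return Image.fromarray(is_extremum.astype(np.uint8) * 255)
```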
## Behind the Scenes: Image Prediction Flow
When you upload an image for analysis and click the "Predict" or "Augment & Predict" button, the following steps occur:
### 1. Image Pre-processing and Agent Initialization

- Image Conversion: The input image is first ensured to be a PIL (Pillow) Image object. If it is a NumPy array, it is converted.
- Agent Setup: Several intelligent agents are initialized to assist in the process:
  - `EnsembleMonitorAgent`: Monitors the performance of individual models.
  - `ModelWeightManager`: Manages and adjusts the weights of different models.
  - `WeightOptimizationAgent`: Optimizes model weights based on performance.
  - `SystemHealthAgent`: Monitors the system's resource usage (e.g., memory, GPU).
  - `ContextualIntelligenceAgent`: Infers context tags from the image to aid in weight adjustment.
  - `ForensicAnomalyDetectionAgent`: Analyzes forensic outputs for signs of manipulation.
- System Health Monitoring: The `SystemHealthAgent` performs an initial check of system resources.
- Image Augmentation (Optional): If you select augmentation methods (rotate, add noise, sharpen), the image is augmented accordingly; otherwise, the original image is used. A sketch of this pre-processing step follows this list.
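A hedged sketch of the pre-processing step is shown below; the helper and its augmentation details are illustrative rather than the app's actual code.

```python
# Illustrative pre-processing: ensure a PIL image, then optionally augment.
# The helper name and augmentation details are assumptions, not the app's functions.
import numpy as np
from PIL import Image, ImageEnhance

def preprocess(img, augment_methods=None, rotate_degrees=0, noise_level=0, sharpen_strength=0):
    if isinstance(img, np.ndarray):                  # NumPy array -> PIL Image
        img = Image.fromarray(img)
    augment_methods = augment_methods or []
    if "rotate" in augment_methods:
        img = img.rotate(rotate_degrees, expand=True)
    if "add_noise" in augment_methods:
        arr = np.asarray(img).astype(np.float64)
        arr += np.random.normal(0, noise_level, arr.shape)   # additive Gaussian noise
        img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    if "sharpen" in augment_methods:
        img = ImageEnhance.Sharpness(img).enhance(1 + sharpen_strength)
    return img
```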
### 2. Initial Model Predictions

- Individual Model Inference: The augmented (or original) image is passed through each of the registered deepfake detection models (`model_1` through `model_7`); a sketch of this loop follows this list.
- Performance Monitoring: For each model, the `EnsembleMonitorAgent` tracks its prediction label, confidence score, and inference time.
- Result Collection: The raw prediction results (AI Score, Real Score, predicted Label) from each model are stored.
### 3. Smart Agent Processing and Weighted Consensus

- Contextual Intelligence: The `ContextualIntelligenceAgent` analyzes the image's metadata (width, height, mode) and the raw model predictions to infer relevant context tags (e.g., "generated by Midjourney", "likely real photo"). This helps in making more informed decisions about model reliability.
- Dynamic Weight Adjustment: The `ModelWeightManager` adjusts the influence (weight) of each individual model's prediction, taking into account the initial model predictions, their confidence scores, and the detected context tags.
- Weighted Consensus Calculation: A final prediction label ("AI", "REAL", or "UNCERTAIN") is determined by combining the individual model predictions using their adjusted weights. Models with higher confidence and relevance to the detected context contribute more to the final decision (see the sketch after this list).
- Performance Analysis (for Optimization): The `WeightOptimizationAgent` analyzes the final consensus label to continually improve the weight adjustment strategy for future predictions.
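A minimal sketch of how a weighted consensus over per-model scores could be computed is shown below; the app's actual weighting and thresholding logic may differ.

```python
# Hedged sketch of a weighted consensus over per-model AI/Real scores.
def weighted_consensus(results, weights, confidence_threshold=0.7):
    total = sum(weights.values())
    ai_score = sum(weights[m] * results[m]["AI Score"] for m in results) / total
    real_score = sum(weights[m] * results[m]["Real Score"] for m in results) / total
    if ai_score >= confidence_threshold:
        return "AI"
    if real_score >= confidence_threshold:
        return "REAL"
    return "UNCERTAIN"
```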
### 4. Forensic Processing

- Multiple Forensic Techniques: The original image is subjected to various forensic analysis techniques to reveal hidden artifacts that might indicate manipulation (see the snippet after this list):
  - Gradient Processing: Highlights edges and transitions in the image.
  - MinMax Processing: Reveals deviations in local pixel values.
  - ELA (Error Level Analysis): Performed in multiple passes (grayscale and color, with varying contrast) to detect areas of different compression levels, which can suggest tampering.
- Forensic Anomaly Detection: The `ForensicAnomalyDetectionAgent` analyzes the outputs of these forensic tools and their descriptions to identify potential anomalies or inconsistencies that could indicate image manipulation.
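Tying the earlier sketches together, a forensic pass collection could look like the snippet below (using the illustrative `simple_ela`, `gradient_magnitude`, and `minmax_map` helpers defined above, not the Space's actual pipeline).

```python
# Illustrative collection of forensic passes, reusing the sketch helpers above.
def forensic_passes(img):
    return [
        simple_ela(img, quality=75),       # compression-level differences
        gradient_magnitude(img),           # edge/transition emphasis
        minmax_map(img, radius=2),         # local extremum map
    ]
```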
### 5. Data Logging and Output Generation

- Inference Data Logging: All relevant data from the current prediction (the original image, inference parameters, individual model predictions, ensemble output, forensic images, and agent monitoring data) is logged to a Hugging Face dataset for continuous improvement and analysis (see the sketch at the end of this document).
- Output Preparation: The results are formatted for display in the Gradio interface:
  - The processed image (augmented or original) is prepared.
  - The forensic analysis images are collected for display in a gallery.
  - A table summarizing each model's prediction (Model, Contributor, AI Score, Real Score, Label) is generated.
  - The raw JSON output of model results is prepared for debugging.
  - The final consensus label is prepared with appropriate styling.
- Data Type Conversion: Numerical values (such as AI Score and Real Score) are converted to standard Python floats to ensure proper JSON serialization.
Finally, all these prepared outputs are returned to the Gradio interface for you to view.
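As an illustration of the logging step, one record could be pushed to a dataset repository as in the sketch below; the repository ID, file layout, and record schema are placeholders, not the app's actual logging code.

```python
# Hedged sketch of logging one inference record to a Hugging Face dataset repo;
# the repo ID, filename, and record schema are placeholders.
import json, time, uuid
from huggingface_hub import HfApi

def log_inference(record: dict, repo_id: str = "your-org/forensics-inference-logs"):
    # Cast NumPy scalars to plain Python floats so the record serializes cleanly.
    clean = {k: (float(v) if hasattr(v, "item") else v) for k, v in record.items()}
    path = f"/tmp/{uuid.uuid4().hex}.json"
    with open(path, "w") as f:
        json.dump({"timestamp": time.time(), **clean}, f)
    HfApi().upload_file(
        path_or_fileobj=path,
        path_in_repo=f"logs/{uuid.uuid4().hex}.json",
        repo_id=repo_id,
        repo_type="dataset",
    )
```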