---
title: MCP Toolkit - Deepfake Detection & Forensics
description: MCP Server for Deepfake Detection & Digital Forensics Tools
emoji: 🚑
colorFrom: yellow
colorTo: yellow
sdk: gradio
sdk_version: 5.33.0
app_file: app_test.py
pinned: true
models:
- aiwithoutborders-xyz/OpenSight-CommunityForensics-Deepfake-ViT
- Heem2/AI-vs-Real-Image-Detection
- haywoodsloan/ai-image-detector-deploy
- cmckinle/sdxl-flux-detector
- Organika/sdxl-detector
license: mit
---
# Functions Available for LLM Calls via MCP

This document outlines the functions available for programmatic invocation by LLMs through the MCP (Model Context Protocol) server, as defined in `mcp-deepfake-forensics/app_mcp.py`.
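For quick, manual exploration of these endpoints outside an MCP client, the Space can also be queried through `gradio_client`. The Space ID in the sketch below is a placeholder, not the actual deployment name.

```python
# Hypothetical exploration of the Space's API surface via gradio_client.
# "your-org/mcp-deepfake-forensics" is a placeholder Space ID.
from gradio_client import Client

client = Client("your-org/mcp-deepfake-forensics")
client.view_api()  # prints the callable endpoints (e.g. /predict, /tool_ela) and their parameters
```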
## 1. `predict_with_ensemble`

### Description

This function processes an uploaded image to predict whether it is AI-generated or real, utilizing an ensemble of deepfake detection models and advanced forensic analysis techniques. It also incorporates intelligent agents for context inference, weight management, and anomaly detection.

### API Names

- `predict`
- `augment_then_predict` (triggers image augmentation before prediction)
### Parameters

- `img` (PIL Image): The input image to be analyzed. This can be uploaded by the user or captured via webcam.
- `confidence_threshold` (float): A value between 0.0 and 1.0 (default: 0.7) that determines the confidence level required for a model to label an image as "AI" or "REAL". If neither score meets this threshold, the label will be "UNCERTAIN".
- `augment_methods` (list of str): A list of augmentation methods to apply to the image before prediction. Possible values include "rotate", "add_noise", and "sharpen". If empty, no augmentation is applied.
- `rotate_degrees` (float): The maximum degree by which to rotate the image (default: 0), if "rotate" is included in `augment_methods`.
- `noise_level` (float): The level of noise to add to the image (default: 0), if "add_noise" is included in `augment_methods`.
- `sharpen_strength` (float): The strength of the sharpening effect to apply (default: 0), if "sharpen" is included in `augment_methods`.
### Returns

- `img_pil` (PIL Image): The processed image (original or augmented).
- `cleaned_forensics_images` (list of PIL Image): Images generated by various forensic analysis techniques (ELA, gradient, minmax, bit plane), including:
  - Original augmented image
  - ELA analysis (multiple passes)
  - Gradient processing (multiple variations)
  - MinMax processing (multiple variations)
  - Bit plane extraction
- `table_rows` (list of lists): The model predictions, suitable for display in a Gradio Dataframe. Each inner list contains: Model Name, Contributor, AI Score, Real Score, and Label.
- `json_results` (str): A JSON string containing the raw model prediction results for debugging purposes.
- `consensus_html` (str): An HTML string representing the final consensus label ("AI", "REAL", or "UNCERTAIN"), styled with color.
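As one way to invoke this endpoint programmatically (an MCP client would call the same underlying tool), a `gradio_client` call might look like the sketch below; the Space ID is a placeholder.

```python
# Hedged example call to the `predict` endpoint; the Space ID is a placeholder.
from gradio_client import Client, handle_file

client = Client("your-org/mcp-deepfake-forensics")
img_pil, forensics_gallery, table_rows, json_results, consensus_html = client.predict(
    img=handle_file("suspect_image.jpg"),
    confidence_threshold=0.7,
    augment_methods=[],      # empty list: no augmentation before prediction
    rotate_degrees=0,
    noise_level=0,
    sharpen_strength=0,
    api_name="/predict",
)
print(consensus_html)        # final consensus label ("AI", "REAL", or "UNCERTAIN")
```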
## 2. `wavelet_blocking_noise_estimation`

### Description

Analyzes image noise patterns using wavelet decomposition. This tool helps detect compression artifacts and artificial noise patterns that may indicate image manipulation. Higher noise levels in specific regions can reveal areas of potential tampering.

### API Name

`tool_waveletnoise`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `block_size` (int): The size of the blocks for wavelet analysis (default: 8, range: 1-32).

### Returns

- `output_image` (PIL Image): An image visualizing the noise patterns.
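The sketch below illustrates the general idea of block-wise noise estimation with a wavelet transform (here via PyWavelets and the standard MAD estimator); it is not the Space's exact implementation.

```python
# Simplified sketch of block-wise wavelet noise estimation (not the Space's exact code).
import numpy as np
import pywt
from PIL import Image

def estimate_noise_map(img: Image.Image, block_size: int = 8) -> np.ndarray:
    gray = np.asarray(img.convert("L"), dtype=np.float64)
    h, w = gray.shape
    noise_map = np.zeros((h // block_size, w // block_size))
    for i in range(noise_map.shape[0]):
        for j in range(noise_map.shape[1]):
            block = gray[i * block_size:(i + 1) * block_size,
                         j * block_size:(j + 1) * block_size]
            # Diagonal detail coefficients of a single-level 2D DWT
            _, (_, _, cD) = pywt.dwt2(block, "haar")
            # Robust noise sigma estimate: median absolute deviation / 0.6745
            noise_map[i, j] = np.median(np.abs(cD)) / 0.6745
    return noise_map
```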
## 3. `bit_plane_extractor`

### Description

Extracts and visualizes individual bit planes from different color channels. This forensic tool helps identify hidden patterns and artifacts in image data that may indicate manipulation. Different bit planes can reveal inconsistencies in image processing or editing.

### API Name

`tool_bitplane`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `channel` (str): The color channel to extract the bit plane from. Possible values: "Luminance", "Red", "Green", "Blue", "RGB Norm" (default: "Luminance").
- `bit_plane` (int): The bit plane index to extract (0-7, default: 0).
- `filter_type` (str): A filter to apply to the extracted bit plane. Possible values: "Disabled", "Median", "Gaussian" (default: "Disabled").

### Returns

- `output_image` (PIL Image): An image visualizing the extracted bit plane.
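As a minimal illustration of the underlying operation (luminance channel only, no post-filtering), a bit plane can be isolated with a shift and mask; this is not the Space's full implementation.

```python
# Minimal bit-plane extraction sketch (luminance channel only; no filtering).
import numpy as np
from PIL import Image

def extract_bit_plane(img: Image.Image, bit_plane: int = 0) -> Image.Image:
    lum = np.asarray(img.convert("L"), dtype=np.uint8)
    plane = (lum >> bit_plane) & 1          # isolate the requested bit
    return Image.fromarray((plane * 255).astype(np.uint8))
```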
## 4. `ELA`

### Description

Performs Error Level Analysis (ELA) to detect re-saved JPEG images, which can indicate tampering. ELA highlights areas of an image that have different compression levels.

### API Name

`tool_ela`

### Parameters

- `img` (PIL Image): The input image to analyze.
- `quality` (int): JPEG compression quality (1-100, default: 75).
- `scale` (int): Output multiplicative gain (1-100, default: 50).
- `contrast` (int): Output tonality compression (0-100, default: 20).
- `linear` (bool): Whether to use linear difference (default: False).
- `grayscale` (bool): Whether to output a grayscale image (default: False).

### Returns

- `processed_ela_image` (PIL Image): The processed ELA image.
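A minimal sketch of the core ELA idea is shown below: re-save the image as JPEG at a chosen quality and amplify the per-pixel difference. The `gain` parameter is a simplification and does not reproduce the tool's `scale`/`contrast`/`linear` options.

```python
# Simplified Error Level Analysis sketch (not the Space's exact implementation).
import io
from PIL import Image, ImageChops

def simple_ela(img: Image.Image, quality: int = 75, gain: int = 10) -> Image.Image:
    # Re-save as JPEG at the chosen quality, then reopen the compressed copy.
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    resaved = Image.open(buf)
    # Amplify the per-pixel difference so recompression artifacts become visible.
    diff = ImageChops.difference(img.convert("RGB"), resaved)
    return diff.point(lambda px: min(255, px * gain))
```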
## 5. `gradient_processing`

### Description

Applies gradient filters to an image to enhance edges and transitions, which can reveal inconsistencies due to manipulation.

### API Name

`tool_gradient_processing`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `intensity` (int): Intensity of the gradient effect (0-100, default: 90).
- `blue_mode` (str): Mode for the blue channel. Possible values: "Abs", "None", "Flat", "Norm" (default: "Abs").
- `invert` (bool): Whether to invert the gradients (default: False).
- `equalize` (bool): Whether to equalize the histogram (default: False).

### Returns

- `gradient_image` (PIL Image): The image with gradient processing applied.
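The sketch below shows a basic gradient-magnitude visualization; the actual tool's intensity, blue-channel mode, inversion, and equalization options are not reproduced here.

```python
# Rough gradient-magnitude sketch (not the Space's gradient_processing implementation).
import numpy as np
from PIL import Image

def gradient_magnitude(img: Image.Image) -> Image.Image:
    gray = np.asarray(img.convert("L"), dtype=np.float64)
    gy, gx = np.gradient(gray)               # per-axis first differences
    mag = np.hypot(gx, gy)
    mag = 255 * mag / (mag.max() + 1e-9)     # normalize to 0-255
    return Image.fromarray(mag.astype(np.uint8))
```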
## 6. `minmax_process`

### Description

Analyzes local pixel value deviations to detect subtle changes in image data, often indicative of digital forgeries.

### API Name

`tool_minmax_processing`

### Parameters

- `image` (PIL Image): The input image to analyze.
- `channel` (int): The color channel to process. Possible values: 0 (Grayscale), 1 (Blue), 2 (Green), 3 (Red), 4 (RGB Norm) (default: 4).
- `radius` (int): The radius for local pixel analysis (0-10, default: 2).

### Returns

- `minmax_image` (PIL Image): The image with minmax processing applied.
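As a rough illustration of the idea (not the Space's `minmax_process` code), the sketch below marks pixels that coincide with a local minimum or maximum within a given radius, using SciPy's rank filters.

```python
# Simplified local min/max sketch (grayscale only; not the Space's implementation).
import numpy as np
from PIL import Image
from scipy.ndimage import maximum_filter, minimum_filter

def minmax_map(img: Image.Image, radius: int = 2) -> Image.Image:
    gray = np.asarray(img.convert("L"))
    size = 2 * radius + 1
    local_min = minimum_filter(gray, size=size)
    local_max = maximum_filter(gray, size=size)
    # Highlight pixels that sit exactly at a local extremum of their neighborhood.
    is_extremum = (gray == local_min) | (gray == local_max)
    return Image.fromarray(is_extremum.astype(np.uint8) * 255)
```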
## Behind the Scenes: Image Prediction Flow
When you upload an image for analysis and click the "Predict" or "Augment & Predict" button, the following steps occur:
### 1. Image Pre-processing and Agent Initialization

- Image Conversion: The input image is first ensured to be a PIL (Pillow) Image object. If it is a NumPy array, it is converted.
- Agent Setup: Several intelligent agents are initialized to assist in the process:
  - `EnsembleMonitorAgent`: Monitors the performance of individual models.
  - `ModelWeightManager`: Manages and adjusts the weights of different models.
  - `WeightOptimizationAgent`: Optimizes model weights based on performance.
  - `SystemHealthAgent`: Monitors the system's resource usage (e.g., memory, GPU).
  - `ContextualIntelligenceAgent`: Infers context tags from the image to aid in weight adjustment.
  - `ForensicAnomalyDetectionAgent`: Analyzes forensic outputs for signs of manipulation.
- System Health Monitoring: The `SystemHealthAgent` performs an initial check of system resources.
- Image Augmentation (Optional): If you select augmentation methods (rotate, add noise, sharpen), the image is augmented accordingly; otherwise, the original image is used. A sketch of this pre-processing step follows this list.
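A hedged sketch of the pre-processing step is shown below; the helper and its augmentation details are illustrative rather than the app's actual code.

```python
# Illustrative pre-processing: ensure a PIL image, then optionally augment.
# The helper name and augmentation details are assumptions, not the app's functions.
import numpy as np
from PIL import Image, ImageEnhance

def preprocess(img, augment_methods=None, rotate_degrees=0, noise_level=0, sharpen_strength=0):
    if isinstance(img, np.ndarray):                  # NumPy array -> PIL Image
        img = Image.fromarray(img)
    augment_methods = augment_methods or []
    if "rotate" in augment_methods:
        img = img.rotate(rotate_degrees, expand=True)
    if "add_noise" in augment_methods:
        arr = np.asarray(img).astype(np.float64)
        arr += np.random.normal(0, noise_level, arr.shape)   # additive Gaussian noise
        img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    if "sharpen" in augment_methods:
        img = ImageEnhance.Sharpness(img).enhance(1 + sharpen_strength)
    return img
```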
### 2. Initial Model Predictions

- Individual Model Inference: The augmented (or original) image is passed through each of the registered deepfake detection models (`model_1` through `model_7`); a sketch of this loop follows this list.
- Performance Monitoring: For each model, the `EnsembleMonitorAgent` tracks its prediction label, confidence score, and inference time.
- Result Collection: The raw prediction results (AI Score, Real Score, predicted Label) from each model are stored.
### 3. Smart Agent Processing and Weighted Consensus

- Contextual Intelligence: The `ContextualIntelligenceAgent` analyzes the image's metadata (width, height, mode) and the raw model predictions to infer relevant context tags (e.g., "generated by Midjourney", "likely real photo"). This helps in making more informed decisions about model reliability.
- Dynamic Weight Adjustment: The `ModelWeightManager` adjusts the influence (weight) of each individual model's prediction, taking into account the initial model predictions, their confidence scores, and the detected context tags.
- Weighted Consensus Calculation: A final prediction label ("AI", "REAL", or "UNCERTAIN") is determined by combining the individual model predictions using their adjusted weights. Models with higher confidence and relevance to the detected context contribute more to the final decision (see the sketch after this list).
- Performance Analysis (for Optimization): The `WeightOptimizationAgent` analyzes the final consensus label to continually improve the weight adjustment strategy for future predictions.
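A minimal sketch of how a weighted consensus over per-model scores could be computed is shown below; the app's actual weighting and thresholding logic may differ.

```python
# Hedged sketch of a weighted consensus over per-model AI/Real scores.
def weighted_consensus(results, weights, confidence_threshold=0.7):
    total = sum(weights.values())
    ai_score = sum(weights[m] * results[m]["AI Score"] for m in results) / total
    real_score = sum(weights[m] * results[m]["Real Score"] for m in results) / total
    if ai_score >= confidence_threshold:
        return "AI"
    if real_score >= confidence_threshold:
        return "REAL"
    return "UNCERTAIN"
```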
### 4. Forensic Processing

- Multiple Forensic Techniques: The original image is subjected to various forensic analysis techniques to reveal hidden artifacts that might indicate manipulation (see the snippet after this list):
  - Gradient Processing: Highlights edges and transitions in the image.
  - MinMax Processing: Reveals deviations in local pixel values.
  - ELA (Error Level Analysis): Performed in multiple passes (grayscale and color, with varying contrast) to detect areas of different compression levels, which can suggest tampering.
- Forensic Anomaly Detection: The `ForensicAnomalyDetectionAgent` analyzes the outputs of these forensic tools and their descriptions to identify potential anomalies or inconsistencies that could indicate image manipulation.
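Tying the earlier sketches together, a forensic pass collection could look like the snippet below (using the illustrative `simple_ela`, `gradient_magnitude`, and `minmax_map` helpers defined above, not the Space's actual pipeline).

```python
# Illustrative collection of forensic passes, reusing the sketch helpers above.
def forensic_passes(img):
    return [
        simple_ela(img, quality=75),       # compression-level differences
        gradient_magnitude(img),           # edge/transition emphasis
        minmax_map(img, radius=2),         # local extremum map
    ]
```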
### 5. Data Logging and Output Generation

- Inference Data Logging: All relevant data from the current prediction (the original image, inference parameters, individual model predictions, ensemble output, forensic images, and agent monitoring data) is logged to a Hugging Face dataset for continuous improvement and analysis (see the sketch at the end of this document).
- Output Preparation: The results are formatted for display in the Gradio interface:
  - The processed image (augmented or original) is prepared.
  - The forensic analysis images are collected for display in a gallery.
  - A table summarizing each model's prediction (Model, Contributor, AI Score, Real Score, Label) is generated.
  - The raw JSON output of model results is prepared for debugging.
  - The final consensus label is prepared with appropriate styling.
- Data Type Conversion: Numerical values (such as AI Score and Real Score) are converted to standard Python floats to ensure proper JSON serialization.
Finally, all these prepared outputs are returned to the Gradio interface for you to view.
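As an illustration of the logging step, one record could be pushed to a dataset repository as in the sketch below; the repository ID, file layout, and record schema are placeholders, not the app's actual logging code.

```python
# Hedged sketch of logging one inference record to a Hugging Face dataset repo;
# the repo ID, filename, and record schema are placeholders.
import json, time, uuid
from huggingface_hub import HfApi

def log_inference(record: dict, repo_id: str = "your-org/forensics-inference-logs"):
    # Cast NumPy scalars to plain Python floats so the record serializes cleanly.
    clean = {k: (float(v) if hasattr(v, "item") else v) for k, v in record.items()}
    path = f"/tmp/{uuid.uuid4().hex}.json"
    with open(path, "w") as f:
        json.dump({"timestamp": time.time(), **clean}, f)
    HfApi().upload_file(
        path_or_fileobj=path,
        path_in_repo=f"logs/{uuid.uuid4().hex}.json",
        repo_id=repo_id,
        repo_type="dataset",
    )
```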