title: Phramer AI
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.2
app_file: app.py
pinned: false
license: apache-2.0
tags:
- multimodal
- image-to-prompt
- flux
- midjourney
- generative-ai
- computer-vision
- cinematic
- photography
- bagel
- pariente-ai
Phramer AI
By Pariente AI, for MIA TV Series
Logline: Phramer AI is a multimodal tool that reads an image and turns it into a refined, photorealistic prompt ready for Midjourney, Flux, or any other generative engine.
Overview
Phramer AI is an advanced multimodal system developed by Pariente AI for the MIA TV Series creative pipeline.
Upload any image, and Phramer AI will:
- Analyze it deeply using a custom Bagel architecture
- Generate a detailed semantic-visual description
- Enhance it using a curated photographic knowledge base
- Output a structured prompt with camera settings, composition hints, mood, and style, ready for Flux or other diffusion-based platforms
Whether you're creating cinematic storyboards, photorealistic scenes, or exploring visual concepts, Phramer AI bridges the gap between image understanding and generative prompting.
Key Features
Deep Multimodal Analysis
- Custom Bagel-7B architecture for advanced image understanding
- Semantic-visual analysis with professional photography insights
- Context-aware scene detection and composition analysis
🎯 Multi-Engine Optimization
- Flux-ready prompts with technical specifications
- Midjourney compatibility with style and mood descriptors
- Universal format compatible with major generative engines
📸 Professional Photography Knowledge
- Curated database of camera settings and equipment
- Lighting techniques and composition principles
- Technical parameters optimized for photorealistic output
🎬 Cinematic Focus
- Designed for TV series and film production workflows
- Storyboard and concept art optimization
- Dramatic lighting and mood analysis
How It Works
- Image Upload - Supports JPG, PNG, and WebP images up to 1024px
- Bagel Analysis - Custom architecture analyzes visual content and composition
- Knowledge Enhancement - Professional photography database enriches the analysis
- Prompt Generation - Structured output with technical details and artistic direction
- Multi-Engine Ready - Copy and use in Flux, Midjourney, or any diffusion platform
Technical Specifications
Architecture
- Base Model: Custom Bagel-7B multimodal architecture
- Vision Processing: Advanced semantic-visual understanding
- Knowledge Integration: Professional photography database built on 30+ years of expertise
- Output Optimization: Multi-engine compatibility layer
Processing Pipeline
- Image Preprocessing: Automatic optimization and format conversion
- Multimodal Analysis: Deep scene understanding with technical assessment
- Professional Enhancement: Camera, lighting, and composition recommendations
- Prompt Structuring: Organized output with technical and artistic elements
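The preprocessing step above is described only as "automatic optimization", so the following is a minimal sketch of one plausible part of it: computing target dimensions that fit the documented 1024px limit while preserving aspect ratio. The function name `fit_within` and the choice to downscale rather than reject oversized images are assumptions, not confirmed behavior of Phramer AI.

```python
from typing import Tuple

MAX_SIDE = 1024  # limit stated in this README; how it is enforced is an assumption


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Return dimensions scaled down so the longer side is at most max_side.

    Images already within the limit are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to the nearest pixel, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000))  # (1024, 768)
print(fit_within(800, 600))    # (800, 600)
```

The resulting size could then be passed to whatever image library performs the actual resize.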
Supported Platforms
- Flux - Primary optimization target with technical specifications
- Midjourney - Style and mood descriptors
- Stable Diffusion - Technical parameter integration
- Other Engines - Universal prompt format compatibility
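To illustrate what a multi-engine compatibility layer might look like, here is a hedged sketch that selects which prompt fields to emit per engine, loosely following the platform list above (technical specifications for Flux, style and mood for Midjourney). The mapping `ENGINE_FIELDS`, the field names, and `format_for_engine` are all hypothetical; the README does not describe the real layer.

```python
from typing import Dict, List

# Hypothetical mapping of engine name to the prompt fields it emphasizes.
# This is an illustration, not Phramer AI's actual compatibility layer.
ENGINE_FIELDS: Dict[str, List[str]] = {
    "flux": ["subject", "camera", "lighting", "style"],
    "midjourney": ["subject", "style", "mood"],
    "stable-diffusion": ["subject", "camera", "style"],
}
DEFAULT_FIELDS = ["subject", "style"]  # fallback for any other engine


def format_for_engine(parts: Dict[str, str], engine: str) -> str:
    """Join the fields relevant to `engine`, skipping any that are missing."""
    fields = ENGINE_FIELDS.get(engine.lower(), DEFAULT_FIELDS)
    return ", ".join(parts[name] for name in fields if parts.get(name))


parts = {
    "subject": "a cinematic portrait",
    "camera": "shot on Canon EOS R5 with 85mm f/1.4 lens",
    "lighting": "dramatic side lighting",
    "style": "photorealistic",
    "mood": "moody atmosphere",
}
print(format_for_engine(parts, "Midjourney"))
# a cinematic portrait, photorealistic, moody atmosphere
```

Keeping the per-engine differences in a single table makes it easy to add another engine without touching the formatting logic.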
Use Cases
🎬 Film & TV Production
- Storyboard creation and visualization
- Concept art development
- Scene planning and mood reference
- Visual consistency across episodes
📸 Photography Reference
- Lighting setup recreation
- Camera configuration guidance
- Composition analysis and improvement
- Technical parameter optimization
🎨 Creative Development
- Visual concept exploration
- Style reference generation
- Mood and atmosphere studies
- Character and environment design
💼 Commercial Applications
- Product visualization
- Marketing material creation
- Brand consistency maintenance
- Commercial photography planning
Example Workflow
Input: Portrait photograph of a person in dramatic lighting
Phramer AI Analysis:
├── Scene Detection: Studio portrait with dramatic side lighting
├── Technical Analysis: Professional setup with controlled lighting
├── Camera Recommendation: Canon EOS R5 with 85mm f/1.4 lens
└── Enhancement: Cinematic mood with film-quality specifications
Output Prompt:
"A cinematic portrait of [subject description], shot on Canon EOS R5
with 85mm f/1.4 lens at f/2.8, dramatic side lighting with subtle rim
light, professional studio setup, film grain, photorealistic,
ultra-detailed, commercial photography style"
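The camera phrase in the example prompt combines a body, a lens, and a shooting aperture. As a rough sketch of how such a phrase could be generated and sanity-checked, the snippet below uses a hypothetical `CameraSpec` class (not part of Phramer AI's documented API). The one rule it enforces is optical: the f-number used for the shot (f/2.8 here) cannot be smaller than the lens's maximum-aperture f-number (f/1.4).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraSpec:
    """Hypothetical container for the camera fields in the example prompt."""
    body: str
    focal_length_mm: int
    max_aperture: float   # widest f-number the lens supports, e.g. 1.4
    shot_aperture: float  # f-number used for the shot, e.g. 2.8

    def __post_init__(self) -> None:
        # A lens cannot open wider than its maximum aperture, so the
        # f-number used for the shot must be >= the lens's f-number.
        if self.shot_aperture < self.max_aperture:
            raise ValueError("shot aperture is wider than the lens allows")

    def phrase(self) -> str:
        """Render the camera portion of a prompt."""
        return (
            f"shot on {self.body} with {self.focal_length_mm}mm "
            f"f/{self.max_aperture:g} lens at f/{self.shot_aperture:g}"
        )


spec = CameraSpec("Canon EOS R5", 85, 1.4, 2.8)
print(spec.phrase())
# shot on Canon EOS R5 with 85mm f/1.4 lens at f/2.8
```

Validating at construction time keeps physically impossible combinations, such as an f/1.8 lens shot at f/1.4, out of the generated prompt.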
Quality Scoring
Phramer AI evaluates generated prompts across multiple dimensions:
- Prompt Quality (25%) - Content detail and description accuracy
- Technical Details (25%) - Camera settings and equipment specifications
- Professional Photography (25%) - Lighting, composition, and technical expertise
- Multi-Engine Optimization (25%) - Compatibility and enhancement features
Scores range from 0 to 100, with grades from POOR to LEGENDARY.
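The four equally weighted dimensions can be combined as a simple weighted sum. The sketch below assumes each dimension is itself scored from 0 to 100. Only the POOR and LEGENDARY grades are named in this README, so the intermediate grade names and every cutoff in `GRADE_BANDS` are hypothetical placeholders, as are the function and key names.

```python
from typing import Dict

# Equal 25% weights, as listed above. Key names are illustrative.
WEIGHTS: Dict[str, float] = {
    "prompt_quality": 0.25,
    "technical_details": 0.25,
    "professional_photography": 0.25,
    "multi_engine_optimization": 0.25,
}

# Only POOR and LEGENDARY come from the README; the intermediate
# grades and all cutoffs below are hypothetical placeholders.
GRADE_BANDS = [(90, "LEGENDARY"), (70, "GOOD"), (50, "FAIR"), (0, "POOR")]


def overall_score(scores: Dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, each expected in 0-100."""
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def grade(score: float) -> str:
    """Map a 0-100 score to the first band whose cutoff it reaches."""
    for cutoff, label in GRADE_BANDS:
        if score >= cutoff:
            return label
    return "POOR"


example = {
    "prompt_quality": 80,
    "technical_details": 90,
    "professional_photography": 70,
    "multi_engine_optimization": 60,
}
total = overall_score(example)
print(total, grade(total))  # 75.0 GOOD
```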
Installation & Usage
Requirements
- Python 3.8+
- CUDA-compatible GPU (recommended)
- 8GB+ RAM
- Internet connection for model access
Local Setup
git clone [repository-url]
cd phramer-ai
pip install -r requirements.txt
python app.py
Cloud Usage
Available on Hugging Face Spaces with instant access - no installation required.
API Integration
Phramer AI provides a simple API for integration into existing workflows:
from phramer import PhramerAI
phramer = PhramerAI()
prompt, metadata = phramer.analyze_image("path/to/image.jpg")
print(f"Generated prompt: {prompt}")
Performance
- Average Processing Time: 2-4 seconds per image
- Supported Image Size: Up to 1024x1024 pixels
- Batch Processing: Multiple images with queue management
- Memory Optimization: Automatic cleanup and resource management
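The README mentions batch processing with queue management but does not show how it works. A minimal sketch of the general idea, using only the standard library, might look like the following. The helper `analyze_batch` and the stand-in `fake_analyze` are hypothetical; the stand-in only mimics the `(prompt, metadata)` return shape shown in the API Integration section.

```python
import queue
import threading
from typing import Callable, Dict, Iterable, Tuple

Analyzer = Callable[[str], Tuple[str, dict]]


def analyze_batch(
    paths: Iterable[str], analyze: Analyzer, workers: int = 2
) -> Dict[str, object]:
    """Run `analyze` over `paths` using a work queue and worker threads.

    Returns a dict mapping each path to its (prompt, metadata) result, or to
    the exception raised for that path, so one bad image does not stop the batch.
    """
    jobs: "queue.Queue[str]" = queue.Queue()
    for path in paths:
        jobs.put(path)  # all jobs are queued before any worker starts

    results: Dict[str, object] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                path = jobs.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            try:
                outcome: object = analyze(path)
            except Exception as exc:  # record the failure and keep going
                outcome = exc
            with lock:
                results[path] = outcome

    threads = [threading.Thread(target=worker) for _ in range(max(1, workers))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fake_analyze(path: str) -> Tuple[str, dict]:
    """Stand-in analyzer so the sketch runs without the real model."""
    if not path.endswith((".jpg", ".png", ".webp")):
        raise ValueError(f"unsupported format: {path}")
    return f"prompt for {path}", {"source": path}


results = analyze_batch(["a.jpg", "b.png", "c.gif"], fake_analyze)
print(results["a.jpg"][0])  # prompt for a.jpg
```

Any callable with the same shape could be passed as `analyze`. Whether a GPU-bound model benefits from more than one worker thread depends on the deployment, so `workers` is left configurable.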
Roadmap
Version 2.1 (Coming Soon)
- Video frame analysis
- Batch processing improvements
- Additional engine-specific optimizations
- Enhanced cinematic analysis
Version 2.2 (Planned)
- Style transfer integration
- Custom knowledge base training
- API rate limiting and authentication
- Advanced composition analysis
Technical Details
Model Architecture
- Bagel-7B Base: Advanced vision-language model
- Custom Training: Optimized for prompt generation
- Knowledge Integration: Professional photography database
- Multi-Modal Processing: Image + text understanding
Optimization Features
- Memory Efficient: Automatic resource management
- GPU Acceleration: CUDA optimization when available
- Batch Processing: Multiple image support
- Error Handling: Robust fallback systems
Contributing
We welcome contributions to improve Phramer AI:
- Fork the repository
- Create a feature branch
- Submit a pull request with a detailed description
- Follow coding standards and include tests
License
Apache 2.0 - See LICENSE file for details.
Support
For technical support, feature requests, or collaboration inquiries:
- Technical Issues: Create an issue in the repository
- Feature Requests: Submit detailed proposals
- Commercial Licensing: Contact Pariente AI
- MIA TV Series Integration: Production team coordination
Credits
Phramer AI is developed by Pariente AI specifically for the MIA TV Series production pipeline.
Core Technologies
- Bagel-7B multimodal architecture
- Professional photography knowledge base
- Advanced prompt optimization algorithms
- Multi-engine compatibility layer
Research & Development
- Pariente AI - Advanced multimodal AI research
- MIA TV Series - Creative pipeline integration
- Professional Photography Consultants - Knowledge base drawing on 30+ years of expertise
- Community Contributors - Feature improvements and testing
Pariente AI • Advanced Multimodal AI Research & Development • MIA TV Series
Bridging the gap between image understanding and generative prompting