---
title: Phramer AI
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.2
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - multimodal
  - image-to-prompt
  - flux
  - midjourney
  - generative-ai
  - computer-vision
  - cinematic
  - photography
  - bagel
  - pariente-ai
---

# Phramer AI

*By Pariente AI, for MIA TV Series*

**Logline:** Phramer AI is a multimodal tool that reads an image and turns it into a refined, photorealistic prompt, ready for Midjourney, Flux, or any other generative engine.

## Overview

**Phramer AI** is an advanced multimodal system developed by **Pariente AI** for the **MIA TV Series** creative pipeline.

Upload any image, and Phramer AI will:

- **Analyze it deeply** using a custom Bagel architecture
- **Generate a detailed semantic-visual description**
- **Enhance it** using a curated photographic knowledge base
- **Output a structured prompt** with camera settings, composition hints, mood, and style, ready for **Flux** or other diffusion-based platforms

Whether you're creating cinematic storyboards, building photorealistic scenes, or exploring visual concepts, Phramer AI bridges the gap between image understanding and generative prompting.

## Key Features

### 🔍 **Deep Multimodal Analysis**

- Custom Bagel-7B architecture for advanced image understanding
- Semantic-visual analysis with professional photography insights
- Context-aware scene detection and composition analysis

### 🎯 **Multi-Engine Optimization**

- **Flux-ready prompts** with technical specifications
- **Midjourney compatibility** with style and mood descriptors
- **Universal format** compatible with major generative engines

### 📸 **Professional Photography Knowledge**

- Curated database of camera settings and equipment
- Lighting techniques and composition principles
- Technical parameters optimized for photorealistic output

### 🎬 **Cinematic Focus**

- Designed for TV series and film production workflows
- Storyboard and concept art optimization
- Dramatic lighting and mood analysis

## How It Works

1. **Image Upload** - Supports JPG, PNG, and WebP images up to 1024x1024 pixels
2. **Bagel Analysis** - The custom architecture analyzes visual content and composition
3. **Knowledge Enhancement** - The professional photography database enriches the analysis
4. **Prompt Generation** - Structured output combines technical details with artistic direction
5. **Multi-Engine Ready** - Copy the prompt and use it in Flux, Midjourney, or any diffusion platform
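
The upload step above can be illustrated with a minimal sketch of input validation and size planning. This is not the code Phramer AI runs: the function names `is_supported` and `fit_within` are hypothetical, and the sketch assumes the 1024-pixel limit applies to the longer side and that larger images are scaled down with their aspect ratio preserved.

```python
from pathlib import Path
from typing import Tuple

# Formats listed in "How It Works"; the extension spellings are an assumption.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension matches a listed format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def fit_within(width: int, height: int, max_side: int = 1024) -> Tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    Images that already fit are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Guard against a zero-sized side for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(is_supported("frame_0042.PNG"))
    print(fit_within(2048, 1024))
```

An actual resize would then pass the computed size to an imaging library; that step is omitted here to keep the sketch dependency-free.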

## Technical Specifications

### Architecture

- **Base Model**: Custom Bagel-7B multimodal architecture
- **Vision Processing**: Advanced semantic-visual understanding
- **Knowledge Integration**: Professional photography database built on 30+ years of expertise
- **Output Optimization**: Multi-engine compatibility layer

### Processing Pipeline

- **Image Preprocessing**: Automatic optimization and format conversion
- **Multimodal Analysis**: Deep scene understanding with technical assessment
- **Professional Enhancement**: Camera, lighting, and composition recommendations
- **Prompt Structuring**: Organized output with technical and artistic elements

### Supported Platforms

- **Flux** - Primary optimization target with technical specifications
- **Midjourney** - Style and mood descriptors
- **Stable Diffusion** - Technical parameter integration
- **Other Engines** - Universal prompt format compatibility

## Use Cases

### 🎬 **Film & TV Production**

- Storyboard creation and visualization
- Concept art development
- Scene planning and mood reference
- Visual consistency across episodes

### 📸 **Photography Reference**

- Lighting setup recreation
- Camera configuration guidance
- Composition analysis and improvement
- Technical parameter optimization

### 🎨 **Creative Development**

- Visual concept exploration
- Style reference generation
- Mood and atmosphere studies
- Character and environment design

### 💼 **Commercial Applications**

- Product visualization
- Marketing material creation
- Brand consistency maintenance
- Commercial photography planning

## Example Workflow

```
Input: Portrait photograph of a person in dramatic lighting

Phramer AI Analysis:
├── Scene Detection: Studio portrait with dramatic side lighting
├── Technical Analysis: Professional setup with controlled lighting
├── Camera Recommendation: Canon EOS R5 with 85mm f/1.4 lens
└── Enhancement: Cinematic mood with film-quality specifications

Output Prompt:
"A cinematic portrait of [subject description], shot on Canon EOS R5
with 85mm f/1.4 lens at f/2.8, dramatic side lighting with subtle rim
light, professional studio setup, film grain, photorealistic,
ultra-detailed, commercial photography style"
```
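
To show how analysis fields like those in the workflow above might be combined into one prompt string, here is a minimal sketch. The `PromptParts` dataclass and `build_prompt` function are hypothetical names introduced for illustration, not part of Phramer AI; the field layout simply mirrors the example output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptParts:
    """Hypothetical container for the pieces of a structured prompt."""
    shot_type: str   # e.g. "cinematic portrait"
    subject: str     # description of the main subject
    camera: str      # camera body
    lens: str        # lens description
    aperture: str    # shooting aperture
    lighting: str    # lighting description
    style_tags: List[str] = field(default_factory=list)


def build_prompt(parts: PromptParts) -> str:
    """Join the parts into a single comma-separated prompt string."""
    segments = [
        f"A {parts.shot_type} of {parts.subject}",
        f"shot on {parts.camera} with {parts.lens} lens at {parts.aperture}",
        parts.lighting,
        *parts.style_tags,
    ]
    # Skip empty segments so optional fields do not leave stray commas.
    return ", ".join(s.strip() for s in segments if s.strip())


if __name__ == "__main__":
    parts = PromptParts(
        shot_type="cinematic portrait",
        subject="a violinist",
        camera="Canon EOS R5",
        lens="85mm f/1.4",
        aperture="f/2.8",
        lighting="dramatic side lighting with subtle rim light",
        style_tags=["film grain", "photorealistic"],
    )
    print(build_prompt(parts))
```

Keeping the parts in a structured object, rather than concatenating strings directly, would make it easier to reorder or drop fields for a specific engine.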

## Quality Scoring

Phramer AI evaluates generated prompts across four equally weighted dimensions:

- **Prompt Quality** (25%) - Content detail and description accuracy
- **Technical Details** (25%) - Camera settings and equipment specifications
- **Professional Photography** (25%) - Lighting, composition, and technical expertise
- **Multi-Engine Optimization** (25%) - Compatibility and enhancement features

Scores range from 0 to 100, with grades from POOR to LEGENDARY.
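
The weighting above can be sketched as a simple weighted sum. This is an illustration under stated assumptions, not Phramer AI's scoring code: the dimension keys are hypothetical, and because only the POOR and LEGENDARY grades are named here, the intermediate grade labels and all thresholds below are placeholders.

```python
from typing import Dict

# The four dimensions and their 25% weights come from the list above;
# the key names are hypothetical.
WEIGHTS: Dict[str, float] = {
    "prompt_quality": 0.25,
    "technical_details": 0.25,
    "professional_photography": 0.25,
    "multi_engine_optimization": 0.25,
}

# ASSUMPTION: thresholds and the middle labels ("FAIR", "GOOD") are
# placeholders. Only POOR and LEGENDARY are named in this README.
GRADE_BANDS = [(90, "LEGENDARY"), (70, "GOOD"), (40, "FAIR"), (0, "POOR")]


def total_score(scores: Dict[str, float]) -> float:
    """Combine per-dimension scores (each 0-100) into one 0-100 score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def grade(score: float) -> str:
    """Map a 0-100 score to the first band whose threshold it meets."""
    for threshold, label in GRADE_BANDS:
        if score >= threshold:
            return label
    return GRADE_BANDS[-1][1]


if __name__ == "__main__":
    example = {
        "prompt_quality": 80,
        "technical_details": 90,
        "professional_photography": 70,
        "multi_engine_optimization": 60,
    }
    overall = total_score(example)
    print(overall, grade(overall))
```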

## Installation & Usage

### Requirements

- Python 3.8+
- CUDA-compatible GPU (recommended)
- 8GB+ RAM
- Internet connection for model access

### Local Setup

```bash
git clone [repository-url]
cd phramer-ai
pip install -r requirements.txt
python app.py
```

### Cloud Usage

Available on Hugging Face Spaces with instant access - no installation required.

## API Integration

Phramer AI provides a simple API for integration into existing workflows:

```python
from phramer import PhramerAI

# Create the analyzer, then run it on a single image file.
phramer = PhramerAI()
prompt, metadata = phramer.analyze_image("path/to/image.jpg")

print(f"Generated prompt: {prompt}")
```

## Performance

- **Average Processing Time**: 2-4 seconds per image
- **Supported Image Size**: Up to 1024x1024 pixels
- **Batch Processing**: Multiple images with queue management
- **Memory Optimization**: Automatic cleanup and resource management
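
As a rough illustration of queue-managed batch processing, the sketch below drains a FIFO queue of image paths and records failures without stopping the batch. It assumes nothing about Phramer AI's internals: `process_batch` is a hypothetical helper, and the `analyze` argument stands in for whatever callable produces a prompt from a path.

```python
import queue
from typing import Callable, Dict, Iterable, Tuple


def process_batch(
    paths: Iterable[str],
    analyze: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run `analyze` on each path in order.

    Returns (prompts, errors): prompts maps path -> generated prompt,
    errors maps path -> error message for images that failed.
    """
    pending: "queue.Queue[str]" = queue.Queue()
    for path in paths:
        pending.put(path)

    prompts: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    while not pending.empty():
        path = pending.get()
        try:
            prompts[path] = analyze(path)
        except Exception as exc:  # keep going if one image fails
            errors[path] = str(exc)
        finally:
            pending.task_done()
    return prompts, errors


if __name__ == "__main__":
    # Stand-in analyzer used only for this demonstration.
    def fake_analyze(path: str) -> str:
        if path.endswith(".gif"):
            raise ValueError("unsupported format")
        return f"prompt for {path}"

    done, failed = process_batch(["a.jpg", "b.gif", "c.webp"], fake_analyze)
    print(done)
    print(failed)
```

This version is single-threaded; a production queue would likely add worker threads or GPU-aware scheduling, which is beyond this sketch.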

## Roadmap

### Version 2.1 (Coming Soon)

- Video frame analysis
- Batch processing improvements
- Additional engine-specific optimizations
- Enhanced cinematic analysis

### Version 2.2 (Planned)

- Style transfer integration
- Custom knowledge base training
- API rate limiting and authentication
- Advanced composition analysis

## Technical Details

### Model Architecture

- **Bagel-7B Base**: Advanced vision-language model
- **Custom Training**: Optimized for prompt generation
- **Knowledge Integration**: Professional photography database
- **Multi-Modal Processing**: Image + text understanding

### Optimization Features

- **Memory Efficient**: Automatic resource management
- **GPU Acceleration**: CUDA optimization when available
- **Batch Processing**: Multiple image support
- **Error Handling**: Robust fallback systems

## Contributing

We welcome contributions to improve Phramer AI:

1. Fork the repository
2. Create a feature branch
3. Follow the coding standards and include tests
4. Submit a pull request with a detailed description

## License

Apache 2.0 - See the LICENSE file for details.

## Support

For technical support, feature requests, or collaboration inquiries:

- **Technical Issues**: Create an issue in the repository
- **Feature Requests**: Submit detailed proposals
- **Commercial Licensing**: Contact Pariente AI
- **MIA TV Series Integration**: Production team coordination

## Credits

**Phramer AI** is developed by **Pariente AI** specifically for the **MIA TV Series** production pipeline.

### Core Technologies

- Bagel-7B multimodal architecture
- Professional photography knowledge base
- Advanced prompt optimization algorithms
- Multi-engine compatibility layer

### Research & Development

- **Pariente AI** - Advanced multimodal AI research
- **MIA TV Series** - Creative pipeline integration
- **Professional Photography Consultants** - Database built on 30+ years of expertise
- **Community Contributors** - Feature improvements and testing

---

**Pariente AI** • Advanced Multimodal AI Research & Development • **MIA TV Series**

*Bridging the gap between image understanding and generative prompting* |