---
title: Phramer AI
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.2
app_file: app.py
pinned: false
license: apache-2.0
tags:
- multimodal
- image-to-prompt
- flux
- midjourney
- generative-ai
- computer-vision
- cinematic
- photography
- bagel
- pariente-ai
---
# Phramer AI
*By Pariente AI, for MIA TV Series*
**Logline:** Phramer AI is a multimodal tool that reads an image and turns it into a refined, photorealistic prompt, ready for Midjourney, Flux, or any generative engine.
## Overview
**Phramer AI** is an advanced multimodal system developed by **Pariente AI** for the **MIA TV Series** creative pipeline.
Upload any image, and Phramer AI will:
- **Analyze it deeply** using a custom Bagel architecture
- **Generate a detailed semantic-visual description**
- **Enhance it** using a curated photographic knowledge base
- **Output a structured prompt** with camera settings, composition hints, mood, and style — ready for **Flux** or other diffusion-based platforms
Whether you're creating cinematic storyboards, photorealistic scenes, or exploring visual concepts, Phramer AI bridges the gap between image understanding and generative prompting.
## Key Features
### 🔍 **Deep Multimodal Analysis**
- Custom Bagel-7B architecture for advanced image understanding
- Semantic-visual analysis with professional photography insights
- Context-aware scene detection and composition analysis
### 🎯 **Multi-Engine Optimization**
- **Flux-ready prompts** with technical specifications
- **Midjourney compatibility** with style and mood descriptors
- **Universal format** compatible with major generative engines
### 📸 **Professional Photography Knowledge**
- Curated database of camera settings and equipment
- Lighting techniques and composition principles
- Technical parameters optimized for photorealistic output
### 🎬 **Cinematic Focus**
- Designed for TV series and film production workflows
- Storyboard and concept art optimization
- Dramatic lighting and mood analysis
## How It Works
1. **Image Upload** - Supports JPG, PNG, and WebP images up to 1024x1024 pixels
2. **Bagel Analysis** - Custom architecture analyzes visual content and composition
3. **Knowledge Enhancement** - Professional photography database enriches the analysis
4. **Prompt Generation** - Structured output with technical details and artistic direction
5. **Multi-Engine Ready** - Copy and use in Flux, Midjourney, or any diffusion platform
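
The upload constraints in step 1 can be sketched as a small pre-check. This is a minimal illustration, not Phramer AI's actual preprocessing: it assumes the 1024-pixel limit applies to the longest side and that larger images are downscaled rather than rejected. The names `is_supported` and `fit_within` are hypothetical.

```python
from pathlib import Path
from typing import Tuple

# Formats listed in step 1; the exact extension spellings are an assumption.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_SIDE = 1024  # assumed to be the limit on the longest side, in pixels


def is_supported(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Scale (width, height) down so the longest side is at most max_side.

    The aspect ratio is preserved; images already within the limit are
    returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to the nearest pixel, but never below 1.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4000x3000 photo would be reduced to 1024x768 under these assumptions, while an 800x600 image would pass through untouched.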
## Technical Specifications
### Architecture
- **Base Model**: Custom Bagel-7B multimodal architecture
- **Vision Processing**: Advanced semantic-visual understanding
- **Knowledge Integration**: Professional photography database with 30+ years of expertise
- **Output Optimization**: Multi-engine compatibility layer
### Processing Pipeline
- **Image Preprocessing**: Automatic optimization and format conversion
- **Multimodal Analysis**: Deep scene understanding with technical assessment
- **Professional Enhancement**: Camera, lighting, and composition recommendations
- **Prompt Structuring**: Organized output with technical and artistic elements
### Supported Platforms
- **Flux** - Primary optimization target with technical specifications
- **Midjourney** - Style and mood descriptors
- **Stable Diffusion** - Technical parameter integration
- **Other Engines** - Universal prompt format compatibility
## Use Cases
### 🎬 **Film & TV Production**
- Storyboard creation and visualization
- Concept art development
- Scene planning and mood reference
- Visual consistency across episodes
### 📸 **Photography Reference**
- Lighting setup recreation
- Camera configuration guidance
- Composition analysis and improvement
- Technical parameter optimization
### 🎨 **Creative Development**
- Visual concept exploration
- Style reference generation
- Mood and atmosphere studies
- Character and environment design
### 💼 **Commercial Applications**
- Product visualization
- Marketing material creation
- Brand consistency maintenance
- Commercial photography planning
## Example Workflow
```
Input: Portrait photograph of a person in dramatic lighting
Phramer AI Analysis:
├── Scene Detection: Studio portrait with dramatic side lighting
├── Technical Analysis: Professional setup with controlled lighting
├── Camera Recommendation: Canon EOS R5 with 85mm f/1.4 lens
└── Enhancement: Cinematic mood with film-quality specifications
Output Prompt:
"A cinematic portrait of [subject description], shot on Canon EOS R5
with 85mm f/1.4 lens at f/2.8, dramatic side lighting with subtle rim
light, professional studio setup, film grain, photorealistic,
ultra-detailed, commercial photography style"
```
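
The output prompt above follows a recognizable pattern: subject, camera and lens, lighting, then style tags. The sketch below shows one way such a prompt could be assembled from structured fields. `PromptParts` and `build_prompt` are hypothetical names used for illustration and are not part of Phramer AI.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptParts:
    """Hypothetical container for the fields visible in the example workflow."""
    subject: str
    camera: str
    lens: str
    aperture: str
    lighting: str = ""
    style_tags: List[str] = field(default_factory=list)


def build_prompt(parts: PromptParts) -> str:
    """Join the parts into a single comma-separated prompt string."""
    segments = [
        # Opening phrase copied from the example output.
        f"A cinematic portrait of {parts.subject}",
        f"shot on {parts.camera} with {parts.lens} lens at {parts.aperture}",
        parts.lighting,
        *parts.style_tags,
    ]
    # Drop empty segments so a missing field does not leave a stray comma.
    return ", ".join(s.strip() for s in segments if s and s.strip())


# Values taken from the example workflow.
example_prompt = build_prompt(
    PromptParts(
        subject="[subject description]",
        camera="Canon EOS R5",
        lens="85mm f/1.4",
        aperture="f/2.8",
        lighting="dramatic side lighting with subtle rim light",
        style_tags=[
            "professional studio setup",
            "film grain",
            "photorealistic",
            "ultra-detailed",
            "commercial photography style",
        ],
    )
)
```

With these inputs, `example_prompt` reproduces the example output as a single line. Keeping the fields separate until the last step makes it straightforward to reorder or drop segments for a particular engine.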
## Quality Scoring
Phramer AI evaluates generated prompts across multiple dimensions:
- **Prompt Quality** (25%) - Content detail and description accuracy
- **Technical Details** (25%) - Camera settings and equipment specifications
- **Professional Photography** (25%) - Lighting, composition, and technical expertise
- **Multi-Engine Optimization** (25%) - Compatibility and enhancement features
Scores range from 0 to 100, with grades from POOR to LEGENDARY.
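
As a rough illustration of how four equally weighted dimensions combine into one score, consider the sketch below. The weights come from the list above. The grade cut-offs and the intermediate labels (`GOOD`, `FAIR`) are placeholders, since this README names only POOR and LEGENDARY; the function names are likewise hypothetical.

```python
from typing import Dict

# Weights from the list above: four dimensions at 25% each.
WEIGHTS: Dict[str, float] = {
    "prompt_quality": 0.25,
    "technical_details": 0.25,
    "professional_photography": 0.25,
    "multi_engine_optimization": 0.25,
}

# HYPOTHETICAL grade bands: only POOR and LEGENDARY appear in this README.
# The intermediate labels and every cut-off here are illustrative placeholders.
GRADE_BANDS = [
    (90, "LEGENDARY"),
    (70, "GOOD"),
    (50, "FAIR"),
    (0, "POOR"),
]


def overall_score(dimension_scores: Dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, each on a 0-100 scale."""
    missing = set(WEIGHTS) - set(dimension_scores)
    if missing:
        raise KeyError(f"missing dimensions: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= dimension_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


def grade(score: float) -> str:
    """Map a 0-100 score to the first band whose cut-off it reaches."""
    for cutoff, label in GRADE_BANDS:
        if score >= cutoff:
            return label
    return GRADE_BANDS[-1][1]
```

Under these assumptions, dimension scores of 80, 60, 100, and 40 give an overall score of 70.0, because each dimension contributes a quarter of its value.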
## Installation & Usage
### Requirements
- Python 3.8+
- CUDA-compatible GPU (recommended)
- 8GB+ RAM
- Internet connection for model access
### Local Setup
```bash
git clone [repository-url]
cd phramer-ai
pip install -r requirements.txt
python app.py
```
### Cloud Usage
Available on Hugging Face Spaces with instant access - no installation required.
## API Integration
Phramer AI provides a simple API for integration into existing workflows:
```python
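from phramer import PhramerAI

# Create the analyzer, then run it on a single image
phramer = PhramerAI()
# analyze_image returns the generated prompt and its metadata
prompt, metadata = phramer.analyze_image("path/to/image.jpg")
print(f"Generated prompt: {prompt}")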
```
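
When integrating into a larger workflow, it is often useful to process several images and collect failures instead of stopping at the first one. The sketch below assumes only what the snippet above shows, namely that the analysis call takes a path and returns a `(prompt, metadata)` pair. `analyze_many` and `fake_analyze` are hypothetical helpers, and the stand-in analyzer exists so the example runs without the `phramer` package installed.

```python
from typing import Callable, Dict, Iterable, Tuple

# Shape assumed from the snippet above: the call returns (prompt, metadata).
AnalyzeFn = Callable[[str], Tuple[str, dict]]


def analyze_many(
    paths: Iterable[str], analyze: AnalyzeFn
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run analyze on each path; return (prompts by path, errors by path)."""
    prompts: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    for path in paths:
        try:
            prompt, _metadata = analyze(path)
        except Exception as exc:
            # Record the failure and keep going so one bad image
            # does not stop the rest of the batch.
            errors[path] = f"{type(exc).__name__}: {exc}"
        else:
            prompts[path] = prompt
    return prompts, errors


def fake_analyze(path: str) -> Tuple[str, dict]:
    """Stand-in analyzer used only to make this sketch runnable."""
    if not path.endswith((".jpg", ".png", ".webp")):
        raise ValueError("unsupported format")
    return f"prompt for {path}", {"source": path}


prompts, errors = analyze_many(["a.jpg", "b.gif"], fake_analyze)
```

With the real package, `phramer.analyze_image` could be passed in place of `fake_analyze`, assuming it signals failure by raising an exception.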
## Performance
- **Average Processing Time**: 2-4 seconds per image
- **Supported Image Size**: Up to 1024x1024 pixels
- **Batch Processing**: Multiple images with queue management
- **Memory Optimization**: Automatic cleanup and resource management
## Roadmap
### Version 2.1 (Coming Soon)
- Video frame analysis
- Batch processing improvements
- Additional engine-specific optimizations
- Enhanced cinematic analysis
### Version 2.2 (Planned)
- Style transfer integration
- Custom knowledge base training
- API rate limiting and authentication
- Advanced composition analysis
## Technical Details
### Model Architecture
- **Bagel-7B Base**: Advanced vision-language model
- **Custom Training**: Optimized for prompt generation
- **Knowledge Integration**: Professional photography database
- **Multi-Modal Processing**: Image + text understanding
### Optimization Features
- **Memory Efficient**: Automatic resource management
- **GPU Acceleration**: CUDA optimization when available
- **Batch Processing**: Multiple image support
- **Error Handling**: Robust fallback systems
## Contributing
We welcome contributions to improve Phramer AI:
1. Fork the repository
2. Create a feature branch
3. Submit a pull request with a detailed description
4. Follow coding standards and include tests
## License
Apache 2.0 - See LICENSE file for details.
## Support
For technical support, feature requests, or collaboration inquiries:
- **Technical Issues**: Create an issue in the repository
- **Feature Requests**: Submit detailed proposals
- **Commercial Licensing**: Contact Pariente AI
- **MIA TV Series Integration**: Production team coordination
## Credits
**Phramer AI** is developed by **Pariente AI** specifically for the **MIA TV Series** production pipeline.
### Core Technologies
- Bagel-7B multimodal architecture
- Professional photography knowledge base
- Advanced prompt optimization algorithms
- Multi-engine compatibility layer
### Research & Development
- **Pariente AI** - Advanced multimodal AI research
- **MIA TV Series** - Creative pipeline integration
- **Professional Photography Consultants** - Knowledge base built on 30+ years of expertise
- **Community Contributors** - Feature improvements and testing
---
**Pariente AI** • Advanced Multimodal AI Research & Development • **MIA TV Series**
*Bridging the gap between image understanding and generative prompting*