metadata
title: Phramer AI
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.2
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - multimodal
  - image-to-prompt
  - flux
  - midjourney
  - generative-ai
  - computer-vision
  - cinematic
  - photography
  - bagel
  - pariente-ai

Phramer AI

By Pariente AI, for MIA TV Series

Logline: Phramer AI is a multimodal tool that reads an image and turns it into a refined, photorealistic prompt, ready for Midjourney, Flux, or any generative engine.

Overview

Phramer AI is an advanced multimodal system developed by Pariente AI for the MIA TV Series creative pipeline.

Upload any image, and Phramer AI will:

  • Analyze it deeply using a custom Bagel architecture
  • Generate a detailed semantic-visual description
  • Enhance it using a curated photographic knowledge base
  • Output a structured prompt with camera settings, composition hints, mood, and style, ready for Flux or other diffusion-based platforms

Whether you're creating cinematic storyboards, photorealistic scenes, or exploring visual concepts, Phramer AI bridges the gap between image understanding and generative prompting.

Key Features

πŸ” Deep Multimodal Analysis

  • Custom Bagel-7B architecture for advanced image understanding
  • Semantic-visual analysis with professional photography insights
  • Context-aware scene detection and composition analysis

🎯 Multi-Engine Optimization

  • Flux-ready prompts with technical specifications
  • Midjourney compatibility with style and mood descriptors
  • Universal format compatible with major generative engines

📸 Professional Photography Knowledge

  • Curated database of camera settings and equipment
  • Lighting techniques and composition principles
  • Technical parameters optimized for photorealistic output

🎬 Cinematic Focus

  • Designed for TV series and film production workflows
  • Storyboard and concept art optimization
  • Dramatic lighting and mood analysis

How It Works

  1. Image Upload - Supports JPG, PNG, and WebP images up to 1024x1024 pixels
  2. Bagel Analysis - Custom architecture analyzes visual content and composition
  3. Knowledge Enhancement - Professional photography database enriches the analysis
  4. Prompt Generation - Structured output with technical details and artistic direction
  5. Multi-Engine Ready - Copy and use in Flux, Midjourney, or any diffusion platform
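The five steps above can be read as a simple function pipeline. The sketch below is illustrative only: the function names (`analyze_with_bagel`, `enhance_with_knowledge`, `build_prompt`, `run_pipeline`), the `Analysis` fields, and the stubbed return values are all assumptions and are not taken from the Phramer AI codebase.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Analysis:
    """Hypothetical container for what each stage contributes."""
    description: str
    camera: str = ""
    lighting: str = ""
    tags: List[str] = field(default_factory=list)


def analyze_with_bagel(image_path: str) -> Analysis:
    # Stub for step 2: a real system would run model inference here.
    return Analysis(description=f"scene described from {image_path}")


def enhance_with_knowledge(analysis: Analysis) -> Analysis:
    # Stub for step 3: attach camera and lighting suggestions (placeholder values).
    analysis.camera = "85mm lens"
    analysis.lighting = "dramatic side lighting"
    analysis.tags.append("photorealistic")
    return analysis


def build_prompt(analysis: Analysis) -> str:
    # Step 4: join the non-empty parts into one comma-separated prompt string.
    parts = [analysis.description, analysis.camera, analysis.lighting, *analysis.tags]
    return ", ".join(p for p in parts if p)


def run_pipeline(image_path: str) -> str:
    # Steps 2 to 4 chained; step 5 is simply copying the returned string.
    return build_prompt(enhance_with_knowledge(analyze_with_bagel(image_path)))


print(run_pipeline("portrait.jpg"))
```

Keeping each stage as a separate function makes it straightforward to swap a stub for a real model call without touching the rest of the chain.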

Technical Specifications

Architecture

  • Base Model: Custom Bagel-7B multimodal architecture
  • Vision Processing: Advanced semantic-visual understanding
  • Knowledge Integration: Professional photography database drawing on 30+ years of expertise
  • Output Optimization: Multi-engine compatibility layer

Processing Pipeline

  • Image Preprocessing: Automatic optimization and format conversion
  • Multimodal Analysis: Deep scene understanding with technical assessment
  • Professional Enhancement: Camera, lighting, and composition recommendations
  • Prompt Structuring: Organized output with technical and artistic elements
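The preprocessing stage is not described in detail. As one plausible piece of it, the sketch below computes downscaled dimensions so that the longer side fits the 1024-pixel limit stated in this README while preserving aspect ratio. The helper name `fit_within` and its `max_side` parameter are assumptions made for illustration.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 1024) -> Tuple[int, int]:
    """Return dimensions scaled down so the longer side is at most max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    # Round to the nearest pixel, but keep each side at least 1 px.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(2048, 1024))
print(fit_within(800, 600))
```

The resulting size could then be handed to an image library's resize call, for example Pillow's `Image.resize`.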

Supported Platforms

  • Flux - Primary optimization target with technical specifications
  • Midjourney - Style and mood descriptors
  • Stable Diffusion - Technical parameter integration
  • Other Engines - Universal prompt format compatibility
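How the compatibility layer adapts a prompt per engine is not documented. A minimal sketch, assuming the only engine-specific step is appending Midjourney's `--ar` aspect-ratio parameter and that every other engine receives the plain text; the function name `format_for_engine` and its arguments are hypothetical.

```python
from typing import Optional


def format_for_engine(prompt: str, engine: str, aspect_ratio: Optional[str] = None) -> str:
    """Return the prompt text adjusted for a target engine (illustrative only)."""
    # Collapse line breaks and repeated spaces so the prompt pastes as one line.
    text = " ".join(prompt.split())
    if engine.lower() == "midjourney" and aspect_ratio:
        # Midjourney reads parameters appended after the prompt text.
        return f"{text} --ar {aspect_ratio}"
    # Assumption: Flux, Stable Diffusion, and others take the plain text.
    return text


print(format_for_engine("a cinematic\nportrait", "Midjourney", "16:9"))
```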

Use Cases

🎬 Film & TV Production

  • Storyboard creation and visualization
  • Concept art development
  • Scene planning and mood reference
  • Visual consistency across episodes

📸 Photography Reference

  • Lighting setup recreation
  • Camera configuration guidance
  • Composition analysis and improvement
  • Technical parameter optimization

🎨 Creative Development

  • Visual concept exploration
  • Style reference generation
  • Mood and atmosphere studies
  • Character and environment design

💼 Commercial Applications

  • Product visualization
  • Marketing material creation
  • Brand consistency maintenance
  • Commercial photography planning

Example Workflow

Input: Portrait photograph of a person in dramatic lighting

Phramer AI Analysis:
├── Scene Detection: Studio portrait with dramatic side lighting
├── Technical Analysis: Professional setup with controlled lighting
├── Camera Recommendation: Canon EOS R5 with 85mm f/1.4 lens
└── Enhancement: Cinematic mood with film-quality specifications

Output Prompt:
"A cinematic portrait of [subject description], shot on Canon EOS R5 
with 85mm f/1.4 lens at f/2.8, dramatic side lighting with subtle rim 
light, professional studio setup, film grain, photorealistic, 
ultra-detailed, commercial photography style"
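The output prompt above follows a recognisable pattern: subject, camera body and lens, aperture, lighting, then style keywords. The sketch below reproduces that ordering from structured fields. The `PromptSpec` class and its field names are hypothetical and only illustrate one way such a prompt could be assembled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical structured form of the example prompt."""
    subject: str
    camera: str
    lens: str
    aperture: str
    lighting: str
    framing: str = "cinematic portrait"
    style_tags: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Lead with subject and camera details, then lighting and style keywords.
        head = (f"A {self.framing} of {self.subject}, shot on {self.camera} "
                f"with {self.lens} lens at {self.aperture}")
        return ", ".join([head, self.lighting, *self.style_tags])


spec = PromptSpec(
    subject="[subject description]",
    camera="Canon EOS R5",
    lens="85mm f/1.4",
    aperture="f/2.8",
    lighting="dramatic side lighting with subtle rim light",
    style_tags=["professional studio setup", "film grain", "photorealistic",
                "ultra-detailed", "commercial photography style"],
)
print(spec.render())
```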

Quality Scoring

Phramer AI evaluates generated prompts across multiple dimensions:

  • Prompt Quality (25%) - Content detail and description accuracy
  • Technical Details (25%) - Camera settings and equipment specifications
  • Professional Photography (25%) - Lighting, composition, and technical expertise
  • Multi-Engine Optimization (25%) - Compatibility and enhancement features

Scores range from 0 to 100, with grades running from POOR to LEGENDARY.
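With four equally weighted dimensions, the overall score is a plain weighted sum. The sketch below shows that arithmetic. The 25% weights come from the list above; the dictionary keys, the intermediate grade names, and every threshold in `GRADE_BANDS` are placeholders invented for illustration, since only the end points POOR and LEGENDARY are named here.

```python
from typing import Dict

WEIGHTS = {
    "prompt_quality": 0.25,
    "technical_details": 0.25,
    "professional_photography": 0.25,
    "multi_engine": 0.25,
}

# HYPOTHETICAL bands: thresholds and the middle grade names are assumptions.
GRADE_BANDS = [(90, "LEGENDARY"), (70, "EXCELLENT"), (50, "GOOD"), (0, "POOR")]


def total_score(scores: Dict[str, float]) -> float:
    """Combine per-dimension scores (each 0-100) into one 0-100 total."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must contain exactly the four dimensions")
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def grade(total: float) -> str:
    # Bands are ordered from highest to lowest; return the first one reached.
    for minimum, label in GRADE_BANDS:
        if total >= minimum:
            return label
    return GRADE_BANDS[-1][1]


example = {"prompt_quality": 80, "technical_details": 90,
           "professional_photography": 70, "multi_engine": 60}
print(total_score(example), grade(total_score(example)))
```

For the example input the total is 0.25 × (80 + 90 + 70 + 60) = 75.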

Installation & Usage

Requirements

  • Python 3.8+
  • CUDA-compatible GPU (recommended)
  • 8GB+ RAM
  • Internet connection for model access

Local Setup

git clone [repository-url]
cd phramer-ai
pip install -r requirements.txt
python app.py

Cloud Usage

Available on Hugging Face Spaces with instant access - no installation required.

API Integration

Phramer AI provides a simple API for integration into existing workflows:

from phramer import PhramerAI

phramer = PhramerAI()
prompt, metadata = phramer.analyze_image("path/to/image.jpg")
print(f"Generated prompt: {prompt}")

Performance

  • Average Processing Time: 2-4 seconds per image
  • Supported Image Size: Up to 1024x1024 pixels
  • Batch Processing: Multiple images with queue management
  • Memory Optimization: Automatic cleanup and resource management
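Batch processing with queue management is mentioned but not described. One simple interpretation is a first-in, first-out queue in which a failing image is recorded and skipped rather than aborting the whole batch. In the sketch below, `process_batch` and the stand-in `fake_analyze` are hypothetical names, not part of any published Phramer AI API.

```python
from collections import deque
from typing import Callable, Dict, Iterable


def process_batch(paths: Iterable[str], analyze: Callable[[str], str]) -> Dict[str, dict]:
    """Process images in FIFO order; one failure does not stop the batch."""
    queue = deque(paths)
    results: Dict[str, dict] = {}
    while queue:
        path = queue.popleft()
        try:
            results[path] = {"ok": True, "prompt": analyze(path)}
        except Exception as exc:  # record the error and keep going
            results[path] = {"ok": False, "error": str(exc)}
    return results


def fake_analyze(path: str) -> str:
    # Stand-in for the real analysis call; accepts only the formats the README lists.
    if not path.lower().endswith((".jpg", ".png", ".webp")):
        raise ValueError(f"unsupported format: {path}")
    return f"prompt for {path}"


print(process_batch(["a.jpg", "b.gif"], fake_analyze))
```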

Roadmap

Version 2.1 (Coming Soon)

  • Video frame analysis
  • Batch processing improvements
  • Additional engine-specific optimizations
  • Enhanced cinematic analysis

Version 2.2 (Planned)

  • Style transfer integration
  • Custom knowledge base training
  • API rate limiting and authentication
  • Advanced composition analysis

Technical Details

Model Architecture

  • Bagel-7B Base: Advanced vision-language model
  • Custom Training: Optimized for prompt generation
  • Knowledge Integration: Professional photography database
  • Multi-Modal Processing: Image + text understanding

Optimization Features

  • Memory Efficient: Automatic resource management
  • GPU Acceleration: CUDA optimization when available
  • Batch Processing: Multiple image support
  • Error Handling: Robust fallback systems

Contributing

We welcome contributions to improve Phramer AI:

  1. Fork the repository
  2. Create a feature branch
  3. Submit a pull request with a detailed description
  4. Follow coding standards and include tests

License

Apache 2.0 - See LICENSE file for details.

Support

For technical support, feature requests, or collaboration inquiries:

  • Technical Issues: Create an issue in the repository
  • Feature Requests: Submit detailed proposals
  • Commercial Licensing: Contact Pariente AI
  • MIA TV Series Integration: Production team coordination

Credits

Phramer AI is developed by Pariente AI specifically for the MIA TV Series production pipeline.

Core Technologies

  • Bagel-7B multimodal architecture
  • Professional photography knowledge base
  • Advanced prompt optimization algorithms
  • Multi-engine compatibility layer

Research & Development

  • Pariente AI - Advanced multimodal AI research
  • MIA TV Series - Creative pipeline integration
  • Professional Photography Consultants - Knowledge database built on 30+ years of expertise
  • Community Contributors - Feature improvements and testing

Pariente AI • Advanced Multimodal AI Research & Development • MIA TV Series

Bridging the gap between image understanding and generative prompting