---
title: 3D Style Image Gen R1
emoji: 🖼🏆
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: openrail++
short_description: '3D Style Image Generator R1: Fast & High Quality Mode'
---

## English Explanation

This is a **3D Style Image Generator** application built with Gradio and Hugging Face's Diffusers library. Here's what it does:

### Key Features:
1. **Image Generation**: Uses the FLUX.1-dev model with the Hyper-SD LoRA for fast, high-quality 3D-style image generation
2. **Korean Translation**: Automatically detects Korean prompts and translates them to English using a Helsinki-NLP translation model
3. **Web Interface**: Clean Gradio interface with customizable generation parameters
4. **Gallery Display**: Shows pre-generated sample images with their prompts

### Technical Components:
- **Model**: FLUX.1-dev with Hyper-SD 8-step LoRA for accelerated inference
- **GPU Acceleration**: Uses CUDA with bfloat16 precision for efficiency
- **Caching**: Implements local model caching to avoid repeated downloads
- **Image Saving**: Automatically saves generated images with timestamps
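
The components above could be wired together roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: the repository IDs, the LoRA file name, the `lora_scale` value, the cache and output directory names, and the helper names `load_pipeline` and `timestamped_path` are all assumptions chosen for illustration.

```python
from datetime import datetime
from pathlib import Path

# Assumed repository and file names; the README only names the models.
BASE_MODEL = "black-forest-labs/FLUX.1-dev"
LORA_REPO = "ByteDance/Hyper-SD"
LORA_FILE = "Hyper-FLUX.1-dev-8steps-lora.safetensors"


def load_pipeline(cache_dir="./model_cache"):
    """Load FLUX.1-dev in bfloat16, fuse the Hyper-SD LoRA, and move to CUDA."""
    # Heavy imports stay local so timestamped_path() works without a GPU stack.
    import torch
    from diffusers import FluxPipeline
    from huggingface_hub import hf_hub_download

    # cache_dir keeps downloaded weights on disk so restarts skip the download.
    pipe = FluxPipeline.from_pretrained(
        BASE_MODEL, torch_dtype=torch.bfloat16, cache_dir=cache_dir
    )
    lora_path = hf_hub_download(LORA_REPO, LORA_FILE, cache_dir=cache_dir)
    pipe.load_lora_weights(lora_path)
    pipe.fuse_lora(lora_scale=0.125)  # assumed scale, not taken from the README
    return pipe.to("cuda")


def timestamped_path(save_dir="generated", now=None):
    """Build an output path such as generated/20240102_030405.png."""
    now = now or datetime.now()
    return Path(save_dir) / f"{now:%Y%m%d_%H%M%S}.png"
```

An image returned by the pipeline could then be written with `image.save(timestamped_path())` after creating the output directory.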

### User Controls:
- **Prompt Input**: Text description for the desired 3D image
- **Advanced Settings**:
  - Image dimensions (256-1152 pixels)
  - Inference steps (6-25 steps)
  - Guidance scale (0.0-5.0)
  - Seed control for reproducibility
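
The ranges listed above can be expressed as a small settings object. The sketch below is illustrative only: the class name `GenerationSettings`, the default values, the `randomize_seed` flag, and the `MAX_SEED` bound are assumptions, and the Gradio sliders presumably enforce the same limits in the UI.

```python
import random
from dataclasses import dataclass

MAX_SEED = 2**32 - 1  # assumed upper bound for seeds


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class GenerationSettings:
    # Defaults are assumptions; only the ranges come from the README.
    width: int = 1024
    height: int = 1024
    steps: int = 8
    guidance_scale: float = 3.5
    seed: int = 0
    randomize_seed: bool = False

    def normalized(self):
        """Return a copy with every value forced into its documented range."""
        seed = random.randint(0, MAX_SEED) if self.randomize_seed else self.seed
        return GenerationSettings(
            width=clamp(self.width, 256, 1152),
            height=clamp(self.height, 256, 1152),
            steps=clamp(self.steps, 6, 25),
            guidance_scale=clamp(self.guidance_scale, 0.0, 5.0),
            seed=clamp(seed, 0, MAX_SEED),
            randomize_seed=False,
        )
```

Reusing the same seed with identical settings is what makes a generation reproducible; with `randomize_seed=True` a fresh seed is drawn on each call.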

### Workflow:
1. User enters a prompt (Korean or English)
2. Korean prompts are automatically translated to English
3. The prompt is wrapped in the template "wbgmsst, 3D, [prompt], white background"
4. The model generates the image using the specified parameters
5. The image is displayed and saved to the gallery
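
Steps 1 to 3 of the workflow can be sketched as below. The function names, the Hangul ranges used for detection, and the `Helsinki-NLP/opus-mt-ko-en` model ID are assumptions; the README only states that a Helsinki-NLP translator is used and gives the prompt template.

```python
def contains_korean(text):
    """Return True if any character is Hangul (syllables or Jamo)."""
    return any(
        "\uac00" <= ch <= "\ud7a3"      # Hangul syllables
        or "\u1100" <= ch <= "\u11ff"   # Hangul Jamo
        or "\u3130" <= ch <= "\u318f"   # Hangul compatibility Jamo
        for ch in text
    )


def load_translator():
    """Create a Korean-to-English translator (assumed model ID)."""
    # Imported lazily so the pure helpers work without transformers installed.
    from transformers import pipeline

    pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-ko-en")
    return lambda text: pipe(text)[0]["translation_text"]


def build_prompt(user_prompt, translate=None):
    """Translate Korean input if a translator is given, then apply the template."""
    text = user_prompt.strip()
    if translate is not None and contains_korean(text):
        text = translate(text)
    return f"wbgmsst, 3D, {text}, white background"
```

Passing the translator in as a callable keeps `build_prompt` easy to exercise without downloading a model; in the app it would be called as `build_prompt(prompt, translate=load_translator())`, with the translator created once at startup.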

### Special Features:
- **Korean Support**: Prompts entered in Korean are automatically translated to English before processing
- **Fast Generation**: The Hyper-SD LoRA produces high-quality images in only 8 steps
- **Gallery**: A gallery of sample images in various styles is provided for reference
- **Seed Control**: The seed value can be set so that the same image can be regenerated