---
title: 3D Style Image Gen R1
emoji: 🖼🏆
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: openrail++
short_description: '3D Style Image Generator R1: Fast & High Quality Mode'
---
## English Explanation
This is a **3D Style Image Generator** application built with Gradio and Hugging Face's Diffusers library.
### Key Features:
1. **Image Generation**: Uses the FLUX.1-dev model with the Hyper-SD LoRA for fast, high-quality 3D-style image generation
2. **Korean Translation**: Automatically detects Korean prompts and translates them to English using a Helsinki-NLP translation model
3. **Web Interface**: Clean Gradio interface with customizable generation parameters
4. **Gallery Display**: Shows pre-generated sample images with their prompts
### Technical Components:
- **Model**: FLUX.1-dev with Hyper-SD 8-step LoRA for accelerated inference
- **GPU Acceleration**: Uses CUDA with bfloat16 precision for efficiency
- **Caching**: Implements local model caching to avoid repeated downloads
- **Image Saving**: Automatically saves generated images with timestamps
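
The components above could be wired together roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: the Hub repository IDs, the LoRA file name, the `lora_scale` value, the cache directory, and the helper names `load_pipeline` and `timestamped_path` are all assumptions made for illustration. The heavy imports sit inside the function so the module can be imported on a machine without a GPU.

```python
from datetime import datetime
from pathlib import Path

# Assumed Hub identifiers; the README names the models but not their repo IDs.
BASE_MODEL = "black-forest-labs/FLUX.1-dev"
LORA_REPO = "ByteDance/Hyper-SD"
LORA_FILE = "Hyper-FLUX.1-dev-8steps-lora.safetensors"


def load_pipeline(cache_dir="./model_cache"):
    """Load FLUX.1-dev in bfloat16 on CUDA and fuse the Hyper-SD LoRA."""
    # Imported lazily so that importing this module does not require torch.
    import torch
    from diffusers import FluxPipeline
    from huggingface_hub import hf_hub_download

    # cache_dir keeps downloaded weights on disk so later runs skip the download.
    pipe = FluxPipeline.from_pretrained(
        BASE_MODEL, torch_dtype=torch.bfloat16, cache_dir=cache_dir
    )
    lora_path = hf_hub_download(LORA_REPO, LORA_FILE, cache_dir=cache_dir)
    pipe.load_lora_weights(lora_path)
    pipe.fuse_lora(lora_scale=0.125)  # assumed scale, not taken from this README
    return pipe.to("cuda")


def timestamped_path(out_dir="generated", now=None):
    """Build an output path named after the current time; nothing is written."""
    now = now or datetime.now()
    return Path(out_dir) / f"{now:%Y%m%d_%H%M%S}.png"
```

Once the output directory exists, a generated PIL image could be written with `image.save(timestamped_path())`.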
### User Controls:
- **Prompt Input**: Text description for the desired 3D image
- **Advanced Settings**:
  - Image dimensions (256-1152 pixels)
  - Inference steps (6-25 steps)
  - Guidance scale (0.0-5.0)
  - Seed control for reproducibility
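
The ranges listed above can be enforced before a request reaches the model. The sketch below is illustrative only: the `GenerationSettings` class, its default values, the `randomize_seed` flag, and the `MAX_SEED` bound are hypothetical and not taken from the app's code; only the numeric ranges come from this README.

```python
import random
from dataclasses import dataclass

MAX_SEED = 2**32 - 1  # assumed upper bound for seeds


def _clamp(value, low, high):
    """Force value into the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class GenerationSettings:
    # Default values are placeholders, not the app's real defaults.
    width: int = 1024
    height: int = 1024
    steps: int = 8
    guidance_scale: float = 3.5
    seed: int = 0
    randomize_seed: bool = True

    def normalized(self):
        """Return a copy with every value inside the documented range."""
        # A fixed seed makes a run reproducible; otherwise draw a fresh one.
        seed = random.randint(0, MAX_SEED) if self.randomize_seed else self.seed
        return GenerationSettings(
            width=_clamp(self.width, 256, 1152),
            height=_clamp(self.height, 256, 1152),
            steps=_clamp(self.steps, 6, 25),
            guidance_scale=_clamp(self.guidance_scale, 0.0, 5.0),
            seed=_clamp(seed, 0, MAX_SEED),
            randomize_seed=False,
        )
```

Reusing the `seed` from a normalized settings object, with all other values unchanged, is what makes a result repeatable.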
### Workflow:
1. User enters a prompt (Korean or English)
2. Korean prompts are automatically translated
3. Prompt is formatted as "wbgmsst, 3D, [prompt], white background"
4. Model generates the image using specified parameters
5. Image is displayed and saved to gallery
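
Steps 1 to 3 of the workflow can be sketched in a few lines. This is an assumption-laden illustration rather than the app's own code: the function names `contains_korean` and `build_prompt` are hypothetical, the Hangul check is a simple Unicode-range test, and the translator is passed in as a plain callable so the example runs without any model.

```python
import re

# Hangul syllables plus the Jamo and compatibility Jamo blocks.
_HANGUL = re.compile(r"[\uac00-\ud7a3\u1100-\u11ff\u3130-\u318f]")


def contains_korean(text):
    """Return True if the text contains at least one Hangul character."""
    return bool(_HANGUL.search(text))


def build_prompt(user_prompt, translate=None):
    """Translate Korean input if a translator is given, then apply the template."""
    prompt = user_prompt.strip()
    if translate is not None and contains_korean(prompt):
        prompt = translate(prompt)
    # Template taken from the workflow description above.
    return f"wbgmsst, 3D, {prompt}, white background"
```

In a real deployment, `translate` might wrap a `transformers` translation pipeline for a Helsinki-NLP Korean-to-English model; which exact model the app uses is not stated here.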
### Special Features:
- **Korean Support**: Prompts entered in Korean are automatically translated to English before processing
- **Fast Generation**: The Hyper-SD LoRA produces high-quality images in as few as 8 steps
- **Gallery**: A gallery of sample images in various styles is provided for reference
- **Seed Control**: The seed value can be set so that the same image can be regenerated