text3d-r1 / README.md
title: 3D Style Image Gen R1
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: openrail++
short_description: '3D Style Image Generator R1: Fast & High Quality Mode'

English Explanation

This is a 3D Style Image Generator application built with Gradio and Hugging Face's Diffusers library. Here's what it does:

Key Features:

  1. Image Generation: Uses the FLUX.1-dev model with the Hyper-SD LoRA for fast, high-quality 3D-style image generation
  2. Korean Translation: Automatically detects Korean prompts and translates them to English using a Helsinki-NLP translation model
  3. Web Interface: Clean Gradio interface with customizable generation parameters
  4. Gallery Display: Shows pre-generated sample images with their prompts

Technical Components:

  • Model: FLUX.1-dev with Hyper-SD 8-step LoRA for accelerated inference
  • GPU Acceleration: Uses CUDA with bfloat16 precision for efficiency
  • Caching: Implements local model caching to avoid repeated downloads
  • Image Saving: Automatically saves generated images with timestamps
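
The components above could be wired together roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: the Hub repo IDs, the LoRA file name, the `models` cache directory, and the `lora_scale` value are assumptions, since this README names the models but not their exact identifiers. Any additional 3D-style LoRA the app may load is not shown.

```python
from pathlib import Path

# Assumed identifiers: the README names the models but not their Hub repo IDs.
BASE_MODEL = "black-forest-labs/FLUX.1-dev"
HYPER_SD_REPO = "ByteDance/Hyper-SD"
HYPER_SD_FILE = "Hyper-FLUX.1-dev-8steps-lora.safetensors"
CACHE_DIR = Path("models")  # hypothetical local cache directory


def load_pipeline(cache_dir: Path = CACHE_DIR, lora_scale: float = 0.125):
    """Load FLUX.1-dev in bfloat16 and fuse the Hyper-SD 8-step LoRA.

    The lora_scale default is an assumption, not a value taken from this README.
    """
    # Heavy imports stay inside the function so importing this module is cheap.
    import torch
    from diffusers import FluxPipeline
    from huggingface_hub import hf_hub_download

    cache_dir.mkdir(parents=True, exist_ok=True)

    # cache_dir keeps downloaded weights on local disk between runs.
    pipe = FluxPipeline.from_pretrained(
        BASE_MODEL, torch_dtype=torch.bfloat16, cache_dir=cache_dir
    )
    lora_path = hf_hub_download(HYPER_SD_REPO, HYPER_SD_FILE, cache_dir=cache_dir)
    pipe.load_lora_weights(lora_path)
    pipe.fuse_lora(lora_scale=lora_scale)

    # Move the whole pipeline to the GPU for inference.
    return pipe.to("cuda")
```

Calling `load_pipeline()` requires a CUDA GPU, the `torch`, `diffusers`, and `huggingface_hub` packages, and access to the gated FLUX.1-dev weights.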

User Controls:

  • Prompt Input: Text description for the desired 3D image
  • Advanced Settings:
    • Image dimensions (256-1152 pixels)
    • Inference steps (6-25 steps)
    • Guidance scale (0.0-5.0)
    • Seed control for reproducibility
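
One way to enforce the ranges listed above before calling the model is sketched below. The ranges come from this README; the `GenerationSettings` and `normalize_settings` names, the default values, and the seed upper bound are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional

MAX_SEED = 2**32 - 1  # assumed upper bound; the README does not state one


def _clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class GenerationSettings:
    # Default values are assumptions, not taken from the README.
    width: int = 1024
    height: int = 1024
    steps: int = 8
    guidance_scale: float = 3.5
    seed: int = 0


def normalize_settings(width: int, height: int, steps: int,
                       guidance_scale: float,
                       seed: Optional[int] = None) -> GenerationSettings:
    """Clamp user input to the documented ranges; pick a random seed if none is given."""
    if seed is None:
        seed = random.randint(0, MAX_SEED)
    return GenerationSettings(
        width=_clamp(int(width), 256, 1152),          # image dimensions in pixels
        height=_clamp(int(height), 256, 1152),
        steps=_clamp(int(steps), 6, 25),              # inference steps
        guidance_scale=_clamp(float(guidance_scale), 0.0, 5.0),
        seed=_clamp(int(seed), 0, MAX_SEED),
    )
```

Passing the same explicit seed on two runs yields identical settings, which is the precondition for reproducing an image.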

Workflow:

  1. User enters a prompt (Korean or English)
  2. Korean prompts are automatically translated
  3. The prompt is wrapped in the template "wbgmsst, 3D, [prompt], white background"
  4. Model generates the image using specified parameters
  5. Image is displayed and saved to gallery
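
The prompt-handling steps above can be sketched with the standard library alone. The template string is the one quoted in step 3; the function names, the Hangul range check used for detection, the injectable `translate` callable, and the file-name pattern are assumptions for illustration rather than the app's actual code.

```python
from datetime import datetime
from typing import Callable, Optional


def contains_korean(text: str) -> bool:
    """Return True if any character is a Hangul syllable or jamo."""
    return any(
        "\uac00" <= ch <= "\ud7a3"       # Hangul syllables
        or "\u1100" <= ch <= "\u11ff"    # Hangul jamo
        or "\u3130" <= ch <= "\u318f"    # Hangul compatibility jamo
        for ch in text
    )


def build_prompt(user_prompt: str,
                 translate: Optional[Callable[[str], str]] = None) -> str:
    """Translate Korean input if a translator is supplied, then apply the template."""
    text = user_prompt.strip()
    if translate is not None and contains_korean(text):
        text = translate(text)
    return f"wbgmsst, 3D, {text}, white background"


def output_filename(now: Optional[datetime] = None, prefix: str = "image") -> str:
    """Build a timestamped PNG file name (the naming pattern is hypothetical)."""
    now = now or datetime.now()
    return f"{prefix}_{now:%Y%m%d_%H%M%S}.png"
```

In a real app, `translate` might wrap a Hugging Face `transformers` translation pipeline built on a Helsinki-NLP Korean-to-English model; keeping it as a parameter lets the formatting logic be tested without downloading any model.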

Special Features:

  • Korean Support: Prompts entered in Korean are automatically translated to English before processing
  • Fast Generation: The Hyper-SD LoRA produces high-quality images in as few as 8 steps
  • Gallery: Provides sample images in a variety of styles for reference
  • Seed Control: The seed value can be set so that the same image can be regenerated