metadata

license: creativeml-openrail-m
language:
  - en
library_name: diffusers
pipeline_tag: text-to-image
tags:
  - stable-diffusion

TokenCompose SD14 Model Card

TokenCompose_SD14_A is a latent text-to-image diffusion model fine-tuned from the Stable-Diffusion-v1-4 checkpoint at 512x512 resolution on the VSR split of COCO image-caption pairs for 24,000 steps with a learning rate of 5e-6. In addition to the standard denoising loss, the training objective includes token-level grounding terms, which are intended to improve multi-category instance composition and photorealism.
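
Since the metadata lists `diffusers` as the library and `text-to-image` as the pipeline, the checkpoint can presumably be loaded like any Stable Diffusion v1.4 model. The following is a minimal sketch under that assumption, not an official usage recipe. The helper names (`load_pipeline`, `generation_kwargs`, `generate`), the sampler settings (50 steps, guidance scale 7.5), and the example prompt are illustrative choices that do not come from this card. The owner prefix of the Hub repository is not stated here, so `repo_id` is left as a required argument.

```python
def load_pipeline(repo_id, device="cuda"):
    """Load the checkpoint as a Stable Diffusion pipeline.

    `repo_id` is the full Hub id ("<owner>/TokenCompose_SD14_A"); the owner
    is not given in this card, so the caller must supply it.
    """
    # Imported lazily so the helpers below work without the GPU stack installed.
    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision on GPU saves memory; fall back to float32 on CPU.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(repo_id, torch_dtype=dtype)
    return pipe.to(device)


def generation_kwargs(prompt):
    """Build call arguments matching the 512x512 fine-tuning resolution."""
    return {
        "prompt": prompt,
        "height": 512,               # resolution stated in the card
        "width": 512,
        "num_inference_steps": 50,   # assumed default, not from the card
        "guidance_scale": 7.5,       # assumed default, not from the card
    }


def generate(pipe, prompt, seed=0, device="cuda"):
    """Generate one image with a fixed seed for reproducibility."""
    import torch

    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(generator=generator, **generation_kwargs(prompt))
    return result.images[0]
```

A multi-category prompt is the natural test for a model trained for instance composition. For example, `generate(load_pipeline(repo_id), "a cat and a wine glass").save("out.png")` would download the weights and write a single 512x512 image.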