NIVEDAN committed · verified
Commit cb18663 · 1 Parent(s): bd384a3

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +115 -0
  2. config.yaml +62 -0
  3. wan2.1-14b-nivedan-lora.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,115 @@
---
license: apache-2.0
language:
- en
- zh
tags:
- image-to-video
- lora
- replicate
- text-to-video
- video
- video-generation
base_model: "Wan-AI/Wan2.1-T2V-14B-Diffusers"
pipeline_tag: text-to-video
# widget:
# - text: >-
#     prompt
#   output:
#     url: https://...
instance_prompt: nivedan
---

# Wan2.1 LoRA

<Gallery />

## About this LoRA

This is a [LoRA](https://replicate.com/docs/guides/working-with-loras) for the Wan2.1 14B video generation model.

It can be used with diffusers or ComfyUI, and can be loaded against both the text-to-video and image-to-video Wan2.1 models.

It was trained on [Replicate](https://replicate.com/) using AI Toolkit: https://replicate.com/ostris/wan-lora-trainer/train


## Trigger words

You should use `nivedan` to trigger the video generation.


## Use this LoRA

Replicate has a collection of Wan2.1 models that are optimised for speed and cost. They can also be used with this LoRA:

- https://replicate.com/collections/wan-video
- https://replicate.com/fofr/wan2.1-with-lora

### Run this LoRA with an API using Replicate

```py
import replicate

input = {
    "prompt": "nivedan",
    "lora_url": "https://huggingface.co/NIVEDAN/wan2.1-lora/resolve/main/wan2.1-14b-nivedan-lora.safetensors"
}

output = replicate.run(
    "fofr/wan2.1-with-lora:f83b84064136a38415a3aff66c326f94c66859b8ad7a2cb432e2822774f07b08",
    model="14b",
    input=input
)
for index, item in enumerate(output):
    with open(f"output_{index}.mp4", "wb") as file:
        file.write(item.read())
```
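
The example above assumes the `replicate` Python client is installed (`pip install replicate`) and authenticated. The client reads its credentials from the `REPLICATE_API_TOKEN` environment variable; a minimal setup sketch, with the token value as a placeholder:

```py
import os

# The replicate client picks up the API token from this environment variable.
# The value below is a placeholder; use your own token from replicate.com.
os.environ["REPLICATE_API_TOKEN"] = "<your-api-token>"
```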

### Using with Diffusers

```shell
pip install git+https://github.com/huggingface/diffusers.git
```

```py
import torch
from diffusers.utils import export_to_video
from diffusers import AutoencoderKLWan, WanPipeline
from diffusers.schedulers.scheduling_unipc_multistep import UniPCMultistepScheduler

model_id = "Wan-AI/Wan2.1-T2V-14B-Diffusers"
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
flow_shift = 3.0  # 5.0 for 720P, 3.0 for 480P
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=flow_shift)
pipe.to("cuda")

pipe.load_lora_weights("NIVEDAN/wan2.1-lora")

pipe.enable_model_cpu_offload()  # for low-VRAM environments

prompt = "nivedan"
negative_prompt = "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards"

output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
).frames[0]
export_to_video(output, "output.mp4", fps=16)
```
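
The same LoRA can also be loaded against the image-to-video pipeline mentioned above. A sketch of that path, assuming the `Wan-AI/Wan2.1-I2V-14B-480P-Diffusers` checkpoint and a local starting image (`start.png`); adjust both to your setup:

```py
import torch
from diffusers import AutoencoderKLWan, WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Assumed image-to-video checkpoint; swap for the 720P variant if needed.
model_id = "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers"
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanImageToVideoPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Load this repository's LoRA on top of the base image-to-video model.
pipe.load_lora_weights("NIVEDAN/wan2.1-lora")

image = load_image("start.png")  # placeholder path to your starting frame
output = pipe(
    image=image,
    prompt="nivedan",
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
).frames[0]
export_to_video(output, "output_i2v.mp4", fps=16)
```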


## Training details

- Steps: 2201
- Learning rate: 0.0001
- LoRA rank: 40


## Contribute your own examples

You can use the [community tab](https://huggingface.co/NIVEDAN/wan2.1-lora/discussions) to add videos that show off what you’ve made with this LoRA.
config.yaml ADDED
@@ -0,0 +1,62 @@
job: custom_job
config:
  name: wan_train_replicate
  process:
  - type: custom_sd_trainer
    training_folder: output
    device: cuda:0
    trigger_word: nivedan
    network:
      type: lora
      linear: 40
      linear_alpha: 40
    save:
      dtype: float16
      save_every: 2202
      max_step_saves_to_keep: 1
    datasets:
    - folder_path: input_images
      caption_ext: txt
      caption_dropout_rate: 0.05
      shuffle_tokens: false
      cache_latents_to_disk: false
      cache_latents: true
      resolution:
      - 632
    train:
      batch_size: 1
      steps: 2201
      gradient_accumulation_steps: 1
      train_unet: true
      train_text_encoder: false
      gradient_checkpointing: false
      noise_scheduler: flowmatch
      timestep_type: sigmoid
      optimizer: adamw8bit
      optimizer_params:
        weight_decay: 0.0001
      lr: 0.0001
      ema_config:
        use_ema: true
        ema_decay: 0.99
      dtype: bf16
    model:
      name_or_path: Wan-AI/Wan2.1-T2V-14B-Diffusers
      quantize: false
      arch: wan21
    sample:
      sampler: flowmatch
      sample_every: 2202
      width: 832
      height: 480
      num_frames: 33
      fps: 16
      prompts: []
      neg: ''
      seed: 42
      walk_seed: true
      guidance_scale: 5
      sample_steps: 30
meta:
  name: wan_train_replicate
  version: '1.0'
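
In this config the LoRA rank and alpha are both 40 (`linear` / `linear_alpha`), giving an effective scaling factor of alpha/rank = 1.0, and the adapter is saved in float16. A small sketch, assuming the `.safetensors` file from this repo has been downloaded locally, that lists the saved tensors so the rank shows up in the weight shapes:

```py
from safetensors import safe_open

# Iterate over the saved LoRA tensors; the low-rank dimension (40)
# should appear in each down/up projection pair.
with safe_open("wan2.1-14b-nivedan-lora.safetensors", framework="pt") as f:
    for name in f.keys():
        print(name, tuple(f.get_tensor(name).shape))
```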
wan2.1-14b-nivedan-lora.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e019a709a1e69ba9792054de85c8d7f557a8ea57364558d2dcf3cf3c856f5b41
size 383484864
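
Because the weights are stored with Git LFS, the repository itself only tracks the pointer above. A minimal sketch for fetching the actual ~383 MB file with `huggingface_hub` (repo id taken from the README):

```py
from huggingface_hub import hf_hub_download

# Downloads the LoRA weights into the local Hugging Face cache
# and returns the path to the file on disk.
lora_path = hf_hub_download(
    repo_id="NIVEDAN/wan2.1-lora",
    filename="wan2.1-14b-nivedan-lora.safetensors",
)
print(lora_path)
```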