	---
	license: unknown
	base_model:
	- Qwen/Qwen2.5-Coder-7B
	tags:
	- code
	- text-diffusion-model
	- diffusion large language model
	---

	### DiffuCoder-7B-Base

	The DiffuCoder-7B-Base model is our foundational masked diffusion LLM for code generation.

	- Training recipe: Adapted with [DiffuLLaMA](https://github.com/HKUNLP/DiffuLLaMA)'s adaptation approach and trained on a large corpus of code in two stages: 65B tokens in Stage 1 and 65B tokens in Stage 2.

	- Benchmarks: Strong baseline performance on HumanEval, MBPP and BigCodeBench.

	#### More details and usage examples:

	- Paper: [DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation](https://arxiv.org/abs/2506.20639)

	- GitHub: https://github.com/apple/ml-diffucoder

	```python
	import torch
	from transformers import AutoModel, AutoTokenizer

	# trust_remote_code=True is needed because the repository ships its own
	# modeling and generation code (including `diffusion_generate`).
	model_path = "apple/DiffuCoder-7B-Base"
	model = AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16, trust_remote_code=True)
	tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
	model = model.to("cuda").eval()

	# Base-model prompt: a function signature and docstring for the model to complete.
	prompt = """
	from typing import List

	def has_close_elements(numbers: List[float], threshold: float) -> bool:
	    \"\"\"
	    Check if in given list of numbers, are any two numbers closer to each other than given threshold.
	    >>> has_close_elements([1.0, 2.0, 3.0], 0.5)
	    False
	    >>> has_close_elements([1.0, 2.8, 3.0, 4.0, 5.0, 2.0], 0.3)
	    True
	    \"\"\"
	"""

	# Number of tokens generated per diffusion step:
	# diffusion timesteps * TOKEN_PER_STEP = total new tokens.
	TOKEN_PER_STEP = 1

	inputs = tokenizer(prompt, return_tensors="pt")
	input_ids = inputs.input_ids.to(device="cuda")
	attention_mask = inputs.attention_mask.to(device="cuda")

	output = model.diffusion_generate(
	    input_ids,
	    attention_mask=attention_mask,
	    max_new_tokens=256,
	    output_history=True,            # also return the intermediate outputs
	    return_dict_in_generate=True,
	    steps=256 // TOKEN_PER_STEP,    # number of diffusion timesteps
	    temperature=0.2,
	    top_p=0.95,
	    alg="entropy",                  # entropy-based token generation order
	    alg_temp=0.,                    # no added randomness in that order
	)

	# Drop the prompt tokens from each sequence and decode only the completion.
	generations = [
	    tokenizer.decode(g[len(p) :].tolist())
	    for p, g in zip(input_ids, output.sequences)
	]

	# Keep only the text before the first end-of-sequence token.
	print(generations[0].split(tokenizer.eos_token)[0])
	```
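
The relationship between `steps`, `max_new_tokens`, and `TOKEN_PER_STEP` in the example above can be made explicit with a small helper. This is only a minimal sketch: `steps_for` is a hypothetical name and not part of the model's API. It assumes, as the comment in the example states, that the number of diffusion timesteps times the tokens per step equals the total number of new tokens.

```python
def steps_for(max_new_tokens: int, tokens_per_step: int = 1) -> int:
    """Return the number of diffusion steps for a given generation budget.

    Hypothetical helper for illustration only; it is not part of the
    DiffuCoder or Dream generation utilities.
    """
    if max_new_tokens <= 0 or tokens_per_step <= 0:
        raise ValueError("both arguments must be positive")
    # Floor division mirrors `256 // TOKEN_PER_STEP` in the usage example;
    # the result is clamped so that at least one step is always run.
    return max(1, max_new_tokens // tokens_per_step)


print(steps_for(256, 1))  # 256
print(steps_for(256, 4))  # 64
```

The result could then be passed as `steps=steps_for(256, TOKEN_PER_STEP)`. A larger `TOKEN_PER_STEP` means fewer steps and therefore faster generation, which may come at some cost in output quality.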

	#### Acknowledgement
	To power this Hugging Face model release, we reuse [Dream](https://huggingface.co/Dream-org/Dream-v0-Base-7B)'s modeling architecture and generation utilities.