|
|
|
|
|
TITLE = """ |
|
<h1 id="space-title">UnlearnCanvas: A Stylized Image Dataset to Benchmark <br> Machine Unlearning for Diffusion Models</h1>""" |
|
|
|
INTRODUCTION_TEXT = """ |
|
|
|
Paper: https://arxiv.org/abs/2402.11846
|
|
|
The rapid advancement of diffusion models (DMs) has not only transformed various real-world industries but has also introduced negative societal concerns, including the generation of harmful content, copyright disputes, and the rise of stereotypes and biases. To mitigate these issues, machine unlearning (MU) has emerged as a potential solution, demonstrating its ability to remove undesired generative capabilities of DMs in various applications. However, by examining existing MU evaluation methods, we uncover several key challenges that can result in incomplete, inaccurate, or biased evaluations for MU in DMs.
|
|
|
To address these challenges, we enhance the evaluation metrics for MU, including the introduction of an often-overlooked retainability measurement for DMs post-unlearning. Additionally, we introduce UnlearnCanvas, a comprehensive high-resolution stylized image dataset that facilitates the evaluation of unlearning artistic painting styles in conjunction with associated image objects.
|
|
|
We show that this dataset plays a pivotal role in establishing a standardized and automated evaluation framework for MU techniques on DMs, featuring 7 quantitative metrics that address various aspects of unlearning effectiveness. Through extensive experiments, we benchmark 5 state-of-the-art MU methods, revealing novel insights into their pros and cons and the underlying unlearning mechanisms. Furthermore, we demonstrate the potential of UnlearnCanvas to benchmark other generative modeling tasks, such as style transfer. The UnlearnCanvas dataset, benchmark, and the code to reproduce all the results in this work can be found at https://github.com/OPTML-Group/UnlearnCanvas.
|
|
|
""" |
|
|
|
LLM_BENCHMARKS_TEXT = """
|
# Context |
|
|
|
## How it works |
|
|
|
We evaluate LLMs on 10 widely recognized game-theoretic tasks:
|
|
|
- <a href="https://en.wikipedia.org/wiki/Tic-tac-toe" target="_blank"> Tic-Tac-Toe</a> |
|
- <a href="https://en.wikipedia.org/wiki/Connect_Four" target="_blank"> Connect-4 </a> |
|
- <a href="https://en.wikipedia.org/wiki/Breakthrough_(board_game)" target="_blank"> Breakthrough</a> |
|
- <a href="https://en.wikipedia.org/wiki/Nim" target="_blank"> Nim</a> |
|
- <a href="https://en.wikipedia.org/wiki/First-price_sealed-bid_auction" target="_blank"> Blind Auction</a> |
|
- <a href="https://en.wikipedia.org/wiki/Kuhn_poker" target="_blank"> Kuhn Poker</a> |
|
- <a href="https://en.wikipedia.org/wiki/Liar%27s_dice" target="_blank"> Liar's Dice</a>
|
- <a href="https://arxiv.org/pdf/1706.05125.pdf" target="_blank"> Negotiation</a> |
|
- <a href="https://en.wikipedia.org/wiki/Pig_(dice_game)" target="_blank"> Pig</a> |
|
- <a href="https://en.wikipedia.org/wiki/Prisoner%27s_dilemma" target="_blank"> Prisoner's Dilemma</a>
|
|
|
## Metric |
|
We use Normalized Relative Advantage (NRA) to evaluate the performance of LLM agents. NRA(agent1, agent2) > 0 means that agent1 has a higher win rate or earns more rewards than its opponent, agent2.
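
To make the metric concrete, below is a minimal sketch of how an NRA-style score can be computed from per-match rewards. The exact normalization used by GTBench may differ; the formula here (reward difference divided by the total absolute reward, yielding a score in [-1, 1]) and the function name `normalized_relative_advantage` are illustrative assumptions, not the benchmark's reference implementation.

```python
# Illustrative sketch of Normalized Relative Advantage (NRA).
# Assumption: NRA is the reward gap between two agents, normalized by the
# total absolute reward so the score lies in [-1, 1]; see the GTBench
# paper for the exact definition used by the leaderboard.
def normalized_relative_advantage(rewards_1, rewards_2):
    # NRA(agent1, agent2) > 0 means agent1 outperforms agent2.
    gap = sum(rewards_1) - sum(rewards_2)
    norm = sum(abs(r) for r in rewards_1) + sum(abs(r) for r in rewards_2)
    return gap / norm if norm else 0.0

# Example: each agent plays 4 matches (+1 per win, -1 per loss);
# agent1 wins 3 and loses 1, so its relative advantage is positive.
print(normalized_relative_advantage([1, 1, 1, -1], [-1, -1, -1, 1]))  # 0.5
```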
|
|
|
Please refer to the GTBench paper for more details.
|
|
|
## Takeaways |
|
- LLM agents fail in complete-information and deterministic games.

- LLM agents are competitive in probabilistic games.

- Code pretraining benefits game-theoretic tasks.

- Advanced reasoning methods do not always help.
|
|
|
## Contact |
|
Please feel free to contact Jinhao <[email protected]> and Renming <[email protected]> if you have any questions. |
|
|
|
""" |
|
|
|
FAQ_TEXT = """ |
|
""" |
|
|
|
|
|
EVALUATION_QUEUE_TEXT = """ |
|
# Evaluation for the GTBench leaderboard |
|
|
|
""" |
|
|
|
CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results" |
|
|
|
CITATION_BUTTON_TEXT = r""" |
|
@article{zhang2024unlearncanvas,
  title={UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models},
  author={Zhang, Yihua and Zhang, Yimeng and Yao, Yuguang and Jia, Jinghan and Liu, Jiancheng and Liu, Xiaoming and Liu, Sijia},
  journal={arXiv preprint arXiv:2402.11846},
  year={2024}
}
|
""" |
|
|