Kolors Virtual Try-On (running on CPU Upgrade, 7.86k): Upload images to try on clothes virtually
FLUX LoRa the Explorer (running on Zero, 1.38k): Generate images based on prompts and LoRA models
Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges (2406.12624)

Can LLMs serve as reliable judges ⚖️?

We aim to identify the right metrics for evaluating judge LLMs and to understand how sensitive they are to prompt guidelines, prompt engineering, and prompt specificity. With this paper, we want to urge caution ⚠️ against blindly using LLMs as a proxy for human judges.

Blog - https://huggingface.co/blog/singh96aman/judgingthejudges
Arxiv - https://arxiv.org/abs/2406.12624
Tweet - https://x.com/iamsingh96aman/status/1804148173008703509

@singh96aman @kartik727 @Srinik-1 @sankaranv @dieuwkehupkes