University of Washington

Verified

Recent Activity

fqjiang authored a paper 2 days ago

SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities

fqjiang authored a paper 22 days ago

Small Models Struggle to Learn from Strong Reasoners

CrystalMo updated a Space about 2 months ago

UW/Snake

UW's activity

LNIU

authored a paper 22 days ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 24 days ago • 28

CrystalMo

updated a Space about 2 months ago

Snake

Text Similarity Analysis

CrystalMo

published a Space about 2 months ago

Snake

Text Similarity Analysis

kellycyy

authored a paper 5 months ago

CulturalBench: a Robust, Diverse and Challenging Benchmark on Measuring the (Lack of) Cultural Knowledge of LLMs

Paper • 2410.02677 • Published Oct 3, 2024

jrfish

authored a paper 7 months ago

StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

Paper • 2408.15666 • Published Aug 28, 2024 • 11

alisawuffles

authored a paper 8 months ago

Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?

Paper • 2407.16607 • Published Jul 23, 2024 • 23

juliakharchenko

authored a paper 9 months ago

How Well Do LLMs Represent Values Across Cultures? Empirical Analysis of LLM Responses Based on Hofstede Cultural Dimensions

Paper • 2406.14805 • Published Jun 21, 2024 • 3

LNIU

authored 3 papers 9 months ago

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 67

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

Paper • 2402.11753 • Published Feb 19, 2024 • 6

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

Paper • 2402.08983 • Published Feb 14, 2024 • 4

alisawuffles

authored a paper about 1 year ago

Tuning Language Models by Proxy

Paper • 2401.08565 • Published Jan 16, 2024 • 23

alisawuffles

authored a paper over 1 year ago

Inverse Scaling: When Bigger Isn't Better

Paper • 2306.09479 • Published Jun 15, 2023 • 9

alisawuffles

authored a paper almost 2 years ago

How Language Model Hallucinations Can Snowball

Paper • 2305.13534 • Published May 22, 2023 • 3