Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published Jan 23 • 44
view article Article How to build a custom text classifier without days of human labeling By sdiazlor and 4 others • Oct 17, 2024 • 55
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper • 2409.19951 • Published Sep 30, 2024 • 54
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration Paper • 2410.02367 • Published Oct 3, 2024 • 48
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning Paper • 2410.01044 • Published Oct 1, 2024 • 36
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment Paper • 2410.01679 • Published Oct 2, 2024 • 25
view article Article wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? By catherinearnett • Sep 27, 2024 • 40
High-Resolution Image Synthesis with Latent Diffusion Models Paper • 2112.10752 • Published Dec 20, 2021 • 13
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing Paper • 2306.10012 • Published Jun 16, 2023 • 35
AudioPaLM: A Large Language Model That Can Speak and Listen Paper • 2306.12925 • Published Jun 22, 2023 • 54