- The LLM Surgeon
  Paper • 2312.17244 • Published • 9
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69
- Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models
  Paper • 2401.06102 • Published • 22
- Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing
  Paper • 2407.08770 • Published • 20
Collections
Collections including paper arxiv:2401.05561
- YAYI 2: Multilingual Open-Source Large Language Models
  Paper • 2312.14862 • Published • 14
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 57
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 48

- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
  Paper • 2303.12712 • Published • 2
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 28
- Measuring Implicit Bias in Explicitly Unbiased Large Language Models
  Paper • 2402.04105 • Published • 1

- Holistic Evaluation of Text-To-Image Models
  Paper • 2311.04287 • Published • 12
- MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks
  Paper • 2311.07463 • Published • 14
- Trusted Source Alignment in Large Language Models
  Paper • 2311.06697 • Published • 11
- DiLoCo: Distributed Low-Communication Training of Language Models
  Paper • 2311.08105 • Published • 15

- A survey on Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
  Paper • 2009.10521 • Published • 1
- Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
  Paper • 1910.02190 • Published • 1
- Learning Symmetrization for Equivariance with Orbit Distance Minimization
  Paper • 2311.07143 • Published • 1
- GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting
  Paper • 2311.11700 • Published • 4

- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 118
- stabilityai/stable-video-diffusion-img2vid-xt
  Image-to-Video • Updated • 393k • 2.88k
- LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
  Paper • 2311.13384 • Published • 51
- HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
  Paper • 2311.12454 • Published • 30

- S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  Paper • 2311.03285 • Published • 29
- Tailoring Self-Rationalizers with Multi-Reward Distillation
  Paper • 2311.02805 • Published • 4
- Ultra-Long Sequence Distributed Transformer
  Paper • 2311.02382 • Published • 3
- OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
  Paper • 2309.11235 • Published • 15