MutaGReP: Execution-Free Repository-Grounded Plan Search for Code-Use Paper • 2502.15872 • Published 20 days ago • 4
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam Paper • 2502.17055 • Published 17 days ago • 16
Slam Collection All resources for SpeechLMs from "Slamming: Training a Speech Language Model on One GPU in a Day". We provide the tokeniser, LM, and datasets. • 6 items • Updated 16 days ago • 13
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization Paper • 2502.19261 • Published 15 days ago • 6
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models? Paper • 2502.11895 • Published 24 days ago • 1
Hamanasu Collection A brand-new series of models from yours truly, designed for intelligence, creativity, and roleplay. • 16 items • Updated 2 days ago • 5
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Paper • 2502.02631 • Published Feb 4 • 2
Unlocking Efficient Large Inference Models: One-Bit Unrolling Tips the Scales Paper • 2502.01908 • Published Feb 4 • 1
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published Feb 7 • 43
Why Does the Effective Context Length of LLMs Fall Short? Paper • 2410.18745 • Published Oct 24, 2024 • 18
Unbounded: A Generative Infinite Game of Character Life Simulation Paper • 2410.18975 • Published Oct 24, 2024 • 37