AI & ML interests

LLM

Recent Activity

rulerman updated a model 5 days ago
fnlp/MOSS-TTSD-v0.5
fdugyt updated a model 6 days ago
fnlp/XY_Tokenizer_TTSD_V0_32k_hf
fdugyt updated a model 6 days ago
fnlp/XY_Tokenizer_TTSD_V0_32k

fnlp's collections (6)

MHA2MLA-refactor
The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"
MHA2MLA
The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"