Kirill

hypothetical

https://thestage.ai

AI & ML interests

DNNs, differential geometry, algebraic topology, cryptography

Recent Activity


posted an update about 1 hour ago

liked a Space about 2 hours ago

neuphonic/neutts-nano



Posts 3

Post

It was harder than we expected, but we have finally integrated cuDNN Paged Attention into our models!

Read the article here: https://app.thestage.ai/blog/Integrating-cuDNN-Paged-Attention-to-TheStage-AI-Inference?id=8

Llama-8B with cuDNN Paged Attention, including B200 support: TheStageAI/Elastic-Llama-3.1-8B-Instruct
Mistral-Small-24B with cuDNN Paged Attention, including B200 support: TheStageAI/Elastic-Mistral-Small-3.1-24B-Instruct-2503

Post


We have updated our transcription model: TheStageAI/thewhisper-large-v3-turbo

– 6.00 WER on the English Open ASR Leaderboard
– 4.74 WER on the Multilingual Open ASR Leaderboard
– Beats NVIDIA Parakeet (6.34 WER) and Whisper-large-v3-turbo (7.8 WER)
– Strong improvements in Arabic, Hindi, Chinese
– Maintains quality with background and environmental noise
– Optimized inference engines for NVIDIA and Apple
– Hugging Face Transformers interface for easy use
– Best-in-class speed on NVIDIA GPUs and power efficiency on Apple devices
– NVIDIA Jetson Thor support
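
For readers less familiar with the metric: WER figures like the ones above are conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, reported as a percentage. Below is a minimal sketch, assuming plain whitespace tokenization and no text normalization. The leaderboards apply their own normalization, which is not reproduced here, and `word_error_rate` is a hypothetical helper name, not part of any TheStage AI or Hugging Face API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length.

    Assumption: whitespace tokenization, no casing or punctuation
    normalization (real leaderboards normalize text before scoring).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(round(word_error_rate("the cat sat on the mat", "the cat sat on mat"), 2))
```

Because the denominator is the reference length, WER can exceed 100 when the hypothesis contains many inserted words.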


models 0

None public yet

datasets 0

None public yet