
Collections


Collections including paper arxiv:2407.09468

Papers - Visualizations - High Dimensional Approximations

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures

Paper • 2407.09468 • Published Jul 12, 2024 • 1
A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction

Paper • 2305.19043 • Published May 30, 2023 • 1

Papers - Visualizations - Topological, Geometric, Algebraic

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures

Paper • 2407.09468 • Published Jul 12, 2024 • 1

Papers - Visualizations - Report

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures

Paper • 2407.09468 • Published Jul 12, 2024 • 1

Papers - Visualizations - Non-Euclidean Structures

Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures

Paper • 2407.09468 • Published Jul 12, 2024 • 1
Geodesic Multi-Modal Mixup for Robust Fine-Tuning

Paper • 2203.03897 • Published Mar 8, 2022 • 1

Papers - Training Research

Measuring the Effects of Data Parallelism on Neural Network Training

Paper • 1811.03600 • Published Nov 8, 2018 • 2
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

Paper • 1804.04235 • Published Apr 11, 2018 • 2
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Paper • 1905.11946 • Published May 28, 2019 • 3
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 62

Papers - Attention

Linear Transformers with Learnable Kernel Functions are Better In-Context Models

Paper • 2402.10644 • Published Feb 16, 2024 • 80
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

Paper • 2305.13245 • Published May 22, 2023 • 5
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition

Paper • 2402.15220 • Published Feb 23, 2024 • 19
Sequence Parallelism: Long Sequence Training from System Perspective

Paper • 2105.13120 • Published May 26, 2021 • 5

