- BioBERT: a pre-trained biomedical language representation model for biomedical text mining
  Paper • 1901.08746 • Published • 3
- Pretraining-Based Natural Language Generation for Text Summarization
  Paper • 1902.09243 • Published • 2
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 7
- DeBERTa: Decoding-enhanced BERT with Disentangled Attention
  Paper • 2006.03654 • Published • 3
Collections including paper arxiv:2102.04664
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
  Paper • 2002.08155 • Published • 2
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
  Paper • 2402.14658 • Published • 82
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 69
- CodePlan: Repository-level Coding using LLMs and Planning
  Paper • 2309.12499 • Published • 77
- StarCoder: may the source be with you!
  Paper • 2305.06161 • Published • 31
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct
  Paper • 2306.08568 • Published • 28
- SantaCoder: don't reach for the stars!
  Paper • 2301.03988 • Published • 7
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 60
- A Survey on Language Models for Code
  Paper • 2311.07989 • Published • 22
- Evaluating Large Language Models Trained on Code
  Paper • 2107.03374 • Published • 8
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
  Paper • 2310.06770 • Published • 5
- CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
  Paper • 2102.04664 • Published • 2