Daily Papers

by AK and the research community

Jun 18

Submitted by

myownskyW7

MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

·
11 authors

Submitted by

vaishaal

DataComp-LM: In search of the next generation of training sets for language models

·
59 authors

Submitted by

fwnlp

mDPO: Conditional Preference Optimization for Multimodal Large Language Models

·
7 authors

Submitted by

ktio

THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation

·
10 authors

Submitted by

Yiwen-ntu

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

·
12 authors

Submitted by

philschmid

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

·
7 authors

Submitted by

chenjoya

VideoLLM-online: Online Video Large Language Model for Streaming Video

·
10 authors

Submitted by

yuzhaouoe

A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression

·
4 authors

Submitted by

zongzhuofan

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

·
5 authors

Submitted by

davanstrien

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

·
14 authors

Submitted by

Sreyan88

GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

·
9 authors

Submitted by

andreamaduzzi

LLaNA: Large Language and NeRF Assistant

·
5 authors

Submitted by

kaiyuyue

From Pixels to Prose: A Large Dataset of Dense Image Captions

·
10 authors

Submitted by

jiannanx

Pandora: Towards General World Model with Natural Language Actions and Video States

·
13 authors

Submitted by

syqi

In-Context Editing: Learning Knowledge from Self-Induced Distributions

·
8 authors

Submitted by

wzhouad

WPO: Enhancing RLHF with Weighted Preference Optimization

·
8 authors

Submitted by

yuchenlin

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

·
6 authors

Submitted by

akhaliq

L4GM: Large 4D Gaussian Reconstruction Model

·
11 authors

Submitted by

wenhu

Unifying Multimodal Retrieval via Document Screenshot Embedding

·
5 authors

Submitted by

jieyuz2

Task Me Anything

·
10 authors

Submitted by

rishab-partha

Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion

·
3 authors

Submitted by

jifanz

Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning

·
12 authors

Submitted by

ravid

Just How Flexible are Neural Networks in Practice?

·
6 authors

Submitted by

amanchadha

Evaluating Open Language Models Across Task Types, Application Domains, and Reasoning Types: An In-Depth Experimental Analysis

·
3 authors

Submitted by

billwat

HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies

·
4 authors

Submitted by

davidbrandfonbrener

CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training

·
5 authors

Submitted by

Bachstelze

Breaking the Attention Bottleneck

·
1 author

Submitted by

toshas

Consistency^2: Consistent and Fast 3D Painting with Latent Consistency Models

·
3 authors

Submitted by

luckeciano

Deep Bayesian Active Learning for Preference Modeling in Large Language Models

·
4 authors