Daily Papers

by AK and the research community

Aug 29

Submitted by

CodeGoat24

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

·
9 authors

Submitted by

lynazhang

rStar2-Agent: Agentic Reasoning Technical Report

·
15 authors

Submitted by

fenfan

USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

·
8 authors

Submitted by

ztwang

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

·
11 authors

Submitted by

chengle

AWorld: Orchestrating the Training Recipe for Agentic AI

·
17 authors

Submitted by

primecai

Mixture of Contexts for Long Video Generation

·
13 authors

Submitted by

shujian2025

TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning

·
10 authors

Submitted by

m43

Multi-View 3D Point Tracking

·
10 authors

Submitted by

hammh0a

Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection

·
4 authors

Submitted by

XionghuiWang

OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning

·
6 authors

Submitted by

taesiri

CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification

·
5 authors

Submitted by

Incomple

Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD

·
5 authors

Submitted by

fengyutong

ROSE: Remove Objects with Side Effects in Videos

·
10 authors

Submitted by

taesiri

FakeParts: a New Family of AI-Generated DeepFakes

·
7 authors

Submitted by

taesiri

Provable Benefits of In-Tool Learning for Large Language Models

·
4 authors

Submitted by

taesiri

Dress&Dance: Dress up and Dance as You Like It - Technical Preview

·
4 authors

Submitted by

Caoza

Collaborative Multi-Modal Coding for High-Quality 3D Generation

·
4 authors

Submitted by

taesiri

OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models

·
4 authors

Submitted by

HuBohy

Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice

·
5 authors