Dual RL: Unification and New Methods for Reinforcement and Imitation Learning Paper • 2302.08560 • Published Feb 16, 2023
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces Paper • 2410.09918 • Published Oct 13, 2024
Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback Paper • 2410.23022 • Published Oct 30, 2024
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning Paper • 2502.03275 • Published 9 days ago
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories Paper • 2210.06518 • Published Oct 12, 2022