jvelja/ppo-ppo-gpt2-imdb-epoch-123123-epoch-123123 • Reinforcement Learning • Updated Aug 12, 2024 • 12
jvelja/ppo-ppo-ppo-gpt2-imdb-epoch-123123-epoch-123123-epoch-123123123 • Reinforcement Learning • Updated Aug 12, 2024 • 9
jvelja/ppo-self.llama-3-8b-Instruct_fullyUnseeded_MULTIBIT_0 • Reinforcement Learning • Updated Aug 21, 2024