ApatheticWithoutTheA/Hermes-3-Llama-3.2-3B-Reasoning

Trained with Unsloth

license: mit
tags:
  - unsloth
  - trl
  - grpo