---
license: apache-2.0
language:
- en
tags:
- mistral
- dpo
- biology
- education
---

This model is fine-tuned from zephyr-7b-beta, using FastChat for supervised fine-tuning (SFT) and TRL for direct preference optimization (DPO). The goal is a more capable educational chatbot that helps students learn biology.

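As a sketch of the DPO stage: TRL's `DPOTrainer` consumes preference pairs, where each record holds a prompt, a preferred ("chosen") response, and a dispreferred ("rejected") response. The example below shows that record format; the field names follow TRL's convention, but the biology question and tutor responses are illustrative placeholders, not data from this model's actual training set.

```python
# Sketch of the preference-pair record format used by TRL's DPOTrainer.
# The field names ("prompt", "chosen", "rejected") follow TRL's convention;
# the content below is illustrative, not the model's real training data.

def make_dpo_record(prompt: str, chosen: str, rejected: str) -> dict:
    """Bundle one preference pair for DPO training."""
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

record = make_dpo_record(
    prompt="Why do plant cells have both mitochondria and chloroplasts?",
    chosen=(
        "Good question! Chloroplasts capture light energy and store it as sugar, "
        "but the plant still needs mitochondria to release that energy as ATP, "
        "especially in the dark. What organelle do animal cells share with plants?"
    ),
    rejected="Plants just have both. Next question.",
)

# A DPO dataset is simply a collection of such records.
dataset = [record]
```

A pedagogically aligned "chosen" response guides the student rather than just stating the answer, which is the behavior the DPO stage is meant to reward.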
If you use this work, please cite *Pedagogical Alignment of Large Language Models* (https://arxiv.org/abs/2402.05000):

```bibtex
@misc{sonkar2024pedagogical,
      title={Pedagogical Alignment of Large Language Models},
      author={Shashank Sonkar and Kangqi Ni and Sapana Chaudhary and Richard G. Baraniuk},
      year={2024},
      eprint={2402.05000},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```