---
license: apache-2.0
---
|
# RWKV-x070-2B9-CJE-Instruct Model Card |
|
|
|
## Model Overview |
|
- **Model Name**: RWKV-x070-2B9-CJE-Instruct |
|
- **Description**: An instruction-tuned model specialized for Japanese, Chinese, and English
|
- **Base Model**: rwkv-x070-2b9-world-v3-40%trained-20250113-ctx4k.pth |
|
- **Architecture**: RWKV x070 "Goose" |
|
- **Parameters**: 2.9B (a rough sanity check follows this list)
|
- **Model Dimension**: 2560 |
|
- **Number of Layers**: 32 |
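
The parameter count can be roughly cross-checked against the dimension and layer figures above. The sketch below assumes details this card does not state: a 65536-entry World-tokenizer vocabulary, untied input/output embeddings, and the 4× channel-mix hidden size used in RWKV-7, with the small low-rank and LayerNorm terms ignored, so it is a floor estimate rather than an exact count:

```
# Rough parameter-count estimate for the spec above (sketch, not exact).
# Assumed, not stated in this card: vocab = 65536 (RWKV World tokenizer),
# untied input/output embeddings, RWKV-7 channel-mix hidden size of 4*d.
# Low-rank and LayerNorm parameters are ignored, so this is a lower bound.
d, layers, vocab = 2560, 32, 65536
emb = 2 * vocab * d            # token embedding + output head
att = 4 * d * d                # receptance/key/value/output projections
ffn = 2 * (4 * d) * d          # two 4x-wide channel-mix projections
total = emb + layers * (att + ffn)
print(f"{total / 1e9:.2f}B")   # -> 2.85B, consistent with the stated 2.9B
```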
|
|
|
## Fine-tuning Details |
|
|
|
### Training Configuration |
|
- **Trainer**: RWKV-LM-RLHF (https://github.com/OpenMOSE/RWKV-LM-RLHF) |
|
- **PEFT Mode**: Hybrid training combining frozen embeddings, Bone (Block Affine Transformation), and full-parameter training
|
- **SFT Method**: SmoothingLoss SFT |
|
- **Context Window**: 5120 (trained with a 1024-token overlap; see the chunking sketch after this list)
|
- **Compute**: 2× AMD Instinct MI100 GPUs for 60 hours (100% solar energy)
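
The context figures above suggest that long samples are split into overlapping windows. Below is a minimal sketch of such chunking under that interpretation; it is not the RWKV-LM-RLHF trainer's actual code:

```
# Minimal sketch of overlapped chunking, assuming the 5120/1024 figures
# describe a sliding window; NOT the RWKV-LM-RLHF implementation.
def overlapping_chunks(tokens, window=5120, overlap=1024):
    step = window - overlap  # 4096 new tokens per chunk
    return [tokens[i:i + window]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

chunks = overlapping_chunks(list(range(12000)))
print([len(c) for c in chunks])  # [5120, 5120, 3808]
```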
|
|
|
### Dataset Specifications |
|
- **Size**: 800k pairs |
|
- **Content**: |
|
- Mixed data in Japanese, Chinese, and English |
|
- Conversations |
|
- Programming code |
|
- Translation tasks |
|
- Chain-of-Thought reasoning tasks |
|
|
|
### How to Use

- Install the latest RWKV-Infer (Linux, WSL): https://github.com/OpenMOSE/RWKV-Infer

- Create a `models` folder

- Move `rwkv-x070-2b9-cje-instruct-1.pth` into the `models` folder

- Load the model via the server's `/loadmodel` endpoint:
|
```
curl http://127.0.0.1:9000/loadmodel -X POST -H "Content-Type: application/json" -d '{"model_filename":"models/rwkv-x070-2b9-cje-instruct-1.pth","model_viewname":"RWKV x070 2B9 CJE Instruct-1","model_strategy":"fp16","endtoken":"\\n\\n\\x17"}'
```
|
- Enjoy the OpenAI-compatible API at http://127.0.0.1:9000/v1 :)
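
Once the model is loaded, any standard OpenAI-style client can talk to the server. Here is a minimal sketch using the `openai` Python package, assuming the server matches the `model` field against the `model_viewname` registered above and accepts a placeholder API key:

```
# Minimal sketch: chat with the local RWKV-Infer server through its
# OpenAI-compatible endpoint. Assumptions not confirmed by this card:
# the "model" field matches the model_viewname set via /loadmodel, and
# a placeholder API key is accepted.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:9000/v1", api_key="sk-dummy")
resp = client.chat.completions.create(
    model="RWKV x070 2B9 CJE Instruct-1",
    messages=[{"role": "user", "content": "Who are you?"}],
)
print(resp.choices[0].message.content)
```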
|
|
|
### Important Note |
|
- Set the end token to `\n\n\x17`. Each user and assistant turn is terminated with it, as in the example below:
|
```
User: who are you?\n\n\x17
Assistant: gooday i'm rwkv\n\n\x17
```
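
For raw completion-style prompting, that terminator has to be appended to each turn by hand. A purely illustrative Python sketch of the formatting shown above:

```
# Illustrative only: building a raw prompt in the turn format above,
# terminating each turn with the '\n\n\x17' end token.
END = "\n\n\x17"

def format_turn(role, text):
    return f"{role}: {text}{END}"

prompt = format_turn("User", "who are you?") + "Assistant:"
print(repr(prompt))
# 'User: who are you?\n\n\x17Assistant:'
```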
|
|
|
### Limitations and Considerations |
|
- This is an experimental model; inference stability is not fully guaranteed |
|
- Unexpected behaviors may occur |
|
- Continuous improvements are being made; feedback is welcome |
|
|
|
## License |
|
Apache License 2.0 |
|
|
|
## Acknowledgments |
|
We express our gratitude to the authors of the RWKV base model and to the RWKV community for their support in developing this model.