OpenMOSE's picture
Update README.md
41d5365 verified
---
license: apache-2.0
---
# RWKV-x070-2B9-CJE-Instruct Model Card
## Model Overview
- **Model Name**: RWKV-x070-2B9-CJE-Instruct
- **Description**: An instruction-tuned model specialized for Japanese, Chinese, and English languages
- **Base Model**: rwkv-x070-2b9-world-v3-40%trained-20250113-ctx4k.pth
- **Architecture**: RWKV x070 "Goose"
- **Parameters**: 2.9B
- **Model Dimension**: 2560
- **Number of Layers**: 32
## Fine-tuning Details
### Training Configuration
- **Trainer**: RWKV-LM-RLHF (https://github.com/OpenMOSE/RWKV-LM-RLHF)
- **PEFT Mode**: Hybrid Training combining frozen embeddings and Bone (Block Affine Transformation) + full parameter training
- **SFT Method**: SmoothingLoss SFT
- **Context Window**: 5120 (trained with 1024 token overlap)
- **Compute Power**: AMD Instinct MI100 x 2 60hrs (100% solar energy)
### Dataset Specifications
- **Size**: 800k pairs
- **Content**:
- Mixed data in Japanese, Chinese, and English
- Conversations
- Programming code
- Translation tasks
- Chain-of-Thought reasoning tasks
### How to use
- Install latest RWKV-Infer (Linux,WSL) (https://github.com/OpenMOSE/RWKV-Infer)
- make folder 'models'
- move rwkv-x070-2b9-cje-instruct-1.pth to models folder
```
curl http://127.0.0.1:9000/loadmodel -X POST -H "Content-Type: application/json" -d '{"model_filename":"models/rwkv-x070-2b9-cje-instruct-1.pth","model_viewname":"RWKV x070 2B9 CJE Instruct-1","model_strategy":"fp16","endtoken":"\\n\\n\\x17"}'
```
- Enjoy with openai compatible api http://127.0.0.1:9000/v1 :)
### Important Note
- Set the end token as '\n\n\x17'
```
User: who are you?\n\n\x17
Assistant: gooday i'm rwkv\n\n\x17
```
### Limitations and Considerations
- This is an experimental model; inference stability is not fully guaranteed
- Unexpected behaviors may occur
- Continuous improvements are being made; feedback is welcome
## License
Apache License 2.0
## Acknowledgments
We express our gratitude to the RWKV base model and the RWKV community for their support in developing this model.