Improve Model Card: Correct pipeline tag, add library name and project page link
#1 opened by nielsr (HF staff)

README.md CHANGED
@@ -1,14 +1,16 @@
 ---
-license: apache-2.0
-datasets:
-- THUdyh/Oryx-SFT-Data
 base_model:
 - Qwen/Qwen2.5-7B-Instruct
-
+datasets:
+- THUdyh/Oryx-SFT-Data
 language:
 - en
 - zh
+license: apache-2.0
+pipeline_tag: video-text-to-text
+library_name: oryx
 ---
+
 # Oryx-1.5-7B
 
 ## Model Summary
@@ -18,6 +20,7 @@ The Oryx-1.5 models are 7/32B parameter models trained on [Oryx-SFT-Data](https:
 Oryx offers an on-demand solution to seamlessly and efficiently process visual inputs with arbitrary spatial sizes and temporal lengths.
 
 - **Repository:** https://github.com/Oryx-mllm/Oryx
+- **Project Page:** https://oryx-mllm.github.io
 - **Languages:** English, Chinese
 - **Paper:** https://arxiv.org/abs/2409.12961
 
@@ -25,7 +28,7 @@ Oryx offers an on-demand solution to seamlessly and efficiently process visual i
 
 We provide a simple generation process for using our model. For more details, please refer to our [Github Repo](https://github.com/liuzuyan/oryx)
 
-```
+```python
 from oryx.model.builder import load_pretrained_model
 from oryx.mm_utils import get_model_name_from_path, process_images, tokenizer_image_token
 from oryx.constants import IMAGE_TOKEN_INDEX, DEFAULT_IMAGE_TOKEN, DEFAULT_IM_START_TOKEN, DEFAULT_IM_END_TOKEN, IGNORE_INDEX
@@ -125,4 +128,13 @@ print(text_outputs)
 - **Orchestration:** HuggingFace Trainer
 - **Code:** Pytorch
 
-## Citation
+## Citation
+
+```bibtex
+@article{liu2024oryx,
+  title={Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution},
+  author={Liu, Zuyan and Dong, Yuhao and Liu, Ziwei and Hu, Winston and Lu, Jiwen and Rao, Yongming},
+  journal={arXiv preprint arXiv:2409.12961},
+  year={2024}
+}
+```
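Note on the tag fix: `pipeline_tag: video-text-to-text` is what surfaces the model under the Hub's task filter, and `library_name` controls which code snippet the Hub offers for it. A quick way to verify the tag once merged is a check like the sketch below, assuming a recent `huggingface_hub` release where `list_models` accepts the `pipeline_tag` keyword:

```python
from huggingface_hub import HfApi

api = HfApi()
# List a few models carrying the corrected task tag; once this PR is
# merged, this model should be discoverable under the same filter.
for m in api.list_models(pipeline_tag="video-text-to-text", limit=5):
    print(m.id)
```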
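For context on the ` ```python ` fence fix: the third hunk shows only the snippet's imports, since the rest of the block is unchanged and therefore elided by the diff. The sketch below illustrates how a LLaVA-style loader of this shape is typically driven, ending in the `print(text_outputs)` visible in the last hunk's context. The call signatures, repo id, and frame path are assumptions inferred from the import names, not the card's verified code; the authoritative example lives in the README and the GitHub repo.

```python
import torch
from PIL import Image

from oryx.model.builder import load_pretrained_model
from oryx.mm_utils import get_model_name_from_path, process_images, tokenizer_image_token
from oryx.constants import IMAGE_TOKEN_INDEX, DEFAULT_IMAGE_TOKEN

# Hypothetical repo id and frame path, for illustration only.
model_path = "THUdyh/Oryx-1.5-7B"
model_name = get_model_name_from_path(model_path)

# LLaVA-style builders conventionally return
# (tokenizer, model, image_processor, context_len); assumed, not verified.
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, None, model_name, device_map="auto"
)

# One or more video frames as PIL images.
frames = [Image.open("frame_000.png")]
image_tensor = process_images(frames, image_processor, model.config).to(
    model.device, dtype=torch.float16
)

# Splice the image placeholder into the prompt and tokenize around it.
prompt = DEFAULT_IMAGE_TOKEN + "\nDescribe this video."
input_ids = tokenizer_image_token(
    prompt, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt"
).unsqueeze(0).to(model.device)

with torch.inference_mode():
    output_ids = model.generate(input_ids, images=image_tensor, max_new_tokens=256)

text_outputs = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
print(text_outputs)
```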