---
title: README
emoji: 🐠
colorFrom: purple
colorTo: purple
sdk: static
pinned: false
license: cc-by-4.0
---

# 🏢 Welcome to Itbanque

**Itbanque** provides high-quality fine-tuned models and structured datasets for AI, machine learning, and data-driven applications across various domains.

---

## 🧠 **Our Models**

Itbanque fine-tunes open-source foundation models for domain-specific tasks, with a current focus on speech translation and transcription. We specialize in Whisper-based models adapted for accurate subtitle generation, especially for Japanese → Chinese translation.

### **Whisper-base-ja2zh**

A Whisper base model, fully fine-tuned to translate Japanese speech into Chinese text.

- **BLEU Score** on Test Set: 0.72
- **Dataset**: ScreenTalk-JA2ZH

---

## 📊 **Our Datasets**

We offer **structured, high-quality, and continuously updated** datasets, making them well suited to training AI models.

### 🔹 **ScreenTalk**

A large-scale transcribed/translated speech dataset sourced from screen content, suitable for ASR and NLP tasks.

- **XS Size** – Limited sample dataset.
- **Full Size** – Full access + real-time updates.

👉 [Explore ScreenTalk Dataset](https://huggingface.co/datasets/DataLabX/ScreenTalk-XS)

---

## 🚀 **Why Choose DataLabX?**

✅ **High-quality, structured datasets** for AI training.

✅ **Regular updates** to ensure fresh, relevant data.

✅ **Different dataset sizes** to fit various user needs, from XS to the full version.

---

## 💡 **Support Our Work**

We are committed to providing high-quality datasets for AI research and development. Your support enables us to continue expanding and refining our datasets for better AI applications across multiple industries.

🔗 Donate & Support

---

## 📬 **Get in Touch**

If you have any questions, need a custom dataset, or require enterprise licensing, feel free to reach out:

📧 **Contact:** [itbanque](mailto:contact@itbanque.com)
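
---

A note on the BLEU score reported for Whisper-base-ja2zh above: the value is on a 0 to 1 scale, and this README does not say how it was computed. The sketch below shows one common way to get a corpus-level BLEU on that scale. It assumes character-level tokenization for Chinese and a single reference per sentence. The function name `corpus_bleu` and the sample sentences are illustrative only and are not Itbanque's evaluation code.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-1 scale, one reference per hypothesis.

    Assumption: Chinese text is split into single characters, since the
    README does not specify a tokenizer.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens = list(hyp.replace(" ", ""))
        ref_tokens = list(ref.replace(" ", ""))
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Clipped matches: each n-gram counts at most as often as in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing in this sketch
    # Geometric mean of the 1..max_n gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty for hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Made-up example sentences, not taken from ScreenTalk.
    print(corpus_bleu(["今天天气很好"], ["今天天气不错"]))
```

Established tools apply their own tokenization and smoothing, so scores from this sketch are not directly comparable with the figure reported above.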