metadata

title: README
emoji: 🐠
colorFrom: purple
colorTo: purple
sdk: static
pinned: false
license: cc-by-4.0

🏢 Welcome to Itbanque

Itbanque is dedicated to providing both high-quality fine-tuned models and structured datasets for AI, machine learning, and data-driven applications across various domains.

🧠 Our Models

Itbanque fine-tunes open-source foundation models for domain-specific tasks, with a current focus on speech translation and transcription. We specialize in Whisper-based models adapted for accurate subtitle generation, especially for Japanese → Chinese translation.

Whisper-base-ja2zh

A Whisper base model, fully fine-tuned to translate Japanese speech into Chinese text.

BLEU Score on Test Set: 0.72
Dataset: ScreenTalk-JA2ZH

📊 Our Datasets

We offer datasets with structured, high-quality, and continuously updated data, making them ideal for training AI models.

🔹 ScreenTalk

A large-scale transcribed/translated speech dataset sourced from screen content, suitable for ASR and NLP tasks.

XS Size – Limited sample dataset.
Full Size – Full access + real-time updates.

👉 Explore ScreenTalk Dataset

🚀 Why Choose Itbanque?

✅ High-quality, structured datasets for AI training.
✅ Regular updates to ensure fresh, relevant data.
✅ Multiple dataset sizes to fit different needs, from the XS sample to the Full version.

💡 Support Our Work

We are committed to providing high-quality datasets for AI research and development. Your support enables us to continue expanding and refining our datasets for better AI applications across multiple industries.

🔗 Donate & Support

📬 Get in Touch

If you have any questions, need a custom dataset, or require enterprise licensing, feel free to reach out:

📧 Contact: fj11