---
title: Mandarin Tone Evaluation
emoji: 📉
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 4.7.1
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Team 3 Project - Tone Evaluation

## Overview

Welcome to Team 3's Tone Evaluation project! This repository contains the files and resources for our project, which covers data processing, training, testing, and a user interface (UI) demo.

## Project Structure

- **Data Processing File**: [dataset.py](https://huggingface.co/spaces/CS5647Team3/Mandarin_Tone_Evaluation/blob/main/dataset.py)
  - This script processes the raw data and prepares it for training and testing.
  - It takes input audio in WAV format and converts it into a mel spectrogram and a fundamental frequency (F0) contour. These are the two main features the model analyzes.
  - It converts pinyin and tone into numerical labels using a text file that maps each pinyin to an index.
- **Train File**: [train.py](https://huggingface.co/spaces/CS5647Team3/Mandarin_Tone_Evaluation/blob/main/train.py)
  - This file contains the code for training our tone evaluation model. We use a CNN+CTC model for this task.
- **Test File**: [test.py](https://huggingface.co/spaces/CS5647Team3/Mandarin_Tone_Evaluation/blob/main/test.py)
  - Use this script to evaluate the performance of our trained model on test data.
  - Currently, the model only accepts WAV audio. After loading the audio, the model predicts the tone sequence for the sentence.
- **UI Demo**: [app.py](https://huggingface.co/spaces/CS5647Team3/Mandarin_Tone_Evaluation/blob/main/app.py)
  - Run this file to launch the UI demo and interact with the tone evaluation model.
  - You can upload WAV audio to the UI and see the evaluation result. We also provide some audio files that you can use directly.

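The label encoding described for `dataset.py` and the tone-sequence prediction described for `test.py` can be illustrated with a minimal sketch. It is not taken from the repository: the helper names (`build_vocab`, `encode`, `ctc_greedy_decode`) are hypothetical, the one-token-per-line layout of the pinyin list is an assumption, and so are the use of index 0 as the CTC blank and greedy (best-path) decoding. The actual scripts may do this differently.

```python
# Hypothetical helpers for illustration only; not copied from dataset.py or test.py.

BLANK = 0  # assumed CTC blank index


def build_vocab(lines):
    """Map each distinct pinyin token to an integer index, starting at 1.

    Assumes one token per line (e.g. "ma1"); index 0 stays reserved for the blank.
    """
    vocab = {}
    for line in lines:
        token = line.strip()
        if token and token not in vocab:
            vocab[token] = len(vocab) + 1
    return vocab


def encode(tokens, vocab):
    """Turn a sequence of pinyin tokens into numerical labels."""
    return [vocab[t] for t in tokens]


def ctc_greedy_decode(frame_ids, index_to_token):
    """Collapse per-frame predictions: merge repeats, then drop blanks."""
    decoded = []
    previous = BLANK
    for idx in frame_ids:
        if idx != previous and idx != BLANK:
            decoded.append(index_to_token[idx])
        previous = idx
    return decoded


# Toy vocabulary standing in for the contents of the pinyin text file.
vocab = build_vocab(["ni3", "hao3", "ma1", "ni3"])
index_to_token = {index: token for token, index in vocab.items()}

labels = encode(["ni3", "hao3"], vocab)

# Made-up per-frame argmax output of a CTC model (0 = blank).
frames = [0, 1, 1, 0, 0, 2, 2, 2, 0]
prediction = ctc_greedy_decode(frames, index_to_token)

print(labels)
print(prediction)
```

In this sketch, a blank between two identical indices keeps them as two separate tokens, which is how CTC represents a repeated syllable.
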
## Dataset

We provide two versions of the dataset:

- **Full Size Version**: Download from [full_dataset](https://huggingface.co/datasets/CS5647Team3/full_dataset)
- **Small Size Zip Version**: Zip file, download from [data_mini](https://huggingface.co/datasets/CS5647Team3/data_mini)

Additionally, we offer a text file for pinyin encoding: [pinyin.txt](https://huggingface.co/spaces/CS5647Team3/Mandarin_Tone_Evaluation/blob/main/pinyin.txt). This file defines the pinyin-to-index mapping used for the labels in our dataset.

## Getting Started

To view the UI demo directly, go to our Space and click the "App" tab at the top right. Otherwise, follow these steps to get started with our project:

1. Clone this repository to your local machine.
2. Run the data processing script: `python dataset.py`
3. Train the model using: `python train.py`
4. Evaluate the model with: `python test.py`
5. Explore the UI demo: `python app.py`

## Additional Information

- If you encounter any issues or have questions, feel free to reach out to our team by email.
  - Dataset and preprocessing
    - Shen Siyan shen_siyan@u.nus.edu
    - Ouyang Yanjia e0954791@u.nus.edu
  - Model Training
    - Zhao Zhengkai zhaozhengkai@u.nus.edu
    - Liu Mingxuan e0917087@u.nus.edu

We hope you find our project useful and insightful! Happy coding!