---
license: mit
---

# easyGUI

`easyGUI` is a user-friendly voice conversion framework based on VITS. It is designed to eliminate timbre leakage by replacing input features with those from the training set. It runs efficiently even on lower-end GPUs and needs only about 10 minutes of low-noise speech data for good results. The framework provides a simple web interface, supports acceleration on AMD and Intel GPUs, and uses the RMVPE algorithm for pitch extraction.

## Installation

### Prerequisites

- Python 3.8 or higher

### Installation Steps

1. **Install PyTorch**:

   ```bash
   pip install torch torchvision torchaudio
   ```

   For Nvidia Ampere (RTX30xx):

   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
   ```

2. **Install Dependencies**:

   - For Nvidia GPUs:

     ```bash
     pip install -r requirements.txt
     ```

   - For AMD/Intel GPUs:

     ```bash
     pip install -r requirements-dml.txt
     ```

   - For AMD ROCm (Linux):

     ```bash
     pip install -r requirements-amd.txt
     ```

   - For Intel IPEX (Linux):

     ```bash
     pip install -r requirements-ipex.txt
     ```

3. **Install Optional Dependencies** (if needed):

   ```bash
   sh ./run.sh  # For macOS
   ```

### Additional Setup

- **Download Assets**: Download the necessary models and files using the scripts in the `tools` directory.
- **Install FFmpeg**:

  ```bash
  sudo apt install ffmpeg  # Ubuntu/Debian
  brew install ffmpeg      # macOS
  ```

## Usage

Start the WebUI:

```bash
python demo.py
```

If using Poetry:

```bash
poetry run python demo.py
```

## Features

- Top1 retrieval to replace input features
- Fast training on less powerful GPUs
- Model merging to change timbre
- Advanced pitch extraction with RMVPE

---
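
The top-1 retrieval idea listed under Features can be illustrated with a minimal sketch: each input feature frame is swapped for its nearest neighbour among the training-set features. This is not the project's implementation. The function names (`squared_distance`, `top1_replace`), the use of squared Euclidean distance, and the brute-force search are all assumptions chosen for illustration; a real system would likely use an approximate nearest-neighbour index over much larger feature sets.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top1_replace(
    query_feats: Sequence[Vector], train_feats: Sequence[Vector]
) -> Tuple[List[Vector], List[int]]:
    """Replace every query frame with its closest training frame.

    Hypothetical helper: brute-force search, for illustration only.
    Returns the replaced frames and the indices of the chosen training frames.
    """
    if not train_feats:
        raise ValueError("train_feats must contain at least one frame")
    indices = [
        min(range(len(train_feats)), key=lambda i: squared_distance(q, train_feats[i]))
        for q in query_feats
    ]
    replaced = [train_feats[i] for i in indices]
    return replaced, indices


if __name__ == "__main__":
    # Toy 2-dimensional "features"; real feature frames are much larger.
    train = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    query = [[0.9, 1.2], [4.0, 6.0]]
    replaced, idx = top1_replace(query, train)
    print(idx)       # [1, 2]
    print(replaced)  # [[1.0, 1.0], [5.0, 5.0]]
```

Because the output frames are taken from the training set rather than from the input, the converted features carry the target speaker's characteristics, which is the intuition behind replacing input features to avoid timbre leakage.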