metadata
license: mit
easyGUI
easyGUI is a user-friendly voice conversion framework based on VITS. It is designed to eliminate timbre leakage by replacing input features with the closest matching features from the training set. Training is fast even on lower-end GPUs, and about 10 minutes of low-noise speech data is enough for good results. The framework provides a simple web interface, supports acceleration on AMD and Intel GPUs, and uses the RMVPE algorithm for pitch extraction.
Installation
Prerequisites
- Python 3.8 or higher
Installation Steps
Install PyTorch:
pip install torch torchvision torchaudio
For Nvidia Ampere (RTX30xx):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
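After either install command, it can help to confirm that PyTorch imports and whether it sees a CUDA device. The following is a minimal sketch, not part of the project: the helper name `describe_torch` is hypothetical, and it only uses `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch():
    """Return a small report on the local PyTorch install, if any."""
    # find_spec lets us check for the package without importing it.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "cuda_available": False}

    import torch  # imported lazily so the check also runs without PyTorch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch())
```

If `cuda_available` is `False` on an Nvidia machine, the installed wheel may not match the local CUDA driver.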
Install Dependencies:
- For Nvidia GPUs:
pip install -r requirements.txt
- For AMD/Intel GPUs:
pip install -r requirements-dml.txt
- For AMD ROCm (Linux):
pip install -r requirements-amd.txt
- For Intel IPEX (Linux):
pip install -r requirements-ipex.txt
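The choice of requirements file above can be sketched as a small lookup. The file names come from the list above; the backend labels (`nvidia`, `dml`, `rocm`, `ipex`) and the function name are hypothetical and chosen only for this illustration.

```python
# File names are taken from the install steps; the keys are
# hypothetical labels for this sketch, not project settings.
REQUIREMENTS_BY_BACKEND = {
    "nvidia": "requirements.txt",      # Nvidia GPUs
    "dml": "requirements-dml.txt",     # AMD/Intel GPUs
    "rocm": "requirements-amd.txt",    # AMD ROCm (Linux)
    "ipex": "requirements-ipex.txt",   # Intel IPEX (Linux)
}


def pip_install_command(backend):
    """Build the pip command for the given backend label."""
    if backend not in REQUIREMENTS_BY_BACKEND:
        known = ", ".join(sorted(REQUIREMENTS_BY_BACKEND))
        raise ValueError(f"unknown backend {backend!r}; expected one of: {known}")
    return ["pip", "install", "-r", REQUIREMENTS_BY_BACKEND[backend]]


if __name__ == "__main__":
    print(" ".join(pip_install_command("rocm")))
```

Returning the command as a list makes it safe to pass to `subprocess.run` without shell quoting.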
Install Optional Dependencies (if needed):
sh ./run.sh # For macOS
Additional Setup
- Download Assets:
Download the necessary models and files using the scripts in the tools directory.
- Install FFmpeg:
sudo apt install ffmpeg # Ubuntu/Debian
brew install ffmpeg # macOS
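To confirm that FFmpeg is reachable before starting the WebUI, a quick check along these lines may be useful. This is a sketch using only the standard library; the helper name `find_ffmpeg` is hypothetical.

```python
import shutil


def find_ffmpeg():
    """Return the full path to ffmpeg on PATH, or None if it is missing."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path:
        print(f"ffmpeg found at {path}")
    else:
        print("ffmpeg not found on PATH; install it first")
```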
Usage
Start the WebUI:
python demo.py
If using Poetry:
poetry run python demo.py
Features
- Top-1 retrieval that replaces input features with training-set features to prevent timbre leakage
- Fast training on less powerful GPUs
- Model merging to change timbre
- Advanced pitch extraction with RMVPE
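The top-1 retrieval idea in the list above can be illustrated with a brute-force nearest-neighbor search. This is only a sketch under stated assumptions: it treats features as short lists of floats and uses squared Euclidean distance, whereas the framework's actual feature extractor, distance metric, and index structure are not described here. The function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top1_replace(input_frames, training_frames):
    """Replace each input frame with its single nearest training frame."""
    if not training_frames:
        raise ValueError("training_frames must not be empty")
    replaced = []
    for frame in input_frames:
        # Top-1 retrieval: keep only the closest training-set feature.
        nearest = min(training_frames, key=lambda t: squared_distance(frame, t))
        replaced.append(list(nearest))
    return replaced


if __name__ == "__main__":
    training = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    inputs = [[0.9, 1.2], [4.0, 4.5], [0.1, -0.2]]
    print(top1_replace(inputs, training))
    # prints [[1.0, 1.0], [5.0, 5.0], [0.0, 0.0]]
```

Because every output frame is drawn from the training set, characteristics of the input speaker that are carried in the features do not pass through. A brute-force search like this costs time proportional to the size of the training set per frame, so a real system would normally use an approximate nearest-neighbor index instead.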