metadata
license: mit

easyGUI

easyGUI is a user-friendly voice conversion framework based on VITS. It is designed to prevent timbre leakage by using top-1 retrieval to replace input source features with features from the training set. Training is fast even on lower-end GPUs, and about 10 minutes of low-noise speech data is enough for good results. The framework provides a simple web interface, supports acceleration on AMD and Intel GPUs, and uses the RMVPE algorithm for pitch extraction.
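The retrieval idea can be illustrated with a minimal sketch. This is not easyGUI's actual code: the function name `top1_retrieve`, the `index_rate` blending weight, and the toy two-dimensional features are all assumptions made for illustration, and a real system would likely use a dedicated nearest-neighbour index rather than a brute-force search.

```python
import math

def top1_retrieve(frames, train_feats, index_rate=1.0):
    """Replace each input frame with its nearest training-set feature (top-1).

    index_rate is a hypothetical blending weight: 1.0 uses the retrieved
    feature outright, 0.0 leaves the input frame unchanged.
    """
    out = []
    for frame in frames:
        # Brute-force nearest neighbour by Euclidean distance.
        nearest = min(train_feats, key=lambda t: math.dist(frame, t))
        out.append([
            index_rate * n + (1.0 - index_rate) * f
            for n, f in zip(nearest, frame)
        ])
    return out

# Toy data: three training-set features and two input frames.
train_feats = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
frames = [[0.9, 1.2], [3.5, 4.1]]
converted = top1_retrieve(frames, train_feats)
print(converted)  # [[1.0, 1.0], [4.0, 4.0]]
```

Because every output frame is built from training-set features, characteristics of the source speaker carried in the input features have less opportunity to reach the output.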

Installation

Prerequisites

  • Python 3.8 or higher
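To confirm that the active interpreter meets this requirement before installing anything, a small check along these lines may help. The helper name `python_is_supported` is illustrative and not part of easyGUI.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites

def python_is_supported(version_info=sys.version_info):
    """Return True when the (major, minor) version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON

if python_is_supported():
    print("Python version OK")
else:
    found = sys.version.split()[0]
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
```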

Installation Steps

  1. Install PyTorch:

    pip install torch torchvision torchaudio
    

    For Nvidia Ampere GPUs (RTX 30xx), install the CUDA 11.7 build:

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
    
  2. Install Dependencies:

    • For Nvidia GPUs:
      pip install -r requirements.txt
      
    • For AMD/Intel GPUs (DirectML):
      pip install -r requirements-dml.txt
      
    • For AMD ROCm (Linux):
      pip install -r requirements-amd.txt
      
    • For Intel IPEX (Linux):
      pip install -r requirements-ipex.txt
      
  3. Install Optional Dependencies (if needed):

    sh ./run.sh  # For macOS
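After the installation steps, it can be useful to check which PyTorch build was installed and whether it sees a GPU. The sketch below uses only standard PyTorch calls; the helper name `describe_torch_backend` is illustrative. Note that `torch.cuda.is_available()` reports CUDA and ROCm devices only, so a DirectML or IPEX setup would show up as CPU-only here.

```python
import importlib.util

def describe_torch_backend():
    """Return a one-line summary of the installed PyTorch build, if any."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch
    if torch.cuda.is_available():
        # ROCm builds of PyTorch also report their devices through torch.cuda.
        name = torch.cuda.get_device_name(0)
        return f"PyTorch {torch.__version__} with GPU: {name}"
    return f"PyTorch {torch.__version__} (no CUDA/ROCm device visible)"

summary = describe_torch_backend()
print(summary)
```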
    

Additional Setup

  • Download Assets: Download necessary models and files using the scripts in the tools directory.
  • Install FFmpeg:
    sudo apt install ffmpeg  # Ubuntu/Debian
    brew install ffmpeg      # macOS
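To verify that FFmpeg is reachable on the `PATH` after installation, something like the following could be used. The helper name `ffmpeg_version` is an assumption for illustration; it relies only on the Python standard library and the `ffmpeg -version` command.

```python
import shutil
import subprocess

def ffmpeg_version():
    """Return the first line of `ffmpeg -version`, or None if FFmpeg is missing."""
    path = shutil.which("ffmpeg")
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    return result.stdout.splitlines()[0] if result.stdout else None

version = ffmpeg_version()
print(version or "FFmpeg not found on PATH")
```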
    

Usage

Start the WebUI:

python demo.py

If using Poetry:

poetry run python demo.py

Features

  • Top-1 retrieval to replace input features
  • Fast training on less powerful GPUs
  • Model merging to change timbre
  • Advanced pitch extraction with RMVPE
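This README does not say how model merging is implemented. One common approach, shown here purely as an assumed illustration, is a weighted average of two models' parameters. The function `merge_weights`, the `alpha` weight, and the toy parameter dictionaries are hypothetical; real checkpoints would hold tensors rather than Python lists.

```python
def merge_weights(weights_a, weights_b, alpha=0.5):
    """Linearly interpolate two parameter sets: alpha * A + (1 - alpha) * B."""
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must have the same parameter names")
    merged = {}
    for name in weights_a:
        a, b = weights_a[name], weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
    return merged

# Toy "models" with matching parameter names and shapes.
voice_a = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
voice_b = {"layer.weight": [4.0, 2.0], "layer.bias": [3.0]}
merged = merge_weights(voice_a, voice_b, alpha=0.25)
print(merged)  # {'layer.weight': [3.0, 2.0], 'layer.bias': [2.5]}
```

Under this scheme, an `alpha` closer to 1.0 pulls the merged timbre toward the first model and an `alpha` closer to 0.0 pulls it toward the second.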