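1 Introduction
Features: voice cloning and voice conversion, with multilingual support.
GitHub https://github.com/coqui-ai/TTS
Online demo (the results are not as good as the local demo) https://huggingface.co/spaces/coqui/xtts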
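2 Building a local demo
Setting up the environment
conda create -n coqui python=3.10
conda activate coqui
pip install TTS (this installs the required dependencies automatically; alternatively, install them one by one from requirements.txt)
If other packages turn out to be missing at run time, just pip install them (there seemed to be only one).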
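To see which packages are missing before the first run, a small check like the one below may help. It is a minimal sketch that uses only the standard library. The helper name installed_version is made up for illustration, and the two distribution names are just the ones this post relies on (TTS from the pip command above, torch from the test script).

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names taken from this post; extend the tuple as needed.
    for dist in ("TTS", "torch"):
        version = installed_version(dist)
        if version is None:
            print(f"{dist}: not installed, try: pip install {dist}")
        else:
            print(f"{dist}: {version}")
```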
Downloading the source code and the model
GitHub https://github.com/coqui-ai/TTS, version (commit) dbf1a08
Model files https://huggingface.co/coqui/XTTS-v2/tree/main
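Before running anything, it may be worth checking that the model download is complete. The helper below is only a sketch: the name missing_model_files is hypothetical, model.pth and config.json are the files the test script points at, and vocab.json is an assumption about what the XTTS-v2 loader also expects. Adjust the list to whatever your version actually needs.

```python
from pathlib import Path

# model.pth and config.json are used in this post; vocab.json is an assumption.
REQUIRED_FILES = ("model.pth", "config.json", "vocab.json")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("/home/ze/coqui/mypath/models")
    print("all files present" if not missing else f"missing: {missing}")
```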
Test script
import torch
from TTS.api import TTS

## List the available models
# for name in TTS().list_models().list_models():
#     print(name)

## Init TTS: pass in the paths of the model and the config file
device = "cuda" if torch.cuda.is_available() else "cpu"  # Get device
tts = TTS(model_path="/home/ze/coqui/mypath/models/model.pth",
          config_path="/home/ze/coqui/mypath/models/config.json",
          progress_bar=True).to(device)

## Text to speech to a file
# ## English
# tts.tts_to_file(text="A short story is a piece of prose fiction. It can typically be read in a single sitting and focuses on a self-contained incident or series of linked incidents, with the intent of evoking a single effect or mood.",
#                 speaker_wav="mypath/audio/samples_en_sample.wav",
#                 language="en",
#                 file_path="output.wav")
# ## Chinese
# tts.tts_to_file(text="龍能大能小,能升能隱;大則興雲吐霧,小則隱介藏形;升則飛騰於宇宙之間,隱則潛伏于波濤之內。方今春深,龍乘時變化,猶人得志而縱橫四海。",
#                 speaker_wav="mypath/audio/samples_zh-cn-sample.wav",
#                 language="zh-cn",
#                 file_path="output.wav")

## Use a Chinese reference voice, output English speech
tts.tts_to_file(text="A short story is a piece of prose fiction. It can typically be read in a single sitting and focuses on a self-contained incident or series of linked incidents, with the intent of evoking a single effect or mood.",
                speaker_wav="mypath/audio/dragon.wav",
                language="en",
                file_path="output.wav")
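Instead of commenting calls in and out, the three variants can be kept in a small job list and run in one go. The following is a sketch, not part of the library: run_jobs and JOBS are hypothetical names, the texts are shortened, and the output file names are made up so the runs do not overwrite each other. It only assumes that tts.tts_to_file accepts the keyword arguments already used in the script.

```python
# Each job mirrors one tts_to_file call from the test script.
JOBS = [
    {"text": "A short story is a piece of prose fiction.",
     "speaker_wav": "mypath/audio/samples_en_sample.wav",
     "language": "en",
     "file_path": "output_en.wav"},       # hypothetical output name
    {"text": "龍能大能小,能升能隱。",
     "speaker_wav": "mypath/audio/samples_zh-cn-sample.wav",
     "language": "zh-cn",
     "file_path": "output_zh.wav"},       # hypothetical output name
    {"text": "A short story is a piece of prose fiction.",
     "speaker_wav": "mypath/audio/dragon.wav",
     "language": "en",
     "file_path": "output_cross.wav"},    # Chinese voice, English speech
]


def run_jobs(tts, jobs=JOBS):
    """Call tts.tts_to_file once per job and return the output paths in order."""
    outputs = []
    for job in jobs:
        tts.tts_to_file(**job)
        outputs.append(job["file_path"])
    return outputs
```

With the tts object created in the script, run_jobs(tts) would write one file per job.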
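Problems encountered
Error: NotADirectoryError: [Errno 20] Not a directory: '/home/ze/coqui/mypath/models/model.pth/model.pth'
Cause: a bug in the code interface. When loading the model, line 192 of /home/ze/coqui/TTS-dev/TTS/utils/synthesizer.py does not call the loader according to its interface definition.
Fix: at line 192 of /home/ze/coqui/TTS-dev/TTS/utils/synthesizer.py, in the call self.tts_model.load_checkpoint(), replace the tts_checkpoint argument with the directory that contains the model, for example "/home/ze/coqui/mypath/models".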
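The doubled model.pth in the error path suggests that the loader treats the value it receives as a directory and appends the file name itself. The sketch below reproduces that path arithmetic and shows a hypothetical helper, checkpoint_dir_for, that derives the directory to pass instead. The join step is an assumption inferred from the error message, not a copy of the library code.

```python
from pathlib import PurePosixPath


def joined_checkpoint_path(checkpoint_dir):
    """Mimic the assumed loader behaviour: append 'model.pth' to the given path."""
    return str(PurePosixPath(checkpoint_dir) / "model.pth")


def checkpoint_dir_for(model_path):
    """Return the containing directory if model_path names a .pth file."""
    path = PurePosixPath(model_path)
    return str(path.parent if path.suffix == ".pth" else path)


if __name__ == "__main__":
    model_path = "/home/ze/coqui/mypath/models/model.pth"
    # Passing the file itself yields the path seen in the error message.
    print(joined_checkpoint_path(model_path))
    # Passing the containing directory yields the intended file.
    print(joined_checkpoint_path(checkpoint_dir_for(model_path)))
```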