Notes on using Coqui TTS, an open-source speech synthesis library

Posted by 天地辽阔 on 2024-07-31

1 Introduction

Features: voice cloning and voice conversion, with multilingual support.

GitHub https://github.com/coqui-ai/TTS

Online demo (results are not as good as the local demo) https://huggingface.co/spaces/coqui/xtts

2 Setting up a local demo

Setting up the environment

conda create -n coqui python=3.10

conda activate coqui

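pip install TTS (this installs the required dependencies automatically; alternatively, install them one by one from requirements.txt)

If other packages turn out to be missing at runtime, install them with pip as well (there seemed to be only one).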
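
To spot missing packages before running the test script, a small check like the one below may help. It is only a sketch: missing_modules is a hypothetical helper, not part of Coqui TTS, and the module names passed to it are simply the two imports used by the test script.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "torch" and "TTS" are the imports used by the test script;
    # add any other module that an ImportError message names.
    for name in missing_modules(["torch", "TTS"]):
        print(f"missing module: {name}")
```

Note that the pip package name does not always match the module name, so the error message from the failing import is still the best guide to what to install.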

Downloading the source code and model

GitHub https://github.com/coqui-ai/TTS (commit dbf1a08)

Model files https://huggingface.co/coqui/XTTS-v2/tree/main
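
After downloading, it can be worth confirming that the files the test script points at are actually in place. The sketch below assumes the model directory should contain at least model.pth and config.json, the two files referenced in the script; missing_model_files is a hypothetical helper, and further file names can be passed through required if your setup needs them.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("model.pth", "config.json")):
    """Return the names in `required` that are not regular files inside model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Path taken from the test script in this post; adjust to your own layout.
    missing = missing_model_files("/home/ze/coqui/mypath/models")
    print("all files present" if not missing else f"missing: {missing}")
```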

Test script

import torch
from TTS.api import TTS

## List the available models
# for name in TTS().list_models().list_models():
#     print(name)

## Initialize TTS with the paths to the model and the config file
device = "cuda" if torch.cuda.is_available() else "cpu"  # Get device
tts = TTS(model_path="/home/ze/coqui/mypath/models/model.pth", 
          config_path="/home/ze/coqui/mypath/models/config.json", 
          progress_bar=True).to(device)

## Text to speech to a file
# ## English
# tts.tts_to_file(text="A short story is a piece of prose fiction. It can typically be read in a single sitting and focuses on a self-contained incident or series of linked incidents, with the intent of evoking a single effect or mood.", 
#                 speaker_wav="mypath/audio/samples_en_sample.wav", 
#                 language="en", 
#                 file_path="output.wav")
# ## Chinese
# tts.tts_to_file(text="龍能大能小,能升能隱;大則興雲吐霧,小則隱介藏形;升則飛騰於宇宙之間,隱則潛伏于波濤之內。方今春深,龍乘時變化,猶人得志而縱橫四海。", 
#                 speaker_wav="mypath/audio/samples_zh-cn-sample.wav", 
#                 language="zh-cn", 
#                 file_path="output.wav")
## Use a Chinese voice as the speaker reference and output English speech
tts.tts_to_file(text="A short story is a piece of prose fiction. It can typically be read in a single sitting and focuses on a self-contained incident or series of linked incidents, with the intent of evoking a single effect or mood.", 
                speaker_wav="mypath/audio/dragon.wav", 
                language="en", 
                file_path="output.wav")
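
Once the script has run, a quick sanity check on output.wav can confirm that something plausible was written. The sketch below uses only the standard library and assumes the output is an uncompressed PCM WAV file, which the wave module can read; wav_duration_seconds is a hypothetical helper, not part of Coqui TTS.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


if __name__ == "__main__":
    # "output.wav" is the file_path used in the test script above.
    print(f"output.wav lasts {wav_duration_seconds('output.wav'):.1f} s")
```

A duration close to zero, or an exception from wave.open, would suggest that synthesis did not complete as expected.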

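Problems encountered

Error: NotADirectoryError: [Errno 20] Not a directory: '/home/ze/coqui/mypath/models/model.pth/model.pth'

Cause: a bug in the library's interface. At line 192 of /home/ze/coqui/TTS-dev/TTS/utils/synthesizer.py, the model is loaded in a way that does not match the interface definition. Judging by the error message, the path to the checkpoint file ends up where a directory is expected, so model.pth is appended a second time.

Fix: at line 192 of /home/ze/coqui/TTS-dev/TTS/utils/synthesizer.py, in the call self.tts_model.load_checkpoint(), change the tts_checkpoint argument to the directory that contains the model, for example "/home/ze/coqui/mypath/models".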
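
The doubled path in the error message can be reproduced without the library. The sketch below is an illustration only: it assumes the loader joins the path it receives with the file name model.pth, which is what the error message suggests, and to_model_dir is a hypothetical helper rather than anything in Coqui TTS.

```python
import posixpath

CHECKPOINT_NAME = "model.pth"  # file name taken from the error message


def joined_checkpoint(checkpoint_dir):
    """Join a directory with the checkpoint file name (assumed loader behaviour)."""
    return posixpath.join(checkpoint_dir, CHECKPOINT_NAME)


def to_model_dir(path):
    """Hypothetical helper: map a path to model.pth onto its containing directory."""
    if posixpath.basename(path) == CHECKPOINT_NAME:
        return posixpath.dirname(path)
    return path


file_path = "/home/ze/coqui/mypath/models/model.pth"
print(joined_checkpoint(file_path))                # the doubled path from the error
print(joined_checkpoint(to_model_dir(file_path)))  # the intended checkpoint path
```

If that assumption holds, passing the model directory instead of the file as model_path in the test script might avoid the error without editing the library source. This was not tried in the setup described here, so treat it as an untested alternative.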
