ComfyUI-FunAudioLLM

ComfyUI custom nodes for FunAudioLLM, including CosyVoice and SenseVoice.

Features

CosyVoice

CosyVoice Version: 2024-10-04
Supports SFT, zero-shot, cross-lingual, and instruct modes
Supports CosyVoice-300M-25Hz in zero-shot and cross-lingual modes
Supports the 25Hz SFT model (unofficial)
<details> <summary>Save and load speaker model in zero-shot</summary> <img src="./assets/SaveSpeakerModel.png" alt="zh-CN" /> <br> <img src="./assets/LoadSpeakerModel.png" alt="zh-CN" /> </details>

SenseVoice

SenseVoice Version: 2024-10-04
Supports SenseVoice-Small
<details> <summary>Supports punctuation-based segmentation (requires turning off use_fast_mode)</summary> <img src="./assets/SenseVoice.png" alt="zh-CN" /> <br> <img src="./assets/PuncSegment.png" alt="zh-CN" /> </details>

How to use

apt update
apt install ffmpeg

## Run the following from ComfyUI/custom_nodes
git clone https://github.com/SpenserCai/ComfyUI-FunAudioLLM
cd ComfyUI-FunAudioLLM
pip install -r requirements.txt
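
Because the steps above install ffmpeg as a system package, it can help to confirm that the executable is actually visible on PATH before starting ComfyUI. The following is a minimal sketch using only the Python standard library; the helper name `find_ffmpeg` is hypothetical and is not part of this repository.

```python
import shutil


def find_ffmpeg():
    """Return the full path of the ffmpeg executable on PATH, or None if it is missing."""
    # shutil.which searches the directories listed in PATH, like the shell does.
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH; install it before using the audio nodes")
    else:
        print(f"ffmpeg found at {path}")
```

Run it with the same Python environment that ComfyUI uses, so that the PATH it sees matches the one the nodes will see.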

Windows

On Windows, pynini must be installed with conda before installing the other requirements:

conda install -c conda-forge pynini=2.1.6
pip install -r requirements.txt

If your network is unstable, you can pre-download the models from the following sources and place each one in the corresponding directory.

CosyVoice-300M -> ComfyUI/models/CosyVoice/CosyVoice-300M
CosyVoice-300M-25Hz -> ComfyUI/models/CosyVoice/CosyVoice-300M-25Hz
CosyVoice-300M-SFT -> ComfyUI/models/CosyVoice/CosyVoice-300M-SFT
CosyVoice-300M-SFT-25Hz -> ComfyUI/models/CosyVoice/CosyVoice-300M-SFT-25Hz
CosyVoice-300M-Instruct -> ComfyUI/models/CosyVoice/CosyVoice-300M-Instruct
SenseVoiceSmall -> ComfyUI/models/SenseVoice/SenseVoiceSmall
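
After placing pre-downloaded models by hand, a quick check of the directory layout can catch a misplaced folder. The sketch below lists which of the directories named above are absent under a given ComfyUI root. It assumes only the paths listed in this README; the function name `missing_model_dirs` and the `EXPECTED_MODEL_DIRS` constant are hypothetical, and the check does not verify the contents of each folder.

```python
from pathlib import Path

# Directory names copied from the list above, relative to ComfyUI/models.
EXPECTED_MODEL_DIRS = [
    "CosyVoice/CosyVoice-300M",
    "CosyVoice/CosyVoice-300M-25Hz",
    "CosyVoice/CosyVoice-300M-SFT",
    "CosyVoice/CosyVoice-300M-SFT-25Hz",
    "CosyVoice/CosyVoice-300M-Instruct",
    "SenseVoice/SenseVoiceSmall",
]


def missing_model_dirs(comfyui_root):
    """Return the expected model directories that do not exist under comfyui_root/models."""
    models_root = Path(comfyui_root) / "models"
    return [rel for rel in EXPECTED_MODEL_DIRS if not (models_root / rel).is_dir()]


if __name__ == "__main__":
    # "ComfyUI" is an assumed location; pass the real path of your installation.
    for rel in missing_model_dirs("ComfyUI"):
        print(f"not present: models/{rel}")
```

A directory reported as not present is only a problem if you intend to use that model; the list simply mirrors the mapping above.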

Workflow

hunyuan video Dynamic Lora

LivePortrait

flux redux image mixing