Store

Tag: #tts

https://github.com/Xerophayze/TTS-Story | v2.0 | updated 1/28/2026, 2:51:35 PM | indexed 1/28/2026, 5:00:23 PM

Multi-Voice Text-to-Speech for Stories and Audiobooks. Supports Kokoro and Chatterbox TTS engines with GPU acceleration.

Wan2GP

https://github.com/pinokiofactory/wan | v3.7 | updated 1/28/2026, 9:41:31 AM | indexed 1/28/2026, 4:16:21 PM

Super-optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP

#image #video #ai #image-generation #video-generation #gradio

Qwen3-TTS

https://github.com/SUP3RMASS1VE/Qwen3-TTS-Pinokio | v5.0 | updated 1/27/2026, 5:41:21 PM | indexed 1/27/2026, 6:02:41 PM

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team.

#tts #ai

LuxTTS Studio

https://github.com/TheAwaken1/LuxTTS-Studio | v2.0 | updated 1/26/2026, 7:53:06 PM | indexed 1/26/2026, 11:39:48 PM

Gradio-based web interface for the LuxTTS voice cloning and text-to-speech model, enabling users to generate customized speech from text using uploaded or recorded audio references with adjustable parameters like speed, guidance scale, and inference steps.

Qwen3-TTS

https://github.com/Xeronal81/Qwen3-TTS-Pinokio | v5.0 | updated 1/26/2026, 1:59:41 PM | indexed 1/26/2026, 2:00:12 PM

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team.

#tts

Orpheus-TTS-FastAPI

https://github.com/pinokiofactory/Orpheus-TTS-FastAPI | v3.7 | updated 1/24/2026, 11:02:12 PM | indexed 1/24/2026, 11:41:38 PM

Orpheus TTS is an open-source text-to-speech system built on the Llama-3b backbone. Orpheus demonstrates the emergent capabilities of using LLMs for speech synthesis. https://github.com/canopyai/Orpheus-TTS

#ai #tts

LiquidAI-LFM2.5 Playground

https://github.com/TheAwaken1/LiquidAI-LFM2.5-Playground | v2.0 | updated 1/24/2026, 3:15:55 PM | indexed 1/24/2026, 3:16:01 PM

Local multimodal app powered by Liquid AI LFM2.5-Audio-1.5B and LFM2.5-VL-1.6B models, delivering real-time voice chat, text-to-speech synthesis, long-form audio transcription, and multi-image vision reasoning.

e2-f5-tts

https://github.com/pinokiofactory/e2-f5-tts | v3.7 | updated 1/23/2026, 9:14:27 PM | indexed 1/27/2026, 1:36:05 PM

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching https://huggingface.co/spaces/mrfakename/E2-F5-TTS

#tts #ai

Chattered

https://github.com/6Morpheus6/Chattered | v3.7 | updated 1/22/2026, 2:40:08 AM | indexed 1/23/2026, 7:44:38 PM

All-in-one Gradio interface for Chatterbox. Voice cloning from uploaded audio samples, automatic text processing for long content, and real-time speech generation with configurable parameters. (Minimum requirement: 4GB VRAM; recommended: 8GB VRAM)

Ultimate-TTS-Studio

https://github.com/pinokiofactory/Ultimate-TTS-Studio | v3.7 | updated 1/21/2026, 4:57:23 PM | indexed 1/25/2026, 9:28:47 PM

Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-Speech, F5 & index-tts & indextts2, VoxCPM and VibeVoice in one app

#ai #gradio #tts

Whisper-WebUI

https://github.com/pinokiofactory/whisper-webui | v3.7 | updated 1/20/2026, 11:36:49 PM | indexed 1/23/2026, 7:45:51 PM

A web UI for easy subtitle generation using the Whisper model.

#ai #gradio #tts #whisper

ComfyUI

https://github.com/pinokiofactory/comfy | v3.7 | updated 1/14/2026, 11:37:40 AM | indexed 1/24/2026, 11:52:22 PM

The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface. https://github.com/comfyanonymous/ComfyUI

#video #image #ai #audio #comfyui #node-interface

AllTalk-TTS v2

https://github.com/6Morpheus6/alltalk-tts | v3.3 | updated 1/5/2026, 6:52:06 PM | indexed 1/23/2026, 7:45:17 PM

[NVIDIA ONLY] AllTalk-TTS is a unified UI for F5-TTS, XTTS, VITS, Piper TTS, Parler TTS and RVC, based on CoquiTTS, including a finetune mode.

#tts

StyleTTS2 Studio

https://github.com/pinokiofactory/StyleTTS2_Studio | v3.7 | updated 1/4/2026, 5:07:11 AM | indexed 1/23/2026, 7:46:14 PM

Build your own voice for StyleTTS2

#ai #tts

OpenAudio

https://github.com/pinokiofactory/openaudio | v3.7 | updated 1/3/2026, 1:47:18 PM | indexed 1/27/2026, 9:08:41 AM

Multilingual Text-to-Speech with Voice Cloning (Supports: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish) https://github.com/fishaudio/fish-speech

#ai #audio #gradio #tts

VibeVoice Realtime

https://github.com/pinokiofactory/vibevoice-realtime | v5.0 | updated 12/22/2025, 10:00:08 PM | indexed 1/20/2026, 9:13:58 AM

Realtime streaming TTS demo using microsoft/VibeVoice-Realtime-0.5B

#ai #tts

Dia

https://github.com/pinokiofactory/dia | v3.7 | updated 12/7/2025, 7:54:59 PM | indexed 1/20/2026, 9:12:46 AM

Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal sounds such as laughter, coughing, and throat clearing. https://github.com/nari-labs/dia

#ai #tts

zonos

https://github.com/pinokiofactory/zonos | v3.7 | updated 12/6/2025, 10:44:22 PM | indexed 1/20/2026, 9:11:12 AM

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS providers. https://github.com/Zyphra/Zonos

#ai #tts

Bark Voice Cloning

https://github.com/6Morpheus6/bark | v3.7 | updated 11/30/2025, 4:33:53 AM | indexed 1/22/2026, 2:20:35 AM

Upload a clean 20-second WAV file of the vocal persona you want to mimic, type your text-to-speech prompt, and hit submit! A local version of https://huggingface.co/spaces/fffiloni/instant-TTS-Bark-cloning

XTTS

https://github.com/cocktailpeanut/xtts.pinokio | v3.0 | updated 11/10/2025, 4:28:58 AM | indexed 1/23/2026, 7:47:04 PM

Clone voices into different languages using just a quick 3-second audio clip. (A local version of https://huggingface.co/spaces/coqui/xtts)

#ai #tts