Store
Related tags
Qwen3-TTSFeatured
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team
A multi-voice AI audiobook generator built on Qwen3-TTS — annotate scripts with an LLM, assign unique voices to each character, per-line style instructions for delivery, clone voices from reference audio, design new voices from text descriptions, train custom voices with LoRA fine-tuning, and export to MP3 or Audacity multi-track projects
[NVIDIA ONLY] AllTalk-TTS is a unified UI for E5-TTS, XTTS, Vite TTS, Piper TTS, Parler TTS and RVC, based on CoquiTTS, including a finetune mode.
Ultimate-TTS-StudioFeatured
Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-Speech, F5 & index-tts & indextts2, VoxCPM and VibeVoice in one app
Wan2GPFeatured
Super Optimized Gradio UI for AI video creation for GPU poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP
ComfyuiFeatured
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. https://github.com/comfyanonymous/ComfyUI
Qwen3-TTS MLX WebUI EnhancedFeatured
High-quality text-to-speech with Beautiful Web UI & API, optimized for Apple Silicon using MLX. Features include Custom Voice (preset speakers), Voice Design (natural language), and Voice Cloning. With enhanced features for saving custom voices and long-form / endless TTS streaming.
Local-first AI audiobook production with voice cloning and chapter repair tools. This is the easiest way to install locally, including an optional demo voice library so you can start exploring right away. Live demo: senigami.github.io/audiobook-studio
Contribute to ggroup-ai-lab/gwen-tts development by creating an account on GitHub.
All in one Gradio interface for chatterbox. Voice cloning from uploaded audio samples, automatic text processing for long content and real-time speech generation with configurable parameters. (Minimum Requirements 4GB VRAM / Recommended Requirements 8GB VRAM)
e2-f5-ttsFeatured
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching https://huggingface.co/spaces/mrfakename/E2-F5-TTS
Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-Speech, F5 & index-tts & indextts2, VoxCPM and VibeVoice in one app
Orpheus-TTS-FastAPIFeatured
Orpheus TTS is an open-source text-to-speech system built on the Llama-3b backbone. Orpheus demonstrates the emergent capabilities of using LLMs for speech synthesis https://github.com/canopyai/Orpheus-TTS
Official code for "Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis"
clone voices into different languages by using just a quick 3-second audio clip. (a local version of https://huggingface.co/spaces/coqui/xtts)
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice cloning.
