🎙️ Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning. High-quality text-to-speech synthesis supporting zero-shot voice cloning and streaming inference with natural emotional expression.
Advanced 3B-parameter language model with a Gradio web interface, GPU acceleration, and complete privacy
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, with AMD GPU support via ROCm. Runs on Windows and Linux.
AI Voice Assistant — voice conversations, animated face, canvas, music generation, and more.
Pinokio launcher for Comfy LTX Desktop with GGUF and INT8 support.
Advanced text-to-speech with voice cloning, multi-speaker support, and background music generation using Higgs Audio V2
One-click ComfyUI + Torch + Python installer by Inteliweb AI. https://github.com/Comfy-Org
Owner: @maoper
Check-ins: 13
AI-Powered Text-to-Speech with Voice Cloning using Chatterbox TTS and a Gradio interface
Industry-leading face manipulation platform
Check-ins: 66
GPU: NVIDIA, AMD, Apple
Standalone Text-to-Speech application using Orpheus TTS with a Gradio interface
🎙️ Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning. Features a 44.1kHz sampling rate and a 6.25Hz token rate, and supports both SFT and LoRA fine-tuning. Built on the MiniCPM-4 backbone for highly expressive, natural speech synthesis.
🗣️ PersonaPlex - NVIDIA's real-time speech-to-speech conversational AI model. Natural full-duplex conversations with customizable personas and voices. Requires an NVIDIA GPU on Windows or Linux (16-24GB VRAM recommended), 32GB RAM, and a Hugging Face account.
🌍 TranslateGemma - Google's open-source multilingual translation AI. Translate text across 55+ languages and extract/translate text from images. Powered by the Gemma 3 architecture.
AI Song Generation with Full Style Control - Generate complete songs with lyrics, vocals, and instrumental tracks using Tencent AI Lab's SongGeneration (LeVo) model. [NVIDIA ONLY]
Check-ins: 8
GPU: NVIDIA, AMD, Apple
Super-optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP
Check-ins: 64
GPU: NVIDIA, AMD, Apple
All-in-one Gradio UI for the MOSS-TTS Family: voice cloning, dialogue generation, voice design from text, and sound effects.
Paste long text, clean it into readable sections, summarize each section, and ask questions in-browser with WebGPU.
Kimodo generates high-quality 3D human and robot motions controlled through text prompts.
Test to get this app working on Pinokio.