Wan2GP
https://github.com/pinokiofactory/wan | v3.7 | updated 12/4/2025, 5:35:10 PM | indexed 1/6/2026, 6:19:26 AM
Highly optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP
Dia
https://github.com/pinokiofactory/dia | v3.7 | updated 12/7/2025, 7:54:59 PM | indexed 1/6/2026, 6:16:57 AM
Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal sounds such as laughter, coughing, and throat clearing. https://github.com/nari-labs/dia
MFLUX-WEBUI
https://github.com/pinokiofactory/MFLUX-WEBUI | v2.1 | updated 12/15/2025, 2:06:08 AM | indexed 1/6/2026, 6:16:42 AM
[MAC ONLY] A powerful and user-friendly web interface for FLUX, powered by MLX and Gradio via MFLUX
Whisper-WebUI
https://github.com/pinokiofactory/whisper-webui | v3.7 | updated 12/18/2025, 9:08:14 PM | indexed 1/6/2026, 6:17:04 AM
A web UI for easy subtitle generation using the Whisper model.
VibeVoice Realtime
https://github.com/pinokiofactory/vibevoice-realtime | v5.0 | updated 12/22/2025, 10:00:08 PM | indexed 1/6/2026, 6:18:30 AM
Realtime streaming TTS demo using microsoft/VibeVoice-Realtime-0.5B
OpenAudio
https://github.com/pinokiofactory/openaudio | v3.7 | updated 1/3/2026, 1:47:14 PM | indexed 1/6/2026, 6:16:52 AM
Multilingual text-to-speech with voice cloning (supports English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish). https://github.com/fishaudio/fish-speech
cogvideo
https://github.com/pinokiofactory/cogvideo | v3.7 | updated 1/4/2026, 1:51:01 AM | indexed 1/6/2026, 6:19:13 AM
[NVIDIA ONLY] Generate videos with less than 10GB VRAM. https://github.com/THUDM/CogVideo
CogStudio
https://github.com/pinokiofactory/cogstudio | v3.7 | updated 1/4/2026, 12:47:22 AM | indexed 1/6/2026, 6:17:11 AM
[NVIDIA ONLY] Advanced web UI for CogVideo (text to video, image to video, video to video, extend video, etc.). Generate videos with less than 10GB VRAM.