Store

Tag: #gradio

cogvideo

https://github.com/pinokiofactory/cogvideo | v3.7 | updated 1/4/2026, 1:51:01 AM | indexed 1/6/2026, 6:19:13 AM

[NVIDIA ONLY] Generate videos with less than 10GB VRAM https://github.com/THUDM/CogVideo

#ai #gradio

CogStudio

https://github.com/pinokiofactory/cogstudio | v3.7 | updated 1/4/2026, 12:47:22 AM | indexed 1/6/2026, 6:17:11 AM

[NVIDIA ONLY] Advanced Web UI for CogVideo (text to video, image to video, video to video, extend video, etc) -- Generate videos with less than 10GB VRAM

#ai #gradio

OpenAudio

https://github.com/pinokiofactory/openaudio | v3.7 | updated 1/3/2026, 1:47:14 PM | indexed 1/6/2026, 6:16:52 AM

Multilingual Text-to-Speech with Voice Cloning (Supports: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish) https://github.com/fishaudio/fish-speech

#ai #audio

VibeVoice Realtime

https://github.com/pinokiofactory/vibevoice-realtime | v5.0 | updated 12/22/2025, 10:00:08 PM | indexed 1/6/2026, 6:18:30 AM

Realtime streaming TTS demo using microsoft/VibeVoice-Realtime-0.5B

#ai #gradio

Whisper-WebUI

https://github.com/pinokiofactory/whisper-webui | v3.7 | updated 12/18/2025, 9:08:14 PM | indexed 1/6/2026, 6:17:04 AM

A Web UI for easy subtitle generation using the Whisper model.

#ai #audio

MFLUX-WEBUI

https://github.com/pinokiofactory/MFLUX-WEBUI | v2.1 | updated 12/15/2025, 2:06:08 AM | indexed 1/6/2026, 6:16:42 AM

[MAC ONLY] A powerful and user-friendly web interface for FLUX, powered by MLX and Gradio via MFLUX

#ai #flux

Dia

https://github.com/pinokiofactory/dia | v3.7 | updated 12/7/2025, 7:54:59 PM | indexed 1/6/2026, 6:16:57 AM

Dia is a 1.6B parameter text to speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal communications like laughter, coughing, clearing throat, etc. https://github.com/nari-labs/dia

#ai #audio

Wan2GP

https://github.com/pinokiofactory/wan | v3.7 | updated 12/4/2025, 5:35:10 PM | indexed 1/6/2026, 6:19:26 AM

Super Optimized Gradio UI for AI video creation for GPU poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP

#ai #video