SongGeneration Studio
https://github.com/BazedFrog/SongGeneration-Studio | v3.7 | updated 1/1/2026, 1:06:28 AM | indexed 1/6/2026, 6:19:08 AM
AI Song Generation with Full Style Control - Generate complete songs with lyrics, vocals, and instrumental tracks using Tencent AI Lab's SongGeneration (LeVo) model. [NVIDIA ONLY]
Sana
https://github.com/PierrunoYT/Sana-Pinokio | v5.0 | updated 12/31/2025, 9:32:59 PM | indexed 1/6/2026, 6:16:35 AM
Fast Image Generation with Sana Diffusion Model
Hunyuan3D-2-LowVRAM
https://github.com/pinokiofactory/Hunyuan3d-2-lowvram | v3.7 | updated 12/27/2025, 8:44:51 PM | indexed 1/6/2026, 6:19:05 AM
Text/Image to 3D (Cross Platform: Mac + Windows + Linux): High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. https://github.com/deepbeepmeep/Hunyuan3D-2GP
Soprano TTS
https://github.com/PierrunoYT/soprano-tts-pinokio | v5.0 | updated 12/27/2025, 9:49:34 AM | indexed 1/6/2026, 6:18:55 AM
Instant, Ultra-Realistic Text-to-Speech
Sam3D
https://github.com/6Morpheus6/Sam3D-body | v3.7 | updated 12/26/2025, 11:37:39 PM | indexed 1/6/2026, 6:16:42 AM
Create 3D Meshes of Body Poses from Images.
FBCNN
https://github.com/KenjieDec/FBCNN-Pinokio | v1.0 | updated 12/25/2025, 8:30:59 PM | indexed 1/6/2026, 6:19:07 AM
Remove JPEG compression artifacts from images using the FBCNN model.
MiraTTS Pinokio
https://github.com/SUP3RMASS1VE/MiraTTS-Pinokio | v4.0 | updated 12/24/2025, 1:15:47 AM | indexed 1/6/2026, 6:18:15 AM
VibeVoice Realtime
https://github.com/pinokiofactory/vibevoice-realtime | v5.0 | updated 12/22/2025, 10:00:08 PM | indexed 1/6/2026, 6:18:30 AM
Realtime streaming TTS demo using microsoft/VibeVoice-Realtime-0.5B
IndexTTS-2
https://github.com/6Morpheus6/IndexTTS2 | v3.7 | updated 12/22/2025, 2:17:10 AM | indexed 1/6/2026, 6:18:09 AM
Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech application
ChatterBox
https://github.com/PierrunoYT/chatterbox-tts-app | v3.7 | updated 12/21/2025, 9:35:12 AM | indexed 1/6/2026, 6:18:29 AM
e2-f5-tts
https://github.com/pinokiofactory/e2-f5-tts | v3.7 | updated 12/20/2025, 8:47:31 PM | indexed 1/6/2026, 6:19:09 AM
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching https://huggingface.co/spaces/mrfakename/E2-F5-TTS
Applio
https://github.com/pinokiofactory/applio | v3.7 | updated 12/19/2025, 4:34:28 AM | indexed 1/6/2026, 6:17:26 AM
A simple, high-quality voice conversion tool focused on ease of use and performance.
Whisper-WebUI
https://github.com/pinokiofactory/whisper-webui | v3.7 | updated 12/18/2025, 9:08:14 PM | indexed 1/6/2026, 6:17:04 AM
A Web UI for easy subtitle generation using the Whisper model.
chatterbox
https://github.com/Paxurux/chatterbox-old-supermasive-vr | v3.7 | updated 12/18/2025, 3:50:20 PM | indexed 1/6/2026, 6:16:35 AM
SoTA open-source TTS
FramePack
https://github.com/pinokiofactory/Frame-Pack | v3.7 | updated 12/18/2025, 10:04:18 AM | indexed 1/6/2026, 6:15:18 AM
[NVIDIA ONLY] Generate Video Progressively. FramePack is a next-frame (next-frame-section) prediction neural network structure that generates videos progressively. https://github.com/lllyasviel/FramePack
Wan2GP - AMD
https://github.com/6Morpheus6/wan2gp-amd | v3.7 | updated 12/17/2025, 7:45:09 PM | indexed 1/6/2026, 6:15:24 AM
[AMD ONLY] Super Optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. (On Windows, supported on 7900(XT), 7800(XT), 7600(XT), Phoenix, 9070(XT) and Strix Halo.)
Moondream3 Gradio UI
https://github.com/PierrunoYT/moondream-3-pinokio | v1.0.0 | updated 12/17/2025, 5:24:23 PM | indexed 1/6/2026, 6:15:41 AM
A web interface for the Moondream3 vision-language model featuring image captioning, visual question answering, object detection, and object pointing.
MagicQuill
https://github.com/pinokiofactory/MagicQuill | v3.7 | updated 12/17/2025, 5:51:58 AM | indexed 1/6/2026, 6:18:31 AM
An intelligent, interactive Image Editing System. Easily erase and add objects on a user-friendly interface.
chatterbox
https://github.com/Blizaine/chatterbox-Turbo | v3.7 | updated 12/15/2025, 8:42:23 PM | indexed 1/6/2026, 6:15:34 AM
Audio Flamingo 3
https://github.com/PierrunoYT/Audio-Flamingo-3-Pinokio | v1.0.0 | updated 12/15/2025, 4:41:03 PM | indexed 1/6/2026, 6:16:53 AM
NVIDIA's Audio Flamingo 3: a Large Audio-Language Model for speech, sound, and music understanding, with a Gradio web interface.