Store

Tag: #ai

https://github.com/pinokiofactory/Ultimate-TTS-Studio | v3.7 | updated 1/5/2026, 10:06:28 PM | indexed 1/6/2026, 6:18:09 AM

Kokoro, KittenTTS, Higgs Audio, Chatterbox/Multi, Fish-Speech, F5, index-tts, indextts2, VoxCPM, and VibeVoice in one app.

#ai

LivePortrait

https://github.com/pinokiofactory/liveportrait | v3.7 | updated 1/5/2026, 2:02:09 AM | indexed 1/6/2026, 6:17:05 AM

Bring portraits to life! https://github.com/KwaiVGI/LivePortrait

#ai #face

Forge

https://github.com/pinokiofactory/stable-diffusion-webui-forge | v2.0 | updated 1/5/2026, 1:59:26 AM | indexed 1/6/2026, 6:20:02 AM

[NVIDIA ONLY] The most efficient way to run FLUX (Optimized to run even on low memory machines, as low as 3GB VRAM with 512x512 resolution) https://github.com/lllyasviel/stable-diffusion-webui-forge

#ai #image

cogvideo

https://github.com/pinokiofactory/cogvideo | v3.7 | updated 1/4/2026, 1:51:01 AM | indexed 1/6/2026, 6:19:13 AM

[NVIDIA ONLY] Generate videos with less than 10GB VRAM https://github.com/THUDM/CogVideo

#ai #gradio

CogStudio

https://github.com/pinokiofactory/cogstudio | v3.7 | updated 1/4/2026, 12:47:22 AM | indexed 1/6/2026, 6:17:11 AM

[NVIDIA ONLY] Advanced Web UI for CogVideo (text to video, image to video, video to video, extend video, etc.). Generate videos with less than 10GB VRAM.

#ai #gradio

OpenAudio

https://github.com/pinokiofactory/openaudio | v3.7 | updated 1/3/2026, 1:47:14 PM | indexed 1/6/2026, 6:16:52 AM

Multilingual Text-to-Speech with Voice Cloning (Supports: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish) https://github.com/fishaudio/fish-speech

#ai #audio

SongGeneration Studio

https://github.com/BazedFrog/SongGeneration-Studio | v3.7 | updated 1/1/2026, 1:06:28 AM | indexed 1/6/2026, 6:19:08 AM

AI Song Generation with Full Style Control - Generate complete songs with lyrics, vocals, and instrumental tracks using Tencent AI Lab's SongGeneration (LeVo) model. [NVIDIA ONLY]

#ai #music

Hunyuan3D-2-LowVRAM

https://github.com/pinokiofactory/Hunyuan3d-2-lowvram | v3.7 | updated 12/27/2025, 8:44:51 PM | indexed 1/6/2026, 6:19:05 AM

Text/Image to 3D (Cross Platform: Mac + Windows + Linux): High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. https://github.com/deepbeepmeep/Hunyuan3D-2GP

#3d #ai

VibeVoice Realtime

https://github.com/pinokiofactory/vibevoice-realtime | v5.0 | updated 12/22/2025, 10:00:08 PM | indexed 1/6/2026, 6:18:30 AM

Realtime streaming TTS demo using microsoft/VibeVoice-Realtime-0.5B

#ai #gradio

e2-f5-tts

https://github.com/pinokiofactory/e2-f5-tts | v3.7 | updated 12/20/2025, 8:47:31 PM | indexed 1/6/2026, 6:19:09 AM

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching https://huggingface.co/spaces/mrfakename/E2-F5-TTS

#ai #audio

Applio

https://github.com/pinokiofactory/applio | v3.7 | updated 12/19/2025, 4:34:28 AM | indexed 1/6/2026, 6:17:26 AM

A simple, high-quality voice conversion tool focused on ease of use and performance.

#ai

Whisper-WebUI

https://github.com/pinokiofactory/whisper-webui | v3.7 | updated 12/18/2025, 9:08:14 PM | indexed 1/6/2026, 6:17:04 AM

A Web UI for easy subtitle generation using the Whisper model.

#ai #audio

FramePack

https://github.com/pinokiofactory/Frame-Pack | v3.7 | updated 12/18/2025, 10:04:18 AM | indexed 1/6/2026, 6:15:18 AM

[NVIDIA ONLY] Generate Video Progressively. FramePack is a next-frame (next-frame-section) prediction neural network structure that generates videos progressively. https://github.com/lllyasviel/FramePack

#ai #video

MagicQuill

https://github.com/pinokiofactory/MagicQuill | v3.7 | updated 12/17/2025, 5:51:58 AM | indexed 1/6/2026, 6:18:31 AM

An intelligent, interactive Image Editing System. Easily erase and add objects on a user-friendly interface.

#ai #image

MFLUX-WEBUI

https://github.com/pinokiofactory/MFLUX-WEBUI | v2.1 | updated 12/15/2025, 2:06:08 AM | indexed 1/6/2026, 6:16:42 AM

[MAC ONLY] A powerful and user-friendly web interface for FLUX, powered by MLX and Gradio via MFLUX

#ai #flux

Dia

https://github.com/pinokiofactory/dia | v3.7 | updated 12/7/2025, 7:54:59 PM | indexed 1/6/2026, 6:16:57 AM

Dia is a 1.6B parameter text to speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal communications like laughter, coughing, clearing throat, etc. https://github.com/nari-labs/dia

#ai #audio

zonos

https://github.com/pinokiofactory/zonos | v3.7 | updated 12/6/2025, 10:44:22 PM | indexed 1/6/2026, 6:14:46 AM

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS providers. https://github.com/Zyphra/Zonos

#ai #audio

bolt.diy

https://github.com/pinokiofactory/bolt | v3.4.0 | updated 12/6/2025, 9:59:32 PM | indexed 1/6/2026, 6:17:33 AM

Prompt, run, edit, and deploy full-stack web apps. https://github.com/stackblitz-labs/bolt.diy

#ai #coding

DiffRhythm

https://github.com/pinokiofactory/diffrhythm | v3.7 | updated 12/5/2025, 1:50:16 AM | indexed 1/6/2026, 6:16:19 AM

Generate songs with AI (up to 4 min 45 sec), either with lyrics or as instrumentals. https://github.com/ASLP-lab/DiffRhythm

#ai #music

pyramidflow

https://github.com/pinokiofactory/pyramidflow | v3.7 | updated 12/4/2025, 6:27:40 PM | indexed 1/6/2026, 6:16:35 AM

Pyramid Flow Video Generation AI (text-to-video & image-to-video) https://github.com/jy0205/Pyramid-Flow

#ai #video