Pinokio
Umo
https://github.com/linus74rn/UmoPinokio | v1.0 | updated 12/15/2025, 7:41:31 AM | indexed 1/23/2026, 7:47:30 PM
Multi-Identity Consistency for Image Customization via Matching Reward https://github.com/bytedance/UMO
SillyTavern Character Generator
https://github.com/drago87/SillyTavern-Character-Generator | v4.0 | updated 12/14/2025, 4:03:35 PM | indexed 1/23/2026, 7:47:36 PM
A Pinokio script for https://github.com/Tremontaine/character-card-generator. When used with KoboldCPP, use http://localhost:5001/v1, where 5001 is the port KoboldCPP reports on startup. The Text API Key field must be filled in with any value; leaving it empty causes an error.
MoneyPrinterTurbo
https://github.com/harry0703/MoneyPrinterTurbo | updated 12/14/2025, 4:02:56 AM | indexed 1/29/2026, 7:16:30 PM
Generate high-definition short videos with one click using large AI models.
Resemble Enhance
https://github.com/sealad886/pinokio-resemble-enhance | v2.0 | updated 12/13/2025, 11:46:10 PM | indexed 1/23/2026, 7:45:44 PM
AI-powered speech denoising + enhancement (Gradio web demo + CLI).
GLM-TTS
https://github.com/PierrunoYT/GLM-TTS-Pinokio | v1.0.0 | updated 12/13/2025, 8:56:58 AM | indexed 1/23/2026, 7:44:58 PM
🎙️ Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning. High-quality text-to-speech synthesis supporting zero-shot voice cloning and streaming inference with natural emotional expression.
ai-toolkit
https://github.com/ai-anchorite/ai-toolkit | v3.7 | updated 12/13/2025, 3:33:21 AM | indexed 1/23/2026, 7:47:32 PM
AI Toolkit by Ostris
1 check-in
VoxCPM
https://github.com/Paxurux/Voxcpm | v3.7 | updated 12/11/2025, 6:20:28 PM | indexed 1/23/2026, 7:46:57 PM
Voice Synthesis Platform with Smart Chunking, Batch Processing, and Voice Cloning capabilities.
RVC
https://github.com/cocktailpeanut/rvc.pinokio | v3.7 | updated 12/11/2025, 2:33:17 PM | indexed 1/23/2026, 7:44:52 PM
1 Click Installer for Retrieval-based-Voice-Conversion-WebUI (https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)
MLX Whisper WebUI
https://github.com/dadleo/mlx-whisper-webui-pinokio | v1.1 | updated 12/10/2025, 7:33:38 PM | indexed 1/23/2026, 7:47:35 PM
Fast Speech-to-Text Web UI with Apple MLX and OpenAI Whisper
VoxCPM-1.5
https://github.com/PierrunoYT/VoxCPM-1.5-Pinokio | v1.0.0 | updated 12/9/2025, 5:03:11 PM | indexed 1/23/2026, 7:47:32 PM
🎙️ Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning. Features 44.1kHz sampling rate, 6.25Hz token rate, and supports both SFT and LoRA fine-tuning. Built on MiniCPM-4 backbone for highly expressive, natural speech synthesis.
social-auto-upload
https://github.com/dreammis/social-auto-upload | updated 12/9/2025, 7:01:04 AM | indexed 1/29/2026, 1:36:28 PM
Automatically upload videos to social media: Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, Bilibili.
Dia
https://github.com/pinokiofactory/dia | v3.7 | updated 12/7/2025, 7:54:59 PM | indexed 1/20/2026, 9:12:46 AM
Dia is a 1.6B parameter text to speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal communications like laughter, coughing, clearing throat, etc. https://github.com/nari-labs/dia
FramePack-Studio
https://github.com/FP-Studio/fp-studio | v3.7 | updated 12/7/2025, 10:46:08 AM | indexed 1/23/2026, 7:45:22 PM
[v0.5.1] FramePack Video App offering multiple generation types: Original, F1, video extension, end frame. Features include: LoRA support, job queueing, advanced timestamped prompts, offline mode, a post-processing suite including upscaling, interpolation, filters and more!
Ollama Web Interface
https://github.com/JL-Bones/Ollama_Web | updated 12/6/2025, 10:59:27 PM | indexed 1/20/2026, 9:14:54 AM
A web interface for managing and interacting with Ollama models
zonos
https://github.com/pinokiofactory/zonos | v3.7 | updated 12/6/2025, 10:44:22 PM | indexed 1/20/2026, 9:11:12 AM
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS providers. https://github.com/Zyphra/Zonos
bolt.diy
https://github.com/pinokiofactory/bolt | v3.4.0 | updated 12/6/2025, 9:59:32 PM | indexed 1/20/2026, 9:13:03 AM
Prompt, run, edit, and deploy full-stack web apps. https://github.com/stackblitz-labs/bolt.diy
1 check-in
Music Video Cutter
https://github.com/6Morpheus6/mvc | v3.7 | updated 12/6/2025, 8:17:35 PM | indexed 1/23/2026, 7:47:47 PM
Automatically create music videos. Synchronize the cuts to the music's beat.
1 check-in
ACE-Step
https://github.com/pinokiofactory/ACE-Step | v3.7 | updated 12/6/2025, 11:35:01 AM | indexed 1/23/2026, 7:45:55 PM
A Step Towards Music Generation Foundation Model
echomimic2
https://github.com/pinokiofactory/echomimic2 | v3.7 | updated 12/6/2025, 5:47:56 AM | indexed 1/20/2026, 9:14:28 AM
[NVIDIA ONLY] Make virtual avatars say whatever you want, using an image and an audio clip. https://github.com/antgroup/echomimic_v2
1 check-in
PRX-1024 Text-to-Image
https://github.com/PierrunoYT/Photoroom-PRX-Pinokio | v1.0.0 | updated 12/5/2025, 7:25:29 PM | indexed 1/23/2026, 7:46:45 PM
Gradio web interface for Photoroom's PRX-1024-t2i-beta text-to-image model