parler-tts
https://github.com/cocktailpeanutlabs/parler-tts | v1.5 | updated 3/17/2025, 1:07:45 AM | indexed 1/6/2026, 6:19:03 AM
A lightweight text-to-speech (TTS) model that generates high-quality speech whose features (e.g. gender, background noise, speaking rate, pitch, and reverberation) can be controlled with a simple text prompt. https://huggingface.co/spaces/parler-tts/parler_tts_mini
Openvoice2
https://github.com/cocktailpeanutlabs/openvoice2 | v3.0 | updated 3/17/2025, 12:46:07 AM | indexed 1/6/2026, 6:16:43 AM
Openvoice 2 Web UI: a local web UI for Openvoice2, a multilingual voice-cloning TTS. https://x.com/myshell_ai/status/1783161876052066793
ZeST
https://github.com/cocktailpeanutlabs/zest | v1.5 | updated 3/17/2025, 12:39:18 AM | indexed 1/6/2026, 6:17:08 AM
ZeST: Zero-Shot Material Transfer from a Single Image. Local port of https://huggingface.co/spaces/fffiloni/ZeST (Project: https://ttchengab.github.io/zest/)
LlamaFactory
https://github.com/pinokiofactory/llamafactory | v1.5 | updated 3/17/2025, 12:35:47 AM | indexed 1/6/2026, 6:19:10 AM
Unify Efficient Fine-Tuning of 100+ LLMs https://github.com/hiyouga/LLaMA-Factory
StableAudio
https://github.com/pinokiofactory/stableaudio | v1.5 | updated 3/17/2025, 12:31:08 AM | indexed 1/6/2026, 6:17:03 AM
An Open Source Model for Audio Samples and Sound Design https://github.com/Stability-AI/stable-audio-tools
flashdiffusion
https://github.com/pinokiofactory/flashdiffusion | v1.5 | updated 3/17/2025, 12:27:30 AM | indexed 1/6/2026, 6:16:44 AM
Accelerating any conditional diffusion model for few steps image generation https://gojasper.github.io/flash-diffusion-project/
RC Stable Audio Tools
https://github.com/pinokiofactory/rc-stableaudio | v2.0 | updated 3/17/2025, 12:05:09 AM | indexed 1/6/2026, 6:16:30 AM
Advanced Gradio UI for Stable Audio https://github.com/RoyalCities/RC-stable-audio-tools
audiocraft_plus
https://github.com/pinokiofactory/audiocraft_plus | v2.0 | updated 3/17/2025, 12:02:34 AM | indexed 1/6/2026, 6:17:55 AM
AudioCraft Plus is an all-in-one WebUI for the original AudioCraft, adding many quality features on top https://github.com/GrandaddyShmax/audiocraft_plus
flux-webui
https://github.com/pinokiofactory/flux-webui | v2.0 | updated 3/17/2025, 12:00:21 AM | indexed 1/6/2026, 6:17:36 AM
Minimal Flux Web UI powered by Gradio & Diffusers (Flux Schnell + Flux Merged)
moshi
https://github.com/pinokiofactory/moshi | v2.0 | updated 3/16/2025, 11:50:27 PM | indexed 1/6/2026, 6:19:07 AM
[Mac only] A speech-text foundation model for real-time dialogue https://github.com/kyutai-labs/moshi
devika
https://github.com/cocktailpeanutlabs/devika | v3.0 | updated 3/8/2025, 7:33:17 PM | indexed 1/6/2026, 6:17:08 AM
Agentic AI Software Engineer https://github.com/stitionai/devika
MagicAnimate
https://github.com/cocktailpeanut/MagicAnimate.pinokio | v3.0 | updated 3/7/2025, 8:33:05 PM | indexed 1/6/2026, 6:19:46 AM
[NVIDIA ONLY] Temporally Consistent Human Image Animation using Diffusion Model https://showlab.github.io/magicanimate/
Leffa
https://github.com/ai-anchorite/Leffa | v3.6 | updated 3/5/2025, 6:59:19 AM | indexed 1/6/2026, 6:16:09 AM
UVR5-WebUI
https://github.com/SUP3RMASS1VE/UVR5-WebUI | v2.0 | updated 3/1/2025, 10:10:13 PM | indexed 1/6/2026, 6:19:49 AM
The best vocal remover application on the internet, and it's totally free and open source!
Deepseek-ai-Janus
https://github.com/SUP3RMASS1VE/Deepseek-ai-Janus-Pro-7B | v3.2 | updated 3/1/2025, 8:52:53 PM | indexed 1/6/2026, 6:17:37 AM
Janus Pro 7B is a powerful multimodal AI model designed for advanced image understanding and text-to-image generation.
AudioSep
https://github.com/cocktailpeanut/AudioSep.pinokio | v2.0 | updated 2/26/2025, 11:48:02 PM | indexed 1/6/2026, 6:16:11 AM
Separate Anything You Describe (https://huggingface.co/spaces/Audio-AGI/AudioSep)
RVC-realtime
https://github.com/Feedjer/RVC-realtime | v2.0 | updated 2/18/2025, 5:33:19 PM | indexed 1/6/2026, 6:16:55 AM
[WINDOWS/LINUX ONLY] Easily train a good VC model with 10 minutes or less of voice data. https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Applio
https://github.com/MDShoons/RVC-V3- | v2.0 | updated 2/7/2025, 8:25:51 AM | indexed 1/6/2026, 6:18:12 AM
A simple, high-quality voice conversion tool focused on ease of use and performance. https://github.com/IAHispano/Applio
Kokoro-TTS-Local v0.19
https://github.com/SUP3RMASS1VE/Kokoro-TTS | v3.2 | updated 2/3/2025, 11:35:00 PM | indexed 1/6/2026, 6:19:14 AM
A local implementation of the Kokoro Text-to-Speech model
fluxgym
https://github.com/stefantrajanov/fluxgym-trainer | v2.1 | updated 1/28/2025, 3:50:27 AM | indexed 1/6/2026, 6:17:28 AM
[NVIDIA Only] Dead simple web UI for training FLUX LoRA with LOW VRAM support (From 12GB)