PCM
Phased Consistency Model - generate high quality images with 2 steps https://huggingface.co/spaces/radames/Phased-Consistency-Model-PCM
moondream2
a tiny vision language model that kicks ass and runs anywhere https://github.com/vikhyat/moondream
ZeST
ZeST: Zero-Shot Material Transfer from a Single Image. Local port of https://huggingface.co/spaces/fffiloni/ZeST (Project: https://ttchengab.github.io/zest/)
StoryDiffusion Comics
create a story by generating consistent images https://github.com/HVision-NKU/StoryDiffusion
Dia
Dia is a 1.6B parameter text to speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal communications like laughter, coughing, clearing throat, etc. https://github.com/nari-labs/dia
AudioGradio
One click installer for AudioCraft MusicGen and AudioGen Gradio UI (Requires at least Pinokio v0.0.56)
Open WebUI
User-friendly WebUI for LLMs, supported LLM runners include Ollama and OpenAI-compatible APIs https://github.com/open-webui/open-webui
Forge Neo
[NVIDIA ONLY] Stable Diffusion WebUI Forge supporting Flux, Qwen, wan, nunchaku and more in a lightweight WebUI. https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
Bark Voice Cloning
Upload a clean 20 seconds WAV file of the vocal persona you want to mimic, type your text-to-speech prompt and hit submit! A local version of https://huggingface.co/spaces/fffiloni/instant-TTS-Bark-cloning
Stable Diffusion web UI
One-click launcher for Stable Diffusion web UI (AUTOMATIC1111/stable-diffusion-webui)
CustomNet
A unified encoder-based framework for object customization in text-to-image diffusion models https://huggingface.co/spaces/TencentARC/CustomNet
