moshi
[Mac Only] A speech-text foundation model for real-time dialogue https://github.com/kyutai-labs/moshi
ZETA
Zero-Shot Text-Based Audio Editing Using DDPM Inversion https://huggingface.co/spaces/hilamanor/audioEditing
VoxCPM
Voice Synthesis Platform with Smart Chunking, Batch Processing, and Voice Cloning capabilities.
Nemoml
[NVIDIA ONLY] A minimal Gradio interface for Automatic Speech Recognition. Transcribes audio in the Malayalam language.
video-background-removal
Remove or change any video background https://huggingface.co/spaces/innova-ai/video-background-removal
AutoGen Studio
Declaratively define and modify agents and multi-agent workflows through a point-and-click, drag-and-drop interface (e.g., you can select the parameters of two agents that will communicate to solve your task).
OminiControl
A minimal and universal controller for FLUX.1 https://github.com/Yuanshi9815/OminiControl
Bagel
[NVIDIA ONLY] [RTX 50 Support] Image generation, image editing and free-form manipulation with a VLM (Minimum requirements: 12GB VRAM / 32GB RAM. Recommended requirements: 24GB VRAM / 48GB RAM)
macOS-use
[Mac Only] We make AI agents that control Mac apps: https://github.com/browser-use/macOS-use
Clarity Refiners UI
An enhanced local port of finegrain-image-enhancer powered by Refiners (https://huggingface.co/spaces/finegrain/finegrain-image-enhancer), which was adapted from philz1337x's Clarity Upscaler (https://github.com/philz1337x/clarity-upscaler)
