echomimic2 (Featured)
[NVIDIA ONLY] Make virtual avatars say whatever you want using an image and an audio clip https://github.com/antgroup/echomimic_v2
The ultimate video editor powered by natural language and FFMPEG https://huggingface.co/spaces/huggingface-projects/ai-video-composer
2 check-ins
MMAudio (Featured)
Generate synchronized audio from video and/or text inputs https://github.com/hkchengrex/MMAudio
Build your own voice for StyleTTS2
bolt.diy (Featured)
Prompt, run, edit, and deploy full-stack web apps. https://github.com/stackblitz-labs/bolt.diy
Open WebUI (Featured)
User-friendly web UI for LLMs. Supported LLM runners include Ollama and OpenAI-compatible APIs https://github.com/open-webui/open-webui
7 check-ins
zonos (Featured)
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS providers. https://github.com/Zyphra/Zonos
MatAnyone (Featured)
MatAnyone is an AI video matting tool that separates objects from their backgrounds, so the background can be removed from a video. Stable Video Matting with Consistent Memory Propagation: https://github.com/pq-yang/MatAnyone.git
DiffRhythm (Featured)
Generate songs with AI (up to 4 min 45 sec), either with lyrics or instrumental https://github.com/ASLP-lab/DiffRhythm
HunyuanVideo (Featured)
[NVIDIA ONLY] Highly optimized Gradio UI for the Hunyuan Video generator that works on GPU-poor machines. Generate videos roughly 10 to 14 sec long https://github.com/deepbeepmeep/HunyuanVideoGP
FramePack (Featured)
[NVIDIA ONLY] FramePack is a next-frame (next-frame-section) prediction neural network structure that generates videos progressively. https://github.com/lllyasviel/FramePack
Industry leading face manipulation platform
34 check-ins
Platforms: GPU (NVIDIA, AMD, Apple)
Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-Speech, F5 & index-tts & indextts2, VoxCPM and VibeVoice in one app
13 check-ins
One-click 3D Gaussian Splatting generation from a single image.
3 check-ins
Qwen3-TTS (Featured)
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team
14 check-ins
Platforms: GPU (NVIDIA, AMD, Apple)
e2-f5-tts (Featured)
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching https://huggingface.co/spaces/mrfakename/E2-F5-TTS
Text/Image to 3D (Cross Platform: Mac + Windows + Linux): High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. https://github.com/deepbeepmeep/Hunyuan3D-2GP
OpenAudio (Featured)
Multilingual Text-to-Speech with Voice Cloning (Supports: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish) https://github.com/fishaudio/fish-speech
Forge (Featured)
[NVIDIA ONLY] The most efficient way to run FLUX (Optimized to run even on low memory machines, as low as 3GB VRAM with 512x512 resolution) https://github.com/lllyasviel/stable-diffusion-webui-forge
11 check-ins
A web UI for easy subtitle generation using the Whisper model.