Realtime streaming TTS demo using microsoft/VibeVoice-Realtime-0.5B
Check-ins: 3
Text/Image to 3D (Cross Platform: Mac + Windows + Linux): High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. https://github.com/deepbeepmeep/Hunyuan3D-2GP
OpenAudio (Featured)
Multilingual Text-to-Speech with Voice Cloning (Supports: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish) https://github.com/fishaudio/fish-speech
Forge (Featured)
[NVIDIA ONLY] The most efficient way to run FLUX (Optimized to run even on low memory machines, as low as 3GB VRAM with 512x512 resolution) https://github.com/lllyasviel/stable-diffusion-webui-forge
Check-ins: 11
A web UI for easy subtitle generation using the Whisper model.
ComfyUI (Featured)
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface. https://github.com/comfyanonymous/ComfyUI
Check-ins: 27
Platforms
GPU: NVIDIA, AMD, Apple
MagicQuill (Featured)
An intelligent, interactive Image Editing System. Easily erase and add objects on a user-friendly interface.
Wan2GP (Featured)
Super Optimized Gradio UI for AI video creation for GPU poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP
Check-ins: 52
GPU: NVIDIA, AMD, Apple
AI Song Generation with Full Style Control - Generate complete songs with lyrics, vocals, and instrumental tracks using Tencent AI Lab's SongGeneration (LeVo) model. [NVIDIA ONLY]
Check-ins: 6
GPU: NVIDIA, AMD, Apple