Store
Related tags
MAGNeT
MAGNeT is a text-to-music and text-to-sound model capable of generating high-quality audio samples conditioned on text descriptions https://github.com/facebookresearch/audiocraft/blob/main/docs/MAGNET.md
InstantID
state-of-the-art tuning-free method to achieve ID-Preserving generation with only single image, supporting various downstream tasks. https://instantid.github.io/
PCM
Phased Consistency Model - generate high quality images with 2 steps https://huggingface.co/spaces/radames/Phased-Consistency-Model-PCM
BRIA RMBG
Background removal model developed by BRIA.AI, trained on a carefully selected dataset and is available as an open-source model for non-commercial use https://huggingface.co/spaces/briaai/BRIA-RMBG-1.4
[NVIDIA GPU ONLY] LGM
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation https://huggingface.co/spaces/ashawkey/LGM
MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean https://github.com/myshell-ai/MeloTTS
remove-video-bg
Video background removal tool https://huggingface.co/spaces/amirgame197/Remove-Video-Background
differential-diffusion-ui
Differential Diffusion modifies an image according to a text prompt, and according to a map that specifies the amount of change in each region https://differential-diffusion.github.io/
ZETA
Zero-Shot Text-Based Audio Editing Using DDPM Inversion https://huggingface.co/spaces/hilamanor/audioEditing
CustomNet
A unified encoder-based framework for object customization in text-to-image diffusion models https://huggingface.co/spaces/TencentARC/CustomNet
gligen
An intuitive GUI for GLIGEN that uses ComfyUI in the backend https://github.com/mut-ex/gligen-gui
face-to-all
diffusers InstantID + ControlNet inspired by face-to-many from fofr (https://x.com/fofrAI) - a localized Version of https://huggingface.co/spaces/multimodalart/face-to-all
instantstyle
Upload the picture of an image, and generate images with that image style. Instant generation with no LoRA required https://huggingface.co/spaces/InstantX/InstantStyle
parler-tts
a lightweight text-to-speech (TTS) model that can generate high-quality speech with features that can be controlled using a simple text prompt (e.g. gender, background noise, speaking rate, pitch and reverberation). https://huggingface.co/spaces/parler-tts/parler_tts_mini
Openvoice2
Openvoice 2 Web UI - A local web UI for Openvoice2, a multilingual voice cloning TTS https://x.com/myshell_ai/status/1783161876052066793
ZeST
ZeST: Zero-Shot Material Transfer from a Single Image. Local port of https://huggingface.co/spaces/fffiloni/ZeST (Project: https://ttchengab.github.io/zest/)
