Contribute to yuhaolove/ChatTTS-WebUI development by creating an account on GitHub.
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
NSFW Bypass with this modified version
Contribute to Vaibhavs10/insanely-fast-whisper development by creating an account on GitHub.
[CVPR 2024 Oral] Rethinking Inductive Biases for Surface Normal Estimation
Contribute to mb6611/audio2hero development by creating an account on GitHub.
An open vision-language model by Google. PaliGemma is designed as a versatile model for transfer to a wide range of vision-language tasks such as image and short-video captioning, visual question answering, text reading, object detection, and object segmentation: https://huggingface.co/spaces/google/paligemma
Deforum extension for AUTOMATIC1111's Stable Diffusion webui
TripoSR: Fast 3D Object Reconstruction from a Single Image
Brought to you by Cohee, RossAscends, and the SillyTavern community, SillyTavern is a local-install interface that allows you to interact with text generation AIs (LLMs) to chat and roleplay with custom characters.
Generative AI for Professional Creatives
Fooocus, but with Pony Diffusion (mainly for Colab) - VHDsdk2/Fooocus-pony-diffusion-v6-xl
Customized ID-consistent generation for humans: https://github.com/JackAILab/ConsistentID
Stable Diffusion web UI UX: https://github.com/anapnoe/stable-diffusion-webui-ux
Contribute to yohanshin/WHAM development by creating an account on GitHub.
A Gradio UI for XTTSv2 and RVC, allowing for real-time voice conversion.
Swift client for the fal.ai model APIs
🔊 Text-Prompted Generative Audio Model
One-click face swap
