A simple (and weird) graphical user interface for Xintao's Real-ESRGAN AI.
Enhances the resolution and spatiotemporal continuity of text-generated and image-generated videos
User-friendly WebUI for LLMs (Formerly Ollama WebUI)
LLM Web UI and API
A Real-Time Text-to-Image Generation Model
A simple aesthetic scorer, pruner, and website you can run to view the scoring results
Prompts Browser Extension for the AUTOMATIC1111/stable-diffusion-webui client
https://dl.acm.org/doi/10.1145/3576915.3623209
Contribute to yuhaolove/ChatTTS-WebUI development by creating an account on GitHub.
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Character Animation (AnimateAnyone, Face Reenactment)
NSFW Bypass with this modified version
Contribute to Vaibhavs10/insanely-fast-whisper development by creating an account on GitHub.
[CVPR 2024 Oral] Rethinking Inductive Biases for Surface Normal Estimation
Contribute to mb6611/audio2hero development by creating an account on GitHub.
PaliGemma is an open vision-language model by Google, designed as a versatile model for transfer to a wide range of vision-language tasks such as image and short-video captioning, visual question answering, text reading, object detection, and object segmentation. https://huggingface.co/spaces/google/paligemma
Deforum extension for AUTOMATIC1111's Stable Diffusion webui
TripoSR: Fast 3D Object Reconstruction from a Single Image
Brought to you by Cohee, RossAscends, and the SillyTavern community, SillyTavern is a local-install interface that allows you to interact with text generation AIs (LLMs) to chat and roleplay with custom characters.
Generative AI for Professional Creatives
