DeOldify for Stable Diffusion WebUI: an extension for Stable Diffusion's AUTOMATIC1111 web UI that colorizes old photos and old videos. It is based on DeOldify.
The script utilizes various deep learning models to create detailed character cards, including names, summaries, personalities, greeting messages, and character avatars.
[AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning
Next generation face swapper and enhancer
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Simplify code execution with the Open Interpreter UI Project, built with Streamlit. A user-friendly GUI for Python, JavaScript, and more. Pay-as-you-go, no subscriptions. Ideal for beginners. - blazzbyte/Open...
llama.cpp with the BakLLaVA model describes what it sees (https://github.com/Fuzzy-Search/realtime-bakllava)
Contribute to cocktailpeanut/stable-diffusion-webui-forge development by creating an account on GitHub.
Segment Anything for Stable Diffusion WebUI
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
Stable Diffusion UI with patches by lllyasviel
Flexible Automapper for Beatsaber made for any difficulty
A Streamlit app that uses Google's AI to summarize YouTube video transcripts, providing concise, point-form notes. Perfect for quick content overviews.
Fault-tolerant, highly scalable GPU orchestration, and a machine learning framework designed for training models with billions to trillions of parameters
Contribute to coqui-ai/xtts-streaming-server development by creating an account on GitHub.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A Web UI for easy subtitle generation using the Whisper model.
moondream1 is a tiny (1.6B parameter) vision language model trained by @vikhyatk that performs on par with models twice its size. It is trained on the LLaVa training dataset, and initialized with SigLIP as the vision tower and Phi-1.5 as the text encoder. https://huggingface.co/spaces/vikhyatk/moondream1
NSFW protection bypass for the next-generation face swapper and enhancer
