Explore tags
Tag manager and captioner for image datasets: https://github.com/jhc13/taggui
[NVIDIA Only] Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation https://github.com/fudan-generative-vision/hallo
Fast, high-quality zero-shot voice-cloning text-to-speech with flow matching
A minimalist todo list with a lightweight JSON API and local storage.
Audio Transcription App with Parakeet-TDT-0.6b-v2
A FastAPI wrapper for KokoroTTS. Integrates with Open-WebUI and other API-driven AI applications.
Janus Pro 7B is a powerful multimodal AI model designed for advanced image understanding and text-to-image generation.
Agentic AI Software Engineer https://github.com/stitionai/devika
dreamtalk (Featured)
When Expressive Talking Head Generation Meets Diffusion Probabilistic Models (https://github.com/ali-vilab/dreamtalk)
StyleAligned (Featured)
Style Aligned Image Generation via Shared Attention https://style-aligned-gen.github.io/
A web UI for easy subtitle generation using the Whisper model (https://github.com/jhj0517/Whisper-WebUI)
Install the AnimateDiff extension for Automatic1111, along with its models, with one click
A Gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
Upload an image and generate new images in its style. Instant generation with no LoRA required https://huggingface.co/spaces/InstantX/InstantStyle
Gradio web interface for Photoroom's PRX-1024-t2i-beta text-to-image model
Generate realistic and expressive speech with natural language voice design.