[IJCV] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝
This application generates speech from text with a choice of speakers. It uses an advanced text-to-speech (TTS) model with text-to-phoneme conversion. The model is trained specifically for Indonesian, Javanese, and Sundanese.
Bring portraits to life!
ComfyUI node for background removal, implementing InSPyreNet, the best method to date
A simple FastAPI Server to run XTTSv2
Stable Diffusion web UI UX: https://github.com/anapnoe/stable-diffusion-webui-ux
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation: https://github.com/Zejun-Yang/AniPortrait
Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity: https://github.com/langflow-ai/langflow
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding: https://github.com/Tencent/HunyuanDiT
Your image is almost there!: https://github.com/lllyasviel/Omost
Drag & drop UI to build your customized LLM flow: https://github.com/FlowiseAI/Flowise
[Need 24GB VRAM] Cambrian-1 is a family of multimodal LLMs with a vision-centric design: https://github.com/cambrian-mllm/cambrian
Dough is an open-source tool for steering AI animations with precision
Official implementation of AnimateDiff.
Partial MPS support for ComfyUI nodes for LivePortrait to use them on a MacBook
moondream1 is a tiny (1.6B parameter) vision language model trained by @vikhyatk that performs on par with models twice its size. It is trained on the LLaVA training dataset and initialized with SigLIP as the vision tower and Phi-1.5 as the text encoder: https://huggingface.co/spaces/vikhyatk/moondream1
Contribute to sukebenet/instruct-pix2pix development by creating an account on GitHub.
[ECCV 2024] Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Contribute to andrewyng/translation-agent development by creating an account on GitHub.
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
