[IJCV] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
Check-ins: No check-ins yet
Platforms
GPU: NVIDIA, AMD, Apple
This application generates speech from text with a choice of speakers. It uses an advanced text-to-speech (TTS) model with text-to-phoneme conversion. The model was trained specifically for Indonesian, Javanese, and Sundanese.
Bring portraits to life!
ComfyUI node for background removal, implementing InSPyreNet, described by its author as the best method to date
A simple FastAPI server to run XTTSv2
Stable Diffusion web UI UX: https://github.com/anapnoe/stable-diffusion-webui-ux
Check-ins: 4 check-ins
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation: https://github.com/Zejun-Yang/AniPortrait
Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity: https://github.com/langflow-ai/langflow
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding: https://github.com/Tencent/HunyuanDiT
Your image is almost there!: https://github.com/lllyasviel/Omost
Drag & drop UI to build your customized LLM flow: https://github.com/FlowiseAI/Flowise
[Need 24GB VRAM] Cambrian-1 is a family of multimodal LLMs with a vision-centric design: https://github.com/cambrian-mllm/cambrian
Dough is an open-source tool for steering AI animations with precision
Official implementation of AnimateDiff.
Partial MPS support for the ComfyUI LivePortrait nodes, so they can be used on a MacBook
moondream1 is a tiny (1.6B parameter) vision language model trained by @vikhyatk that performs on par with models twice its size. It is trained on the LLaVA training dataset and initialized with SigLIP as the vision tower and Phi-1.5 as the text encoder: https://huggingface.co/spaces/vikhyatk/moondream1
Contribute to sukebenet/instruct-pix2pix development by creating an account on GitHub.
[ECCV 2024] Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Contribute to andrewyng/translation-agent development by creating an account on GitHub.
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation