Project updates

cocktailpeanut/VALL-E-X.pinokio (updated 1y ago)
An open source implementation of Microsoft's VALL-E X zero-shot TTS model
@cocktailpeanut | 0 check-ins | NVIDIA, AMD, Apple
cocktailpeanut/realtime-lcm.pinokio (updated 1y ago)
Demo showcasing ~real-time Latent Consistency Model pipeline with Diffusers and a MJPEG stream server (https://github.com/radames/Real-Time-Latent-Consistency-Model)
@cocktailpeanut | 0 check-ins | NVIDIA, AMD, Apple
cocktailpeanut/diffusers-sdxl-turbo (updated 1y ago)
Demo showcasing ~real-time Latent Consistency Model pipeline with Diffusers and a MJPEG stream server (https://github.com/radames/Real-Time-Latent-Consistency-Model)
@cocktailpeanut | 1 check-in | NVIDIA, AMD, Apple
shadowburn0/lavie.pinokio (updated 1y ago)
Text-to-Video (T2V) generation framework from Vchitect: https://github.com/Vchitect/LaVie
0 check-ins | NVIDIA, AMD, Apple
cocktailpeanut/mirror (updated 1y ago)
An AI-powered mirror
@cocktailpeanut | 0 check-ins | NVIDIA, AMD, Apple
cocktailpeanutlabs/deus (updated 1y ago)
A Realtime Creation Engine
0 check-ins | NVIDIA, AMD, Apple
nandometzger/MLFocalLengths (updated 1y ago)
Estimating the Focal Length of a Monocular Image
0 check-ins | NVIDIA, AMD, Apple
pinokiofactory/florence-sam v2.0 (updated 1y ago)
Integrates Florence2 and SAM2 models for detailed image captioning and object detection. Florence2 generates detailed captions that are then used to perform phrase grounding. The Segment Anything Model 2 (SAM2) converts these phrase-grounded boxes into masks. https://huggingface.co/spaces/SkalskiP/florence-sam
1 check-in | NVIDIA, AMD, Apple
pinokiofactory/accdiffusion v2.0 (updated 1y ago)
0 check-ins | NVIDIA, AMD, Apple
lllyasviel/forge-legacy-extensions (updated 1y ago)
Some archived legacy Forge extensions
0 check-ins | NVIDIA, AMD, Apple
open-mmlab/FoleyCrafter (updated 1y ago)
[IJCV] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝
0 check-ins | NVIDIA, AMD, Apple
drat/TTS-Indonesia-Gratis (updated 1y ago)
This application generates speech from text, with a choice of several speakers. It uses an advanced text-to-speech (TTS) model with text-to-phoneme conversion. The model was trained specifically for Indonesian, Javanese, and Sundanese.
0 check-ins | NVIDIA, AMD, Apple
cocktailpeanut/LivePortrait (updated 1y ago)
Bring portraits to life!
@cocktailpeanut | 0 check-ins | NVIDIA, AMD, Apple
daswer123/xtts-api-server (updated 1y ago)
A simple FastAPI server to run XTTSv2
0 check-ins | NVIDIA, AMD, Apple
Feedjer/stable-diffusion-webui-ux.pinokio v1.5 (updated 1y ago)
Stable Diffusion web UI UX: https://github.com/anapnoe/stable-diffusion-webui-ux
4 check-ins | NVIDIA, AMD, Apple
Feedjer/AniPortrait.pinokio v1.5 (updated 1y ago)
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation: https://github.com/Zejun-Yang/AniPortrait
0 check-ins | NVIDIA, AMD, Apple
Feedjer/Langflow.pinokio v1.5 (updated 1y ago)
Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity: https://github.com/langflow-ai/langflow
0 check-ins | NVIDIA, AMD, Apple
Feedjer/HunyuanDiT.pinokio v1.5 (updated 1y ago)
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding: https://github.com/Tencent/HunyuanDiT
0 check-ins | NVIDIA, AMD, Apple
Feedjer/Omost.pinokio v1.5 (updated 1y ago)
Your image is almost there!: https://github.com/lllyasviel/Omost
0 check-ins | NVIDIA, AMD, Apple