A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics.
A mass video player for easy browsing of large video datasets
Super-fast multilingual TTS supporting 54 voices across 8 languages.
SongBloom, a novel framework for full-length song generation
Fooocus (Featured)
Minimal Stable Diffusion UI
Check-ins: 8
Platforms
GPU: NVIDIA, AMD, Apple
XTTS (Featured)
Clone voices into different languages using just a quick 3-second audio clip. (A local version of https://huggingface.co/spaces/coqui/xtts)
[NVIDIA ONLY] Requires 24GB VRAM (use the lowvram option; it has the same quality). High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. https://github.com/Tencent/Hunyuan3D-2
Florence2 (Featured)
An advanced vision foundation model from Microsoft https://huggingface.co/spaces/gokaygokay/Florence-2
[WINDOWS, NVIDIA] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
[NVIDIA ONLY] Image generation, image editing and free-form manipulation with a VLM (Minimum requirements: 12GB VRAM / 32GB RAM. Recommended requirements: 24GB VRAM / 48GB RAM)
[NVIDIA ONLY] [RTX 50 Support] Image generation, image editing and free-form manipulation with a VLM (Minimum requirements: 12GB VRAM / 32GB RAM. Recommended requirements: 24GB VRAM / 48GB RAM)
Create a story by generating consistent images https://github.com/HVision-NKU/StoryDiffusion
An Open Source Model for Audio Samples and Sound Design https://github.com/Stability-AI/stable-audio-tools
Unified Efficient Fine-Tuning of 100+ LLMs https://github.com/hiyouga/LLaMA-Factory
Describe a UI and see it rendered live. Ask for changes and convert HTML to React, Svelte, Web Components, etc. Like Vercel v0, but open source https://github.com/wandb/openui
Simple script examples that highlight all the Pinokio APIs
Advanced Gradio UI for Stable Audio https://github.com/RoyalCities/RC-stable-audio-tools
Manage your ComfyUI environments with Docker