High-Quality Voice Cloning TTS for 600+ Languages
Check-ins: No check-ins yet
Platforms
GPU: NVIDIA, AMD, Apple
Test to get this app working on Pinokio.
Pinokio launcher wrapper for Claw Code with non-interactive setup and model/provider selection.
Pre-mastering & audio enhancement for AI-generated music. 12-stage processing chain with platform presets (Suno, Udio), before/after spectrogram, and broadcast-ready LUFS normalization.
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Check-ins: No check-ins yet
Platforms
GPU: NVIDIA, AMD, Apple
🎵 YouTube to MP3 downloader with a simple Gradio UI. Paste a YouTube link to download MP3. Requires ffmpeg installed on your system.
High-quality, rapid TTS voice cloning model (150x+ realtime) with 48kHz speech output.
State-of-the-art open-source speech recognition model supporting 14 languages. 2B parameter ASR model from Cohere Labs.
🎨 FLUX.2 [klein] - Fast text-to-image generation with Black Forest Labs' FLUX.2 models. 6 variants available: 4B/9B (full precision) plus NVFP4/FP8 quantized versions. Runs on consumer GPUs (~13GB) up to high-end GPUs (~29GB), with sub-second image generation and outstanding quality.
Fast Image Generation with Sana Diffusion Model
⚡️ Efficient 6B parameter image generation model with sub-second inference. Generate high-quality, photorealistic images with only 8 inference steps. Features bilingual text rendering (Chinese & English) and Single-Stream Diffusion Transformer architecture.
Kimodo: Kinematic Motion Diffusion Model. Generates high-quality 3D human and robot motions.
Instant, Ultra-Realistic Text-to-Speech
A web interface for the Moondream3 vision-language model featuring image captioning, visual question answering, object detection, and object pointing.
Open-source social media scheduling tool with AI. Schedule posts to X, LinkedIn, Reddit, Discord, Threads, TikTok, YouTube, Pinterest, Dribbble, Slack, Mastodon, Facebook, GitHub, and more.
flux-webui (Featured)
Minimal Flux Web UI powered by Gradio & Diffusers (Flux Schnell + Flux Merged)
Check-ins: 4 check-ins
NVIDIA's Audio Flamingo 3 - Large Audio-Language Model for speech, sound, and music understanding with Gradio web interface
Gradio web interface for Photoroom's PRX-1024-t2i-beta text-to-image model
Automate browser based workflows with AI
Check-ins: No check-ins yet
Platforms
GPU: NVIDIA, AMD, Apple