We’re on a journey to advance and democratize artificial intelligence through open source and open science.
🔊 PocketTTS - A lightweight, CPU-optimized Text-to-Speech (TTS) application by Kyutai Labs. Generate natural-sounding speech with low latency (~200ms), voice cloning support, and 6x real-time performance on CPU. 100M parameter model with 8 preset voices and custom voice cloning. English only. No GPU required!
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Enter a description for your character (using the trigger word “img”) and optionally upload face photos, then provide a list of prompts—one per line—for each scene. The app creates a series of imag...
Open-source alternative to Higgsfield AI — Free AI image generation & cinema studio with 20+ models (Flux, SDXL, Midjourney, Ideogram). Self-hosted, customizable, MIT licensed.

Batch resize images to predefined sizes (512px, 768px, 1024px) while maintaining aspect ratio
Kortix – build, manage and train AI Agents.
Generate a video from an image with a text prompt.
BiRefNet for background removal
ComfyUI-Qwen3-TTS brings Alibaba's powerful Qwen3-TTS models to ComfyUI. Multi-GPU Support: CUDA, Apple Silicon (MPS), Intel Arc (XPU), and CPU.
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enabling zero-shot voice cloning from short audio references.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
OneTrainer is a one-stop solution for all your Diffusion training needs.
Open-weights voice acting pipeline combining zero-shot voice cloning with natural-language direction. Provide a reference voice (or generate one) and describe how the line should be performed. Produces speech that keeps the voice identity while following emotional and stylistic prompts—no training required.
✨ Generate & translate subtitles in 100+ languages!
Nano Banana for Hugging Face PRO users