Explore tags
🎙️ Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning. High-quality text-to-speech synthesis supporting zero-shot voice cloning and streaming inference with natural emotional expression.
AI Toolkit by Ostris
EbSynth in Python, version 2
Platforms: GPU (NVIDIA, AMD, Apple)
Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows
Voice Synthesis Platform with Smart Chunking, Batch Processing, and Voice Cloning capabilities.
Jimeng (Dreamina) free API, adapted for mobile browsers
Calls the Seedream 4.0 API service to generate images locally. A custom node for ComfyUI to generate images using Volcano Engine's Seedream API.
Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-Speech, F5 & index-tts & indextts2, VoxCPM and VibeVoice in one app
Fast Speech-to-Text Web UI with Apple MLX and OpenAI Whisper
🎙️ Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning. Features 44.1kHz sampling rate, 6.25Hz token rate, and supports both SFT and LoRA fine-tuning. Built on MiniCPM-4 backbone for highly expressive, natural speech synthesis.
zuluCrypt is a front end to cryptsetup and tcplay and it allows easy management of encrypted block devices
Automatically upload videos to social media: Douyin, Xiaohongshu, WeChat Channels, TikTok, YouTube, Bilibili
Contribute to TensorStack-AI/AmuseAI development by creating an account on GitHub.
Dia (Featured)
Dia is a 1.6B parameter text to speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal communications like laughter, coughing, clearing throat, etc. https://github.com/nari-labs/dia
[v0.5.1] FramePack Video App offering multiple generation types: Original, F1, video extension, end frame. Features include: LoRA support, job queueing, advanced timestamped prompts, offline mode, a post-processing suite including upscaling, interpolation, filters and more!
A web interface for managing and interacting with Ollama models
Zonos (Featured)
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with, or even surpassing, top TTS providers. https://github.com/Zyphra/Zonos
Automatically create music videos. Synchronize the cuts to the music's beat.
A Step Towards Music Generation Foundation Model