Pinokio

Type: api

Platform: All

GPU: All

Tag: #tts

Maestro

Blizaine/Maestro v8.0 updated 47 min ago

An all-in-one, 100% local AI video, image & music studio. Its Director mode turns a single prompt into a full music video or short film — LLM-planned, shot by shot. Built on the WanGP pipeline (Wan 2.1/2.2, LTX-2.3, Qwen, Hunyuan Video, Flux). Requires an NVIDIA GPU (6GB+ VRAM).

#ai

@blizaine

49 check-ins NVIDIA, AMD, Apple

Uncensored Local Studio

cocktailpeanut/uncensored-local-studio.pinokio v8.0 updated 1d ago

Run image generation, GGUF language models, Whisper speech recognition, and Kokoro speech synthesis locally from one offline studio.

#ai #gguf #image-generation #llm #speech-to-text #text-generation #text-to-speech #transcription #tts

@cocktailpeanut

4 check-ins NVIDIA, AMD, Apple

Wan2GP

pinokiofactory/wan v3.7 updated 5d ago

Super-optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video, and Flux. https://github.com/deepbeepmeep/Wan2GP

#video-generation #wan #wan2gp #video #image #ai #ai-video-generator #1 #image-generation #gradio

283 check-ins NVIDIA, AMD, Apple

Qwen3-TTS

SUP3RMASS1VE/Qwen3-TTS-Pinokio v5.0 updated 10d ago

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team.

#tts #voice #qwen3-tts #ai

@sup3rmass1ve

30 check-ins NVIDIA, AMD, Apple

DramaBox

PierrunoYT/DramaBox-TTS-Pinokio v5.0 updated 15d ago

Expressive TTS with voice cloning and prompt-driven speech synthesis, built on LTX-2.3 by Resemble AI.

#ai #tts #voice-clone

@pierrunoyt

5 check-ins NVIDIA, AMD, Apple

Ultimate-TTS-Studio

pinokiofactory/Ultimate-TTS-Studio v3.7 updated 18d ago

Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-Speech, F5 & index-tts & indextts2, VoxCPM and VibeVoice in one app

#tts #ai #gradio #voice

42 check-ins NVIDIA, AMD, Apple

ComfyUI

pinokiofactory/comfy v3.7 updated 21d ago

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface. https://github.com/comfyanonymous/ComfyUI

#comfyui #ai #video #comfy #image #image-generation #audio #video-generation #node-interface

98 check-ins NVIDIA, AMD, Apple

Qwen3-TTS MLX WebUI Enhanced

Blizaine/Qwen3-TTS-MLX-WebUI-Enhanced v5.0 updated 23d ago

High-quality text-to-speech with a web UI and API, optimized for Apple Silicon using MLX. Features include Custom Voice (preset speakers), Voice Design (natural language), and Voice Cloning, plus enhancements for saving custom voices and long-form / endless TTS streaming.

#mlx #qwen #tts #ai #mac

@blizaine

63 check-ins NVIDIA, AMD, Apple

Orpheus-TTS-FastAPI

pinokiofactory/Orpheus-TTS-FastAPI v3.7 updated 26d ago

Orpheus TTS is an open-source text-to-speech system built on the Llama-3b backbone. Orpheus demonstrates the emergent capabilities of using LLMs for speech synthesis. https://github.com/canopyai/Orpheus-TTS

#ai #tts

0 check-ins NVIDIA, AMD, Apple

Whisper-WebUI

pinokiofactory/whisper-webui v3.7 updated 28d ago

A web UI for easy subtitle generation using the Whisper model.

#whisper #ai #gradio #tts

5 check-ins NVIDIA, AMD, Apple

Voicebox

cocktailpeanut/voicebox.pinokio v5.0 updated 1mo ago

Local-first voice synthesis studio powered by Qwen3-TTS.

#tts #voice-clone

@cocktailpeanut

34 check-ins NVIDIA, AMD, Apple

zonos

pinokiofactory/zonos v3.7 updated 1mo ago

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS providers. https://github.com/Zyphra/Zonos

#ai #tts

6 check-ins NVIDIA, AMD, Apple

Dia

pinokiofactory/dia v3.7 updated 1mo ago

Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal sounds such as laughter, coughing, and throat clearing. https://github.com/nari-labs/dia

#ai #tts

0 check-ins NVIDIA, AMD, Apple

e2-f5-tts

pinokiofactory/e2-f5-tts v3.7 updated 1mo ago

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching https://huggingface.co/spaces/mrfakename/E2-F5-TTS

#tts #voice-clone #ai

16 check-ins NVIDIA, AMD, Apple

VoxCPM

IAnMove/voxcpm2-pinokio-launcher v7.0 updated 2mo ago

Tokenizer-free multilingual TTS and voice cloning with low-VRAM and VoxCPM2 Web UI/API launch modes.

#ai #tts

@theinaog

2 check-ins NVIDIA, AMD, Apple

VibeVoice Realtime

pinokiofactory/vibevoice-realtime v5.0 updated 2mo ago

Realtime streaming TTS demo using microsoft/VibeVoice-Realtime-0.5B

#ai #tts

5 check-ins NVIDIA, AMD, Apple

OpenAudio

pinokiofactory/openaudio v3.7 updated 2mo ago

Multilingual Text-to-Speech with Voice Cloning (Supports: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish) https://github.com/fishaudio/fish-speech

#openaudio #ai #audio #gradio #tts