Pinokio

Pierre Bruno

@pierrunoyt
3 posts · 21 checkpoints · Joined 1/27/2026, 9:46:34 AM
Creations by @pierrunoyt
50 total
Soprano TTS · Updated 3 weeks ago
https://github.com/PierrunoYT/soprano-tts-pinokio
Instant, Ultra-Realistic Text-to-Speech
KittenTTS 😻 · Updated 3 weeks ago
https://github.com/PierrunoYT/KittenTTS-Pinokio
Ultra-lightweight text-to-speech (15M-80M params) — CPU optimized, 8 voices, ONNX-powered
Liquid Audio · Updated 3 weeks ago
https://github.com/PierrunoYT/liquid-audio-pinokio
Liquid Audio - LFM2.5-Audio-1.5B: speech-to-speech, ASR, and TTS powered by Liquid AI.
VoxCPM 2 · Updated 3 weeks ago
https://github.com/PierrunoYT/VoxCPM-2-Pinokio
Tokenizer-free TTS for context-aware speech, voice cloning, and voice design. 2B params, 48kHz, 30 languages (Gradio UI).
LFM2.5-450M-VL · Updated 3 weeks ago
https://github.com/PierrunoYT/LFM2.5-450M-VL-Pinokio
LFM2.5-VL-450M (Liquid AI): compact vision–language model for image understanding. Gradio UI with upload/URL, prompt, and generation sliders.
Z-Image-Turbo · Updated 3 weeks ago
https://github.com/PierrunoYT/Z-Image-Pinokio
⚡️ Efficient 6B-parameter image generation model with sub-second inference. Generate high-quality, photorealistic images with only 8 inference steps. Features bilingual text rendering (Chinese & English) and a Single-Stream Diffusion Transformer architecture.
OmniVoice · Updated 3 weeks ago
https://github.com/PierrunoYT/OmniVoice-Pinokio
Zero-shot multilingual TTS (600+ languages) with voice cloning and voice design — Gradio UI (app/app.py)
Cohere Transcribe · Updated 3 weeks ago
https://github.com/PierrunoYT/cohere-transcribe-pinokio
State-of-the-art open-source speech recognition model supporting 14 languages. A 2B-parameter ASR model from Cohere Labs.
GLM-TTS · Updated 3 weeks ago
https://github.com/PierrunoYT/GLM-TTS-Pinokio
🎙️ Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning. High-quality text-to-speech synthesis supporting zero-shot voice cloning and streaming inference with natural emotional expression.
OrpheusTTS · Updated 3 weeks ago
https://github.com/PierrunoYT/OrpheusTTS-Pinokio
Standalone Text-to-Speech using Orpheus TTS with a Gradio UI
LuxTTS 🎙️ · Updated 3 weeks ago
https://github.com/PierrunoYT/LuxTTS-Pinokio
High-quality rapid TTS voice cloning model (150x+ realtime) — 48kHz speech, voice cloning
Audio Flamingo 3 · Updated 3 weeks ago
https://github.com/PierrunoYT/Audio-Flamingo-3-Pinokio
NVIDIA's Audio Flamingo 3 - Large Audio-Language Model for speech, sound, and music understanding with Gradio web interface
Supertonic 3 · Updated 3 weeks ago
https://github.com/PierrunoYT/Supertonic-3-Pinokio
Lightning-Fast, On-Device, Multilingual TTS — Gradio, ONNX, 44.1kHz
Hy-MT2 · Updated 3 weeks ago
https://github.com/PierrunoYT/Tencent-HY-MT2-Pinokio
Hy-MT2 multilingual translation — Gradio UI for 33-language translation with Hy-MT2-1.8B, Hy-MT2-7B, and Hy-MT2-30B-A3B.
ChatterBox · Updated 3 weeks ago
https://github.com/PierrunoYT/chatterbox-tts-pinokio
AI-Powered Text-to-Speech with Voice Cloning using Chatterbox TTS and a Gradio interface. Includes Turbo, Multilingual (23+ languages), and Original models. Runs locally; CUDA GPU recommended, CPU supported. Windows, Mac, and Linux.
PersonaPlex · Updated 3 weeks ago
https://github.com/PierrunoYT/PersonaPlex-Pinokio
🗣️ PersonaPlex - NVIDIA's real-time speech-to-speech conversational AI model. Natural full-duplex conversations with customizable personas and voices.
Sana · Updated 3 weeks ago
https://github.com/PierrunoYT/Sana-Pinokio
Fast Image Generation with Sana Diffusion Model
PocketTTS · Updated 3 weeks ago
https://github.com/PierrunoYT/pocket-tts-pinokio
Lightweight CPU text-to-speech with preset voices and optional Hugging Face-authenticated voice cloning.
ChatterBox · Updated 3 weeks ago
https://github.com/PierrunoYT/chatterbox-tts-app
AI-Powered Text-to-Speech with Voice Cloning using Chatterbox TTS and a Gradio interface. Includes Turbo, Multilingual (23+ languages), and Original models.
PiD · Updated 4 weeks ago
https://github.com/PierrunoYT/NVIDIA-PiD-Pinokio
NVIDIA PiD — Pixel Diffusion Decoder for high-resolution latent decoding. Gradio UI for Z-Image + 4× PiD upscale, plus CLI demos for Flux.