Pierre Bruno

@pierrunoyt

3 posts · 21 checkpoints · Joined 1/27/2026, 9:46:34 AM

Activity 24 · Posts 3 · Checkpoints 21 · Apps 8 · Creations 50 · Following 0 · Followers 4

Creations by @pierrunoyt

50 total

Soprano TTS

Updated 3 weeks ago

https://github.com/PierrunoYT/soprano-tts-pinokio

Instant, Ultra-Realistic Text-to-Speech

KittenTTS 😻

Updated 3 weeks ago

https://github.com/PierrunoYT/KittenTTS-Pinokio

Ultra-lightweight text-to-speech (15M-80M params) — CPU optimized, 8 voices, ONNX-powered

Liquid Audio

Updated 3 weeks ago

https://github.com/PierrunoYT/liquid-audio-pinokio

Liquid Audio - LFM2.5-Audio-1.5B: speech-to-speech, ASR, and TTS powered by Liquid AI.

VoxCPM 2

Updated 3 weeks ago

https://github.com/PierrunoYT/VoxCPM-2-Pinokio

Tokenizer-free TTS for context-aware speech, voice cloning, and voice design. 2B params, 48kHz, 30 languages (Gradio UI).

LFM2.5-450M-VL

Updated 3 weeks ago

https://github.com/PierrunoYT/LFM2.5-450M-VL-Pinokio

LFM2.5-VL-450M (Liquid AI): compact vision–language model for image understanding. Gradio UI with upload/URL, prompt, and generation sliders.

Z-Image-Turbo

Updated 3 weeks ago

https://github.com/PierrunoYT/Z-Image-Pinokio

⚡️ Efficient 6B parameter image generation model with sub-second inference. Generate high-quality, photorealistic images with only 8 inference steps. Features bilingual text rendering (Chinese & English) and Single-Stream Diffusion Transformer architecture.

OmniVoice

Updated 3 weeks ago

https://github.com/PierrunoYT/OmniVoice-Pinokio

Zero-shot multilingual TTS (600+ languages) with voice cloning and voice design — Gradio UI (app/app.py)

Cohere Transcribe

Updated 3 weeks ago

https://github.com/PierrunoYT/cohere-transcribe-pinokio

State-of-the-art open-source speech recognition model supporting 14 languages. 2B parameter ASR model from Cohere Labs.

GLM-TTS

Updated 3 weeks ago

https://github.com/PierrunoYT/GLM-TTS-Pinokio

🎙️ Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning. High-quality text-to-speech synthesis supporting zero-shot voice cloning and streaming inference with natural emotional expression.

OrpheusTTS

Updated 3 weeks ago

https://github.com/PierrunoYT/OrpheusTTS-Pinokio

Standalone Text-to-Speech using Orpheus TTS with a Gradio UI

LuxTTS 🎙️

Updated 3 weeks ago

https://github.com/PierrunoYT/LuxTTS-Pinokio

High-quality rapid TTS voice cloning model (150x+ realtime) — 48kHz speech, voice cloning

Audio Flamingo 3

Updated 3 weeks ago

https://github.com/PierrunoYT/Audio-Flamingo-3-Pinokio

NVIDIA's Audio Flamingo 3 - Large Audio-Language Model for speech, sound, and music understanding with Gradio web interface

Supertonic 3

Updated 3 weeks ago

https://github.com/PierrunoYT/Supertonic-3-Pinokio

Lightning-Fast, On-Device, Multilingual TTS — Gradio, ONNX, 44.1kHz

Hy-MT2

Updated 3 weeks ago

https://github.com/PierrunoYT/Tencent-HY-MT2-Pinokio

Hy-MT2 multilingual translation — Gradio UI for 33-language translation with Hy-MT2-1.8B, Hy-MT2-7B, and Hy-MT2-30B-A3B.

ChatterBox

Updated 3 weeks ago

https://github.com/PierrunoYT/chatterbox-tts-pinokio

AI-Powered Text-to-Speech with Voice Cloning using Chatterbox TTS and a Gradio interface. Includes Turbo, Multilingual (23+ languages), and Original models. Runs locally; CUDA GPU recommended, CPU supported. Windows, Mac, and Linux.

PersonaPlex

Updated 3 weeks ago

https://github.com/PierrunoYT/PersonaPlex-Pinokio

🗣️ PersonaPlex - NVIDIA's real-time speech-to-speech conversational AI model. Natural full-duplex conversations with customizable personas and voices.

Sana

Updated 3 weeks ago

https://github.com/PierrunoYT/Sana-Pinokio

Fast Image Generation with Sana Diffusion Model

PocketTTS

Updated 3 weeks ago

https://github.com/PierrunoYT/pocket-tts-pinokio

Lightweight CPU text-to-speech with preset voices and optional Hugging Face-authenticated voice cloning.

ChatterBox

Updated 3 weeks ago

https://github.com/PierrunoYT/chatterbox-tts-app

AI-Powered Text-to-Speech with Voice Cloning using Chatterbox TTS and a Gradio interface. Includes Turbo, Multilingual (23+ languages), and Original models.

PiD

Updated 4 weeks ago

https://github.com/PierrunoYT/NVIDIA-PiD-Pinokio

NVIDIA PiD — Pixel Diffusion Decoder for high-resolution latent decoding. Gradio UI for Z-Image + 4× PiD upscale, plus CLI demos for Flux.

Creations

More · 50

Transcribr

Bulk transcribe many YouTube videos, whole playlists, or your own uploaded audio/video files at once with faster-whisper. Outputs txt, srt, vtt, or json.

Updated 8 hours ago

ScribeTube

Download and transcribe many YouTube videos or whole playlists at once with faster-whisper. Outputs txt, srt, vtt, or json.

Updated 9 hours ago

MOSS-TTS

All-in-one Gradio UI for the MOSS-TTS Family: voice cloning, dialogue generation, voice design from text, and sound effects.

Updated 19 hours ago

Ideogram 4 Studio

Ideogram 4 (nf4) open-weights text-to-image model (9.3B params, Qwen3-VL-8B text encoder, structured JSON prompting, native 2K resolution)

Updated 3 days ago

PRX Pixel

Pixel-space PRX text-to-image pipeline (~7B params, Qwen3-VL text encoder, no VAE)

Updated 3 days ago

OmniVoice Studio

The open-source ElevenLabs alternative. Local voice cloning, video dubbing, and real-time dictation in 646 languages, with no API keys.

Updated 6 days ago

Higgs Audio v3 TTS

Pinokio launcher for Higgs Audio v3 TTS with Gradio UI, SGLang-Omni backend, and automatic model download.

Updated last week

dots.tts-base

2B-parameter fully continuous, end-to-end autoregressive text-to-speech with zero-shot voice cloning. https://huggingface.co/rednote-hilab/dots.tts-base

Updated last week

VidLingo

YouTube to MP3, Cohere transcription, TranslateGemma translation, OmniVoice TTS. https://github.com/PierrunoYT/VidLingo-Pinokio

Updated 2 weeks ago

DramaBox

Expressive TTS with voice cloning, prompt-driven speech synthesis built on LTX-2.3 by Resemble AI

Updated 3 weeks ago