An open vision-language model by Google. PaliGemma is designed as a versatile model for transfer to a wide range of vision-language tasks, such as image and short-video captioning, visual question answering, text reading, object detection, and object segmentation. https://huggingface.co/spaces/google/paligemma
Brought to you by Cohee, RossAscends, and the SillyTavern community, SillyTavern is a local-install interface that allows you to interact with text generation AIs (LLMs) to chat and roleplay with custom characters.
Generative AI for Professional Creatives
Customized, ID-consistent human image generation: https://github.com/JackAILab/ConsistentID
Stable Diffusion web UI UX: https://github.com/anapnoe/stable-diffusion-webui-ux
A Gradio UI for XTTSv2 and RVC, allowing for real-time voice conversion.
SuperPrompter is a Python-based application that utilises the SuperPrompt-v1 model to generate optimised text prompts for AI/LLM image generation (for use with Stable Diffusion, etc.) from user prompts.
The script utilizes various deep learning models to create detailed character cards, including names, summaries, personalities, greeting messages, and character avatars.
Next-generation face swapper and enhancer
llama.cpp with the BakLLaVA model describes what it sees (https://github.com/Fuzzy-Search/realtime-bakllava)
Stable Diffusion UI with patches by lllyasviel
Flexible automapper for Beat Saber, made for any difficulty
moondream1 is a tiny (1.6B parameter) vision language model trained by @vikhyatk that performs on par with models twice its size. It is trained on the LLaVA training dataset, and initialized with SigLIP as the vision tower and Phi-1.5 as the text encoder. https://huggingface.co/spaces/vikhyatk/moondream1

Massively Multilingual Speech (MMS): Persian Text-to-Speech

Efficiently separate audio tracks with Spleeter
Next-generation face swapper and enhancer
Unlock the new experience of Housing App Android Setup with Automation using Pinokio
A Realtime Creation Engine