Store
Direct3D-S2
[NVIDIA ONLY] Direct3D-S2 is a scalable 3D shape generation framework leveraging sparse volumetric representations for high-resolution outputs. It features Spatial Sparse Attention (SSA), a novel mechanism that accelerates Diffusion Transformer computations on sparse data, achieving up to 9.6× speedup in training. The unified Sparse VAE architecture maintains a consistent sparse volumetric format across input, latent, and output stages, significantly improving efficiency and stability.
🎬 AutoGif
Transform YouTube videos into stunning animated GIFs with perfectly timed, stylized subtitles and eye-catching effects.
AIraoke
Transform lyric transcriptions into karaoke-style MP4 videos. Built on Python-Lyric-Transcriber, this Gradio UI uses Whisper for transcription, an LLM for lyric edits, and Demucs for vocal separation. A fun tool for karaoke fans, though outputs may vary.
IC-Light-Studio
An enhanced version of the IC-Light repository, designed for advanced image relighting and enhancement using Stable Diffusion and deep learning techniques.
Dia
Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal sounds such as laughter, coughing, and throat clearing. https://github.com/nari-labs/dia
StoryCraft
Generate engaging 1- to 5-minute short stories with LLMs and convert them to audio with Coqui TTS. Supports voice cloning, built-in speakers, and multiple languages.
InfiniteYou
[NVIDIA ONLY - WINDOWS ONLY] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity [LoRA support fork] https://github.com/petermg/InfiniteYou
