Pinokio

atomic-llama-cpp-turboquant

https://github.com/atomicbot-ai/atomic-llama-cpp-turboquant
Updated 5/13/2026, 5:26:59 PM
Indexed 6/6/2026, 7:05:25 PM

A llama.cpp fork that adds TurboQuant WHT-rotated KV cache and weight compression, plus Gemma 4 MTP and Qwen 3.6 NextN speculative decoding (+30-50% throughput).

Pinokio Apps Using This Repo
No Pinokio apps using this repo yet.