1 result for tag "turboquant-pytorch"
From-scratch PyTorch implementation of Google TurboQuant (ICLR 2026) for LLM KV-cache compression. Stage 1 applies a random orthogonal rotation followed by Lloyd-Max scalar quantization; Stage 2 adds a QJL 1-bit sign residual correction for unbiased inner products. Achieves 5x compression at 3 bits (58 MB vs. 289 MB FP16) with 99.5% attention fidelity on Qwen2.5-3B.
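The two-stage idea can be sketched in a few lines. This is a minimal NumPy illustration, not the repo's code: a uniform scalar quantizer stands in for Lloyd-Max, and the residual scale `alpha` is a simple stand-in for a calibrated correction factor; all names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(d):
    # Random orthogonal matrix via QR of a Gaussian matrix;
    # sign-fixing the R diagonal gives a Haar-uniform rotation.
    q, r = np.linalg.qr(rng.normal(size=(d, d)))
    return q * np.sign(np.diag(r))

def quantize(x, bits=3):
    # Uniform scalar quantizer (simplified stand-in for Lloyd-Max),
    # with a single per-vector scale.
    levels = 2 ** bits
    scale = np.abs(x).max() / (levels / 2)
    codes = np.clip(np.round(x / scale), -levels // 2, levels // 2 - 1)
    return codes, scale

d = 64
k = rng.normal(size=d)            # a "key" vector to compress
R = random_rotation(d)
x = R @ k                         # Stage 1: rotate to spread energy across dims

codes, scale = quantize(x)        # 3-bit codes
xq = codes * scale                # dequantized approximation
residual_sign = np.sign(x - xq)   # Stage 2: keep only the residual's sign (1 bit)

# Reconstruct with the sign correction; alpha is the mean residual
# magnitude (a stand-in for a properly calibrated scale).
alpha = np.abs(x - xq).mean()
x_hat = xq + alpha * residual_sign

err_stage1 = np.linalg.norm(x - xq)   # quantization-only error
err_stage2 = np.linalg.norm(x - x_hat)  # error after sign correction
```

The sign correction strictly shrinks the residual norm whenever the residual is nonzero, since subtracting `alpha * sign(r)` with `alpha = mean(|r|)` minimizes the corrected error over all scalar choices of `alpha`.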