Large language models (LLMs) aren’t actually giant computer brains. Instead, they are effectively massive vector spaces in ...
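The "vector space" framing can be made concrete with a toy sketch: tokens are points in a high-dimensional space, and relatedness is measured geometrically. The tiny embeddings below are invented for illustration only; real LLM embeddings have thousands of learned dimensions.

```python
# Toy illustration of tokens living in a vector space (made-up embeddings).
import numpy as np

embeddings = {
    "cat":   np.array([0.9, 0.1, 0.3]),
    "dog":   np.array([0.8, 0.2, 0.4]),
    "piano": np.array([0.1, 0.9, 0.7]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["cat"], embeddings["dog"]))    # high: nearby in the space
print(cosine(embeddings["cat"], embeddings["piano"]))  # low: far apart
```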
A paper from Google could make local LLMs even easier to run.
Learn why Google’s TurboQuant may mark a major shift in search, from indexing speed to AI-driven relevance and content discovery.
TL;DR: Google developed three AI compression algorithms (TurboQuant, PolarQuant, and Quantized Johnson-Lindenstrauss) that reduce large language models' KV cache memory by at least six times without ...
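To see where the memory savings come from, here is a rough, generic sketch of KV cache quantization: storing 4-bit codes plus per-row scales instead of 16-bit floats cuts the footprint by roughly 4x, and more aggressive schemes reach the roughly six-fold reduction cited above. This is a plain uniform quantizer for illustration, not Google's algorithms.

```python
# Generic 4-bit uniform quantization of a KV cache tensor (illustrative only).
import numpy as np

def quantize_4bit(x):
    # Per-row symmetric uniform quantization to 4-bit integer codes in [-8, 7].
    scale = np.abs(x).max(axis=-1, keepdims=True) / 7.0 + 1e-12
    codes = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    return codes.astype(np.float32) * scale

kv = np.random.randn(1024, 128).astype(np.float16)   # e.g. one layer's cached keys
codes, scale = quantize_4bit(kv.astype(np.float32))

fp16_bytes = kv.size * 2
quant_bytes = kv.size // 2 + scale.size * 2           # packed 4-bit codes + fp16 scales
print(f"compression ~{fp16_bytes / quant_bytes:.1f}x")
print("mean reconstruction error:", float(np.abs(dequantize(codes, scale) - kv).mean()))
```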
On March 24, 2026, Amir Zandieh and Vahab Mirrokni from Google Research published an article ...
If Google’s AI researchers had a sense of humor, they would have called TurboQuant, the new, ultra-efficient AI memory compression algorithm announced Tuesday, “Pied Piper,” or at least that’s what ...
Google thinks it's found the answer, and it doesn't require more or better hardware. Originally detailed in an April 2025 paper, TurboQuant is an advanced compression algorithm that’s going viral over ...
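As a hedged sketch of how this family of compressors tends to work: a random orthogonal rotation spreads each vector's energy evenly across coordinates, after which a simple per-coordinate quantizer performs well. The specific rotation and bit width below are illustrative assumptions, not the exact construction from Google's paper.

```python
# Rotate-then-quantize sketch (assumed general approach, not TurboQuant itself).
import numpy as np

rng = np.random.default_rng(0)
d = 64
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))    # random orthogonal rotation

def encode(x, bits=4):
    levels = 2 ** (bits - 1) - 1
    r = x @ Q                                        # rotate to spread energy
    scale = np.abs(r).max() / levels + 1e-12
    return np.round(r / scale).astype(np.int8), scale

def decode(codes, scale):
    return (codes.astype(np.float32) * scale) @ Q.T  # dequantize, then un-rotate

x = rng.standard_normal(d).astype(np.float32)
codes, scale = encode(x)
x_hat = decode(codes, scale)
print("relative error:", float(np.linalg.norm(x - x_hat) / np.linalg.norm(x)))
```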
Google (GOOG)(GOOGL) revealed a set of new algorithms today designed to reduce the amount of memory needed to run large language models and vector search engines. Shares of major memory and storage ...
Abstract: Vector quantization (VQ), which aims to represent an image with a discrete token sequence, is a fundamental research problem in image synthesis. Existing studies effectively address this ...
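A toy illustration of the setup that abstract describes: each continuous feature vector is replaced by the index of its nearest codebook entry, so an image becomes a sequence of discrete tokens. The codebook size and dimensions here are arbitrary.

```python
# Nearest-codebook-entry vector quantization (toy sizes, random data).
import numpy as np

rng = np.random.default_rng(1)
codebook = rng.standard_normal((512, 16))           # 512 codes, 16-dim each
features = rng.standard_normal((64, 16))            # e.g. an 8x8 grid of latents

# Assign each feature vector to its closest code (squared Euclidean distance).
dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
tokens = dists.argmin(axis=1)                        # the discrete token sequence
reconstruction = codebook[tokens]                    # codebook lookup for decoding

print(tokens[:10])
print("quantization error:", float(((features - reconstruction) ** 2).mean()))
```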
The model is pre-trained on 25T tokens using a Warmup Stable Decay learning rate schedule with a batch size of 3072, a peak learning rate of 1e-3 and a minimum learning rate of 1e-5. The NVFP4 ...
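For readers unfamiliar with the schedule named there, here is a hedged sketch of a Warmup Stable Decay (WSD) learning rate function using the quoted peak (1e-3) and minimum (1e-5). The warmup and decay fractions and the linear decay shape are assumptions for illustration; the snippet above does not specify them.

```python
# Assumed Warmup-Stable-Decay schedule: linear warmup, long flat phase, final decay.
def wsd_lr(step, total_steps, peak=1e-3, minimum=1e-5,
           warmup_frac=0.01, decay_frac=0.1):
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps
    if step < warmup_steps:                       # linear warmup to the peak
        return peak * (step + 1) / warmup_steps
    if step < decay_start:                        # stable phase at the peak rate
        return peak
    frac = (step - decay_start) / decay_steps     # linear decay down to the minimum
    return peak + (minimum - peak) * frac

total = 100_000
for s in (0, 500, 50_000, 95_000, 99_999):
    print(s, wsd_lr(s, total))
```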
Integrates dynamic codebook frequency statistics into a transformer attention module. Fuses semantic image features with latent representations of quantization ...