What is Model Quantization? Plain-English definition

AI Basics

What is Model Quantization?

Model quantization is a technical process that shrinks the size of an AI model by reducing the precision of the numbers it uses to process information. Think of it like converting a high-resolution photograph into a slightly smaller file format. The model loses a tiny amount of detail, but it becomes much lighter and faster. This allows complex AI systems to run on standard hardware, such as laptops or mobile devices, without needing massive, specialized data centers to function properly.

Why this matters to you

It helps your organization save money and improve privacy. By running smaller models directly on your own office hardware, you avoid the high costs of cloud computing and keep sensitive company data within your own secure network instead of sending it to an external server.

How you might hear this

We used model quantization to ensure our internal chatbot runs smoothly on standard employee workstations without requiring a constant internet connection.

AI Jargon Buster

Search any AI term, explained in plain English.

Type a term below and search. You will be taken straight to the tool.

Related terms

Model Distillation Compute Cost Latency

AI is changing how you get hired

See how your CV performs against the ATS algorithms that screen candidates before a human ever reads your application.

Try the CV Optimiser →

Understand the bigger picture

How AI job displacement actually works, what it means for your career, and what to do about it. Written by someone who has been in recruitment for 25 years.

When the Ground Shifts →

Job hunting right now?

A 39-page guide covering the whole search: beating the ATS, fixing your CV and LinkedIn, and walking into interviews prepared. Written by a recruiter.

Get the Job Search Playbook →

This tool uses AI to generate your results.