Riiven Threads

JPEG

You've never seen a JPEG. You've seen what five sciences agreed to let your eyes notice.

5 Fields Converge

Every photo on this screen has lost about 90% of its original data. You will never notice, and that fact is deliberate. The decision about what to delete and what to keep was made by five sciences, each minding its own problem: a 1931 color chart, a 1974 math trick, a 1968 vision experiment, a 1960 engineering paper, and a 1952 packing algorithm. JPEG is not a compression format. It is a working model of your eye, written into a file by people working decades and disciplines apart.

64 coefficients
Only 4 to 8 of them matter after the DCT sort; the rest collapse.
50%
About half of all DCT data is detail your eyes cannot resolve anyway.
2:1
Maximum compression without quantization; the real gains live entirely there.
30%
Extra file weight every JPEG would carry if Huffman packing were skipped.

When the fields matured

Each field had to produce a specific result before JPEG could exist as you know it. The timeline below shows when each one arrived.

Gold dashed line: the JPEG specification is approved in 1992 (ITU-T T.81, later published as ISO/IEC 10918-1). Each dot marks when a field matured to produce what JPEG required.

Pull any thread, and the same story unravels.

Sorted by maturation year, from the oldest foundation to the newest refinement.

01

Keystone

The math that sorts every patch of sky

Discrete Cosine Transform math matured 1974 Nasir Ahmed, T. Natarajan, K. R. Rao

Most of a photo is boring. A 1974 math trick can prove it, block by block.

Sort an 8×8 patch of pixels into 'smooth parts' (sky, skin) and 'busy parts' (eyelashes, leaves). Most photos are 90% smooth. The DCT does this sort in microseconds, and once a patch is sorted, the boring parts collapse to almost nothing. JPEG's 10-to-1 size cut starts here.

Without this field

Without the DCT, JPEG has no way to separate perceptually important information (low-frequency structure) from unimportant detail (high-frequency noise). Lossy compression of raw pixel blocks at 10:1 produces visible noise from the first bit discarded.

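After DCT, typical photo blocks have only 4 to 8 significant coefficients out of 64. The transform itself deletes nothing; it concentrates the block's information into roughly a tenth of its values, which is what lets quantization cut the data about 10x.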

How we know

The 8×8 DCT transforms a block of spatial pixel values into 64 frequency coefficients. Natural images concentrate energy in low frequencies, so after DCT most coefficients become small and cheap to encode. JPEG's 10:1 compression depends entirely on this energy compaction.

Source: Discrete Cosine Transform (1974) · tier1
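To make the energy-compaction claim concrete, here is a minimal sketch of the 8×8 DCT applied to one smooth block. The block itself (a gentle brightness ramp) is a made-up stand-in for a patch of sky, and the direct four-loop form is chosen for readability, not speed; real encoders use fast factorizations.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an NxN block (direct form, for clarity)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# Hypothetical smooth patch: brightness rises gently across the block.
# Subtracting 128 is the level shift JPEG applies before the transform.
pixels = [[90 + 5 * x + 3 * y - 128 for y in range(N)] for x in range(N)]

coeffs = dct_2d(pixels)

pixel_energy = sum(p * p for row in pixels for p in row)
coeff_energy = sum(c * c for row in coeffs for c in row)
# Energy held by the four lowest-frequency coefficients (top-left 2x2).
low_energy = sum(coeffs[u][v] ** 2 for u in range(2) for v in range(2))
low_fraction = low_energy / coeff_energy

print(f"DC coefficient: {coeffs[0][0]:.1f}")
print(f"share of energy in 4 of 64 coefficients: {low_fraction:.2%}")
```

The transform keeps the total energy unchanged (it is invertible), but for a smooth block nearly all of that energy ends up in a handful of low-frequency coefficients. A busier block, such as one full of leaves, would spread its energy further.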

Sorting the data is only useful if you know which sorted parts the eye actually needs.

02

Striped patterns that mapped where vision goes blind

Human Visual Perception biology matured 1968 Fergus W. Campbell, John G. Robson

Two scientists in 1968 measured exactly what your eyes are blind to. JPEG throws away that, and only that.

Campbell and Robson asked people to look at striped patterns until the stripes blurred together, then mapped, in numbers, the resolution your eyes can and cannot see. JPEG keeps the data your eyes can resolve and quietly deletes the data they cannot. That deletion is the loss in 'lossy' compression. The file does not shrink arbitrarily. It shrinks exactly where vision is blind.

Without this field

Without the contrast sensitivity function, JPEG has no principled way to decide which DCT coefficients to keep. Quantization without HVS data discards luminance information indiscriminately, producing visible blur rather than imperceptible loss at the same compression ratio.

About half the data inside a JPEG is the part your eyes can't resolve. Quantization deletes exactly that.

How we know

Campbell and Robson (1968) measured the contrast sensitivity function: human eyes respond sharply at 2 to 4 cycles per degree and fall off rapidly above that. JPEG's quantization matrix discards high-frequency DCT coefficients precisely because the eye cannot resolve them.

Source: Application of Fourier analysis to the visibility of gratings (1968) · tier1

Knowing what the eye misses tells you where to delete aggressively and where to go gently.

03

Rounding numbers where your eyes will never notice

Quantization Theory engineering matured 1960 Joel Max

This is the step where JPEG actually deletes the parts of your photo it decided you wouldn't miss.

Quantization is where a JPEG loses most of its information. Color subsampling trims some color resolution up front, and the DCT and Huffman steps only rearrange bits. This step rounds them: aggressively where the eye is blind, gently where the eye is sharp. Without it, files shrink at most 2×. With it, they shrink 10× to 50×, with little or no visible loss at the lower end of that range.

Without this field

Without quantization, JPEG's compression ratio is fundamentally limited to ~2:1 (the DCT's energy compaction without bit reduction). Everything beyond that (the 10:1 to 50:1 ratios consumers actually use) comes from quantization discarding coefficient precision.

Lossless JPEG: ~2:1 compression. With quantization: 10:1 to 50:1.

How we know

Quantization is JPEG's main lossy step (chroma subsampling is the other): divide each DCT coefficient by a quantizer step size, then round to the nearest integer. Max (1960) showed how to minimize expected distortion for a given number of levels. JPEG's standard quantization matrices apply that idea with HVS data and hand tuning: large steps (aggressive) for high frequencies, small steps (gentle) for low.

Source: Quantizing for Minimum Distortion (1960) · tier1
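The divide-and-round step can be sketched in a few lines. The table below is the example luminance table from Annex K of the JPEG specification; the coefficient block is hypothetical (large at low frequencies, decaying toward high ones) and does not come from a real image.

```python
# Example luminance quantization table (JPEG specification, Annex K).
# Step sizes grow toward the bottom-right, i.e. toward high frequencies.
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# Hypothetical DCT coefficients for one block (an assumption for
# illustration): energy concentrated at low frequencies.
coeffs = [[400.0 / (1 + u + v) ** 2 for v in range(8)] for u in range(8)]


def quantize(block, table):
    """Divide each coefficient by its step size and round to an integer."""
    return [[round(block[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


def dequantize(levels, table):
    """What the decoder does: multiply back. The rounding loss stays lost."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]


levels = quantize(coeffs, LUMA_Q)
restored = dequantize(levels, LUMA_Q)

nonzero = sum(1 for row in levels for q in row if q != 0)
errors = [[abs(coeffs[u][v] - restored[u][v]) for v in range(8)]
          for u in range(8)]

print(f"non-zero levels after quantization: {nonzero} of 64")
for row in levels:
    print(" ".join(f"{q:3d}" for q in row))
```

Each coefficient's error is bounded by half its step size, so the large steps in the high-frequency corner allow large errors exactly where the eye is least sensitive, and most of those coefficients round to zero.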

Once detail is selectively discarded, brightness and color still need to be split before packing.

04

A coloring-book layer your eyes trust completely

Color Science physics matured 1931 Commission Internationale de l'Éclairage

A photo's brightness matters more to your eyes than its color. JPEG cuts the file in half before doing anything else, just by knowing this.

Imagine a black-and-white photo with a thin coloring-book layer on top. JPEG splits every photo into exactly that: a sharp brightness layer (Y) and a softer color layer (Cb, Cr). It keeps the brightness at full resolution and halves the color resolution in each direction, so each color layer keeps a quarter of its samples. Your eyes do not notice. The file is 50% smaller before any 'compression' has happened.

Without this field

Without perceptual color spaces, JPEG would compress in RGB, treating all three channels as equally important. Compression artifacts would show up as visible color shifts rather than luminance noise, destroying image structure at modest compression ratios.

Splitting brightness from color and halving the color resolution in each direction: 50% smaller file, eye notices nothing.

How we know

CIE 1931 quantified how wavelengths of light map to perceived color, defining the XYZ color space and the color-matching functions. JPEG uses a derived space, YCbCr, which separates luminance (Y) from chrominance (Cb, Cr). This separation enables chroma subsampling: keep Y at full resolution and halve the color resolution both horizontally and vertically (the common 4:2:0 layout). The raw sample count drops 50% before anything else happens.

Source: CIE 1931 2° Standard Observer (1931) · tier1
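The 50% figure is plain arithmetic, and a short sketch shows where it comes from. The conversion uses the full-range YCbCr coefficients commonly used in JFIF files; the function names and the 2×2 averaging filter are illustrative choices, since encoders are free to downsample in other ways.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range RGB -> YCbCr, as commonly used in JFIF-style JPEG files."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 groups.

    Assumes even width and height; averaging is one simple choice of filter.
    """
    height, width = len(plane), len(plane[0])
    return [[(plane[i][j] + plane[i][j + 1]
              + plane[i + 1][j] + plane[i + 1][j + 1]) / 4
             for j in range(0, width, 2)]
            for i in range(0, height, 2)]


def sample_count(width, height, subsampled):
    """Total samples stored: one luma plane plus two chroma planes."""
    luma = width * height
    chroma_plane = (width // 2) * (height // 2) if subsampled else width * height
    return luma + 2 * chroma_plane


gray = rgb_to_ycbcr(128, 128, 128)   # a neutral gray carries no color signal
tiny_plane = subsample_420([[1, 3], [5, 7]])

full = sample_count(16, 16, subsampled=False)   # 256 + 2 * 256
half = sample_count(16, 16, subsampled=True)    # 256 + 2 * 64
ratio = half / full

print(f"gray pixel in YCbCr: {gray}")
print(f"samples: {full} -> {half} ({ratio:.0%} of the original)")
```

Halving only horizontally (the 4:2:2 layout) would leave two thirds of the samples instead of half, which is why the "in each direction" detail matters for the 50% claim.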

With brightness separated and color halved, whatever remains still needs to be packed as tightly as possible.

05

Short codes for common patterns, long ones for rare

Huffman Coding computer science matured 1952 David A. Huffman

After all the deletion, what's left needs to be packed. A 1952 algorithm packs it almost perfectly.

Common patterns get short codes; rare patterns get long codes. Huffman's algorithm does this packing optimally, within a fraction of a bit of the smallest size mathematically possible for the data that's left. Skip this step and JPEGs would be roughly 30% larger for no visible benefit.

Without this field

Without variable-length entropy codes, JPEG would fall back to fixed-length encoding of the quantized coefficient stream, wasting 25 to 30% of file size. Huffman is what makes the final compression step nearly optimal.

Without smart packing of what's left, every JPEG would be ~30% larger for no visible reason.

How we know

Huffman's 1952 algorithm produces optimal prefix codes for any probability distribution: shorter bit sequences for common symbols, longer for rare ones. JPEG uses Huffman tables tuned to typical quantized-DCT-coefficient statistics, compressing the final stage to within a fraction of a bit of Shannon's theoretical minimum.

Source: A Method for the Construction of Minimum-Redundancy Codes (1952) · tier1
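Huffman's construction is short enough to sketch directly. The symbol stream below is invented to resemble quantized coefficients (mostly zeros, a few small values); a real JPEG encoder codes combined run-length and size symbols instead of raw values, so treat this as a simplified illustration of the packing idea only.

```python
import heapq
import math
from collections import Counter


def huffman_codes(freqs):
    """Build a prefix code: repeatedly merge the two rarest subtrees."""
    # Each heap entry: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantized-coefficient stream: zeros dominate.
stream = [0] * 40 + [1] * 10 + [-1] * 8 + [2] * 4 + [-3] * 2
freqs = Counter(stream)
codes = huffman_codes(freqs)

huffman_bits = sum(len(codes[s]) for s in stream)
# Fixed-length alternative: same number of bits for every symbol.
fixed_bits = len(stream) * math.ceil(math.log2(len(freqs)))

avg_len = huffman_bits / len(stream)
entropy = -sum((n / len(stream)) * math.log2(n / len(stream))
               for n in freqs.values())

for sym, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(f"{sym:3d} -> {code}")
print(f"fixed: {fixed_bits} bits, Huffman: {huffman_bits} bits")
print(f"average {avg_len:.3f} bits/symbol vs entropy {entropy:.3f}")
```

The most common symbol gets a one-bit code, and the average code length lands within one bit per symbol of the entropy, which is the guarantee Huffman's method provides. The savings on this toy stream are larger than the article's 25 to 30% figure because the stream is so heavily skewed toward zero.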

Watch

A visual companion to the fields above.

JPEG DCT Explained

Computerphile

Every image on every site you have ever loaded was filtered through a five-stage model of your own perception, built by people working decades and disciplines apart. A color committee from before television. A vision experiment from before the moon landing. A math paper from the year Nixon resigned. None of them woke up wanting to compress your beach photo. The most powerful engineering you encounter every day is the engineering you cannot see, by design. The lesson isn't that JPEG is clever. It's that the things you 'just see' on the internet are decisions made about your eyes, on your behalf, by people who are mostly dead now.

References

  1. Discrete Cosine Transform (1974) tier1

    Ahmed, Natarajan & Rao, IEEE Transactions on Computers vol. C-23 (1974). The paper that introduced the DCT. Every JPEG encoder still uses this specific transform.

  2. Application of Fourier analysis to the visibility of gratings (1968) tier1

    Campbell & Robson, Journal of Physiology vol. 197 (1968). Established that human visual sensitivity drops sharply above 2 to 4 cycles per degree, which is exactly what JPEG exploits.

  3. Quantizing for Minimum Distortion (1960) tier1

    Joel Max, IRE Transactions on Information Theory vol. IT-6 (1960). Established the optimal-quantizer design that JPEG's quantization matrices approximate.

  4. CIE 1931 2° Standard Observer (1931) tier1

    Proceedings of the Commission Internationale de l'Éclairage, 1931. The color-matching functions that quantify human trichromatic perception. Every color space since rests on this foundation.

  5. A Method for the Construction of Minimum-Redundancy Codes (1952) tier1

    David A. Huffman, Proceedings of the IRE vol. 40 (1952). The algorithm every JPEG encoder still uses for its final entropy-coding stage.
