Quantization (image processing)

Quantization, in image processing, is a lossy compression technique that maps a range of values to a single quantum (discrete) value. Reducing the number of distinct symbols in a stream makes the stream more compressible. For example, reducing the number of colors required to represent a digital image makes it possible to reduce its file size. Specific applications include DCT data quantization in JPEG and DWT data quantization in JPEG 2000.

Color quantization

Main article: Color quantization

Color quantization reduces the number of colors used in an image; this is important for displaying images on devices that support a limited number of colors and for efficiently compressing certain kinds of images. Most bitmap editors and many operating systems have built-in support for color quantization. Popular modern color quantization algorithms include the nearest color algorithm (for fixed palettes), the median cut algorithm, and an algorithm based on octrees.

It is common to combine color quantization with dithering to create an impression of a larger number of colors and eliminate banding artifacts.
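
As a minimal sketch of the nearest color algorithm for a fixed palette, the following Python maps each pixel to the palette entry with the smallest squared Euclidean distance in RGB space. The four-color palette, the sample image, and the function names are illustrative assumptions and are not taken from any particular editor or operating system.

```python
def nearest_color(pixel, palette):
    """Return the palette entry closest to `pixel` in RGB space."""
    def squared_distance(color):
        return sum((a - b) ** 2 for a, b in zip(pixel, color))
    return min(palette, key=squared_distance)


def quantize_image(pixels, palette):
    """Replace every pixel in a 2-D list of RGB tuples by its nearest palette color."""
    return [[nearest_color(p, palette) for p in row] for row in pixels]


# Hypothetical fixed palette: black, white, red, blue.
PALETTE = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]

# Hypothetical 2x2 image.
image = [
    [(250, 10, 10), (20, 20, 30)],
    [(200, 220, 255), (10, 10, 200)],
]

quantized = quantize_image(image, PALETTE)
```

Plain RGB distance is the simplest choice; distances in a perceptual color space may give visually better matches at a higher cost.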

Grayscale quantization

Grayscale quantization, also known as gray level quantization, is a process in digital image processing that reduces the number of unique intensity levels (shades of gray) in an image while preserving its essential visual information. This technique is commonly used for simplifying images, reducing storage requirements, and facilitating processing operations. An image with N intensity levels is converted into an image with L levels, where L < N, by mapping each pixel's original intensity value to one of the new intensity levels.

One of the simplest methods of grayscale quantization is uniform quantization, where the intensity range is divided into equal intervals and each interval is represented by a single intensity value. For example, consider an image with intensity levels ranging from 0 to 255 (8-bit grayscale). Quantizing it to 4 levels gives the intervals [0, 63], [64, 127], [128, 191], and [192, 255]. Each interval is represented by its midpoint intensity value (rounded down to an integer), resulting in intensity levels of 31, 95, 159, and 223 respectively.

The formula for uniform quantization is:

$$Q(x) = \left\lfloor \frac{x}{\Delta} \right\rfloor \times \Delta + \frac{\Delta}{2}$$

where:

  • Q(x) is the quantized intensity value.
  • x is the original intensity value.
  • Δ is the size of each quantization interval.
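
The formula can be sketched directly in Python. The function name `uniform_quantize` is an assumption made for illustration, and the interval size is passed in explicitly rather than derived from a level count. With an interval size of 64 (the 4-level, 8-bit case), the formula yields 32, 96, 160, and 224, which is one above the rounded-down integer midpoints 31, 95, 159, and 223, because Δ/2 is taken over the continuous interval.

```python
import math


def uniform_quantize(x, delta):
    """Map intensity x to the midpoint of its quantization interval of width delta."""
    return math.floor(x / delta) * delta + delta / 2


# Four levels over an 8-bit range: interval width 256 / 4 = 64.
levels = sorted({uniform_quantize(x, 64) for x in range(256)})
```

Here `levels` collects every distinct output over the full 8-bit input range, which is a quick way to check that a given interval size really produces the intended number of levels.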

As a worked example, quantize an original intensity value of 147 to 3 intensity levels.

Original intensity value: x = 147

Desired intensity levels: L = 3

First calculate the size of each quantization interval. An 8-bit image has 256 possible intensity values, so dividing that range into L equal intervals gives:

$$\Delta = \frac{256}{L} = \frac{256}{3} \approx 85.33$$

Using the uniform quantization formula:

$$Q(x) = \left\lfloor \frac{147}{85.33} \right\rfloor \times 85.33 + \frac{85.33}{2}$$

$$Q(x) = \left\lfloor 1.72 \right\rfloor \times 85.33 + 42.67$$

$$Q(x) = 1 \times 85.33 + 42.67 = 128$$

So, the quantized intensity value of 147 to 3 levels is 128, the midpoint of the middle interval.

Frequency quantization for image compression

The human eye is fairly good at seeing small differences in brightness over a relatively large area, but not so good at distinguishing the exact strength of a high frequency (rapidly varying) brightness variation. This fact allows one to reduce the amount of information required by ignoring the high frequency components. This is done by simply dividing each component in the frequency domain by a constant for that component, and then rounding to the nearest integer. This is the main lossy operation in the whole process. As a result of this, it is typically the case that many of the higher frequency components are rounded to zero, and many of the rest become small positive or negative numbers.

As human vision is also more sensitive to luminance than chrominance, further compression can be obtained by working in a non-RGB color space which separates the two (e.g., YCbCr), and quantizing the channels separately.
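
The following sketch separates luminance from chrominance and then quantizes the channels with different step sizes. It assumes the full-range conversion constants used by JFIF (derived from ITU-R BT.601); other standards use different constants. The step sizes (4 for luma, 16 for chroma) and the function names are hypothetical values chosen only to show coarser quantization of chrominance.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (JFIF-style constants, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def quantize_channel(value, step):
    """Round a non-negative value to the nearest multiple of step (ties round up)."""
    return int(value / step + 0.5) * step


def quantize_pixel(r, g, b, luma_step=4, chroma_step=16):
    """Quantize luma finely and chroma coarsely (step sizes are illustrative)."""
    y, cb, cr = rgb_to_ycbcr(r, g, b)
    return (
        quantize_channel(y, luma_step),
        quantize_channel(cb, chroma_step),
        quantize_channel(cr, chroma_step),
    )
```

A neutral gray such as (128, 128, 128) maps to Y = Cb = Cr = 128 under these constants, so only colored pixels lose information in the coarser chroma channels.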

Quantization matrices

A typical video codec works by breaking the picture into discrete blocks (8×8 pixels in the case of MPEG). These blocks can then be subjected to a discrete cosine transform (DCT) to calculate the frequency components, both horizontally and vertically. The resulting block (the same size as the original block) is then pre-multiplied by the quantization scale code and divided element-wise by the quantization matrix, and each resulting element is rounded. The quantization matrix is designed to give more resolution to more perceivable frequency components than to less perceivable ones (usually lower frequencies over higher frequencies), while also reducing as many components as possible to 0, which can be encoded most efficiently. Many video encoders (such as DivX, Xvid, and 3ivx) and compression standards (such as MPEG-2 and H.264/AVC) allow custom matrices to be used. The extent of the reduction may be varied by changing the quantizer scale code, which takes up much less bandwidth than a full quantizer matrix.

This is an example of a DCT coefficient matrix:

$$\begin{bmatrix}
-415 & -33 & -58 & 35 & 58 & -51 & -15 & -12 \\
5 & -34 & 49 & 18 & 27 & 1 & -5 & 3 \\
-46 & 14 & 80 & -35 & -50 & 19 & 7 & -18 \\
-53 & 21 & 34 & -20 & 2 & 34 & 36 & 12 \\
9 & -2 & 9 & -5 & -32 & -15 & 45 & 37 \\
-8 & 15 & -16 & 7 & -8 & 11 & 4 & 7 \\
19 & -28 & -2 & -26 & -2 & 7 & -44 & -21 \\
18 & 25 & -12 & -44 & 35 & 48 & -37 & -3
\end{bmatrix}$$

A common quantization matrix is:

$$\begin{bmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{bmatrix}$$

Dividing the DCT coefficient matrix element-wise by this quantization matrix and rounding to integers results in:

$$\begin{bmatrix}
-26 & -3 & -6 & 2 & 2 & -1 & 0 & 0 \\
0 & -3 & 4 & 1 & 1 & 0 & 0 & 0 \\
-3 & 1 & 5 & -1 & -1 & 0 & 0 & 0 \\
-4 & 1 & 2 & -1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$

For example, using −415 (the DC coefficient) and rounding to the nearest integer:

$$\mathrm{round}\left(\frac{-415}{16}\right) = \mathrm{round}(-25.9375) = -26$$
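
The element-wise division and rounding can be sketched in plain Python using the two matrices from this example. The function names are illustrative. Rounding ties away from zero is an assumption, chosen because it reproduces the quantized matrix given here (for instance 9/18 becomes 1 and 49/14 becomes 4); Python's built-in `round` would send the 0.5 tie to 0 instead.

```python
DCT = [
    [-415, -33, -58, 35, 58, -51, -15, -12],
    [5, -34, 49, 18, 27, 1, -5, 3],
    [-46, 14, 80, -35, -50, 19, 7, -18],
    [-53, 21, 34, -20, 2, 34, 36, 12],
    [9, -2, 9, -5, -32, -15, 45, 37],
    [-8, 15, -16, 7, -8, 11, 4, 7],
    [19, -28, -2, -26, -2, 7, -44, -21],
    [18, 25, -12, -44, 35, 48, -37, -3],
]

QUANT = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_half_away(numerator, denominator):
    """Integer division rounded to nearest, ties away from zero (denominator > 0)."""
    sign = -1 if numerator < 0 else 1
    return sign * ((2 * abs(numerator) + denominator) // (2 * denominator))


def quantize_block(coefficients, matrix):
    """Divide each coefficient by the matching matrix entry and round."""
    return [
        [round_half_away(c, q) for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coefficients, matrix)
    ]


def dequantize_block(quantized, matrix):
    """Approximate the original coefficients by multiplying back."""
    return [
        [v * q for v, q in zip(v_row, q_row)]
        for v_row, q_row in zip(quantized, matrix)
    ]


quantized = quantize_block(DCT, QUANT)
restored = dequantize_block(quantized, QUANT)
```

Multiplying back shows where the loss occurs: the DC coefficient −415 is restored as −26 × 16 = −416, and every coefficient that was rounded to 0 is lost entirely.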

Typically this process will result in matrices with values primarily in the upper left (low frequency) corner. By using a zig-zag ordering to group the non-zero entries and run length encoding, the quantized matrix can be much more efficiently stored than the non-quantized version.
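
A simplified sketch of the zig-zag scan followed by run-length encoding is shown below, applied to the quantized matrix from this example. The (zero run, value) pair format and the function names are assumptions for illustration; real codecs combine the run lengths with entropy coding and an explicit end-of-block symbol, which this sketch omits.

```python
QUANTIZED = [
    [-26, -3, -6, 2, 2, -1, 0, 0],
    [0, -3, 4, 1, 1, 0, 0, 0],
    [-3, 1, 5, -1, -1, 0, 0, 0],
    [-4, 1, 2, -1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]


def zigzag(block):
    """Read a square block along its anti-diagonals, alternating direction."""
    n = len(block)
    order = sorted(
        ((i, j) for i in range(n) for j in range(n)),
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        key=lambda ij: (ij[0] + ij[1], ij[0] if (ij[0] + ij[1]) % 2 else -ij[0]),
    )
    return [block[i][j] for i, j in order]


def run_length_encode(sequence):
    """Return (zeros_before, value) pairs for non-zero values, plus the trailing zero count."""
    pairs = []
    zero_run = 0
    for value in sequence:
        if value == 0:
            zero_run += 1
        else:
            pairs.append((zero_run, value))
            zero_run = 0
    return pairs, zero_run


scan = zigzag(QUANTIZED)
pairs, trailing_zeros = run_length_encode(scan)
```

Because the zig-zag order visits low frequencies first, the non-zero values cluster at the start of `scan` and the long tail of zeros collapses into a single trailing count.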

References

  1. John Wiseman, An Introduction to MPEG Video Compression, https://web.archive.org/web/20111115004238/http://www.john-wiseman.com/technical/MPEG_tutorial.htm

  2. Smith, Steven W. (2003). Digital signal processing: a practical guide for engineers and scientists. Demystifying technology series. Amsterdam Boston: Newnes. ISBN 978-0-7506-7444-7.