Golomb Coding

Golomb coding - Wikipedia

Jump to content

Donate

Create account

Personal tools

Donate

Create account

Golomb coding

14 languages

Català Čeština Deutsch Español فارسی Suomi Français 日本語 한국어 Polski Português Русский Українська 中文

Edit links

From Wikipedia, the free encyclopedia

Not to be confused with Exponential-Golomb coding.

Lossless data compression method

This section needs additional citations for verification . Relevant discussion may be found on the talk page. Please help improve this article by adding citations to reliable sources in this section. Unsourced material may be challenged and removed. Find sources: "Golomb coding" – news · newspapers · books · scholar · JSTOR (October 2024) (Learn how and when to remove this message)

Golomb coding is a lossless data compression method using a family of data compression codes invented by Solomon W. Golomb in the 1960s. Alphabets following a geometric distribution will have a Golomb code as an optimal prefix code,[1] making Golomb coding highly suitable for situations in which the occurrence of small values in the input stream is significantly more likely than large values.

Rice coding [edit]

Rice coding (invented by Robert F. Rice) denotes using a subset of the family of Golomb codes to produce a simpler (but possibly suboptimal) prefix code. Rice used this set of codes in an adaptive coding scheme; "Rice coding" can refer either to that adaptive scheme or to using that subset of Golomb codes. Whereas a Golomb code has a tunable parameter that can be any positive integer value, Rice codes are those in which the tunable parameter is a power of two. This makes Rice codes convenient for use on a computer, since multiplication and division by 2 can be implemented more efficiently in binary arithmetic.

Rice was motivated to propose this simpler subset due to the fact that geometric distributions are often varying with time, not precisely known, or both, so selecting the seemingly optimal code might not be very advantageous.

Rice coding is used as the entropy encoding stage in a number of lossless image compression and audio data compression methods.

Overview [edit]

Golomb coding example for a source x with geometric distribution, with parameter p(0) = 0.2, using Golomb code with M = 3. The 2-bit code 00 is used 20% of the time; the 3-bit codes 010, 011, and 100 are used over 38% of the time; 4 bits or more are needed in a minority of cases. For this source, entropy = 3.610 bits; for this code with this source, rate = 3.639 bits; therefore redundancy = 0.030 bits, or efficiency = 0.992 bits per bit. Construction of codes [edit]

Golomb coding uses a tunable parameter M to divide an input value x into two parts: q, the result of a division by M, and r, the remainder. The quotient is sent in unary coding, followed by the remainder in truncated binary encoding. When

{\displaystyle M=1}

, Golomb coding is equivalent to unary coding.

Golomb–Rice codes can be thought of as codes that indicate a number by the position of the bin (q), and the offset within the bin (r). The example figure shows the position q and offset r for the encoding of integer x using Golomb–Rice parameter M = 3, with source probabilities following a geometric distribution with p(0) = 0.2.

Formally, the two parts are given by the following expression, where x is the nonnegative integer being encoded:

{\displaystyle q=\left\lfloor {\frac {x}{M}}\right\rfloor }

and

{\displaystyle r=x-qM.}

This image shows the redundancy, in bits, of the Golomb code, when M is chosen optimally, for 1 − p(0) ≥ 0.45 Both q and r will be encoded using variable numbers of bits: q by a unary code, and r by b bits for Rice code, or a choice between b and b+1 bits for Golomb code (i.e. M is not a power of 2), with

log

{\displaystyle b=\lfloor \log _{2}(M)\rfloor }

. If

{\displaystyle r

, then use b bits to encode r; otherwise, use b+1 bits to encode r. Clearly,

log

{\displaystyle b=\log _{2}(M)}

if M is a power of 2 and we can encode all values of r with b bits.

The integer x treated by Golomb was the run length of a Bernoulli process, which has a geometric distribution starting at 0. The best choice of parameter M is a function of the corresponding Bernoulli process, which is parameterized by

{\displaystyle p=P(x=0)}

the probability of success in a given Bernoulli trial. M is either the median of the distribution or the median ±1. It can be determined by these inequalities:

{\displaystyle (1-p)^{M}+(1-p)^{M+1}\leq 1

which are solved by

log

{\displaystyle M=\left\lceil -{\frac {\log(2-p)}{\log(1-p)}}\right\rceil .}

For the example with p(0) = 0.2:

log 1.8

log 0.8

2.634

{\displaystyle M=\left\lceil -{\frac {\log(1.8)}{\log(0.8)}}\right\rceil =\left\lceil 2.634\right\rceil =3.}

The Golomb code for this distribution is equivalent to the Huffman code for the same probabilities, if it were possible to compute the Huffman code for the...

Golomb Coding

Related Articles

Amazon, Facebook, FBI have access to a private intelligence-sharing network

SpaceX not the behemoth everyone thought

The Mirror Is Part of the Machine

Elevated error rates on requests to multiple models

Donald Trump and sons to be 'forever' exempt from tax audits