Algorithmic Probability

dataflow1 pts0 comments

Algorithmic probability - Wikipedia

Jump to content

Search

Search

Donate

Create account

Log in

Personal tools

Donate

Create account

Log in

Algorithmic probability

5 languages

العربية<br>Català<br>Français<br>日本語<br>한국어

Edit links

From Wikipedia, the free encyclopedia

Mathematical method of assigning a prior probability to a given observation

"Universal probability" redirects here. For other uses, see Universal probability (disambiguation).

From observer states to physics via algorithmic probability. A colorful illustration of Observation 5.4 shows the computational ontological model's possible histories, represented as a tree graph. The vertices correspond to the random variable

{\displaystyle \omega _{t}}

(the computational state at time

{\displaystyle t}

), with valid states (black) forming a growing sequence of bit strings, while invalid states are shown in gray. Alice and Bob, observing an approaching meteorite, are part of the computational process, represented by

{\displaystyle f_{A}(\omega _{t}')}

and

{\displaystyle f_{B}(\omega _{t}')}

, respectively. Their shared emergent reality transitions probabilistically: with 99% likelihood, the meteorite hits Bob3rd, and with 1%, it misses. Postulates 3.2 imply this outcome is irrelevant for Bob1st, whose state transitions elsewhere, as

{\displaystyle f_{B}}

generates a semimeasure, not a measure. The fate of Bob1st lies beyond the scope of Postulates 3.2, awaiting Postulates 3.1.[1]<br>In algorithmic information theory, algorithmic probability , also known as Solomonoff probability , is a mathematical method of assigning a prior probability to a given observation. It was invented by Ray Solomonoff in the 1960s.[2]<br>It is used in inductive inference theory and analyses of algorithms. In his general theory of inductive inference, Solomonoff uses the method together with Bayes' rule to obtain probabilities of prediction for an algorithm's future outputs.[3]

In the mathematical formalism used, the observations have the form of finite binary strings viewed as outputs of Turing machines, and the universal prior is a probability distribution over the set of finite binary strings calculated from a probability distribution over programs (that is, inputs to a universal Turing machine). The prior is universal in the<br>Turing-computability sense, i.e. no string has zero probability. It is not computable, but it can be approximated.[4]

Formally, the probability

{\displaystyle P}

is not a probability and it is not computable. It is only "lower semi-computable" and a "semi-measure". By "semi-measure", it means that

{\displaystyle 0\leq \sum _{x}P(x)

. That is, the "probability" does not actually sum up to one, unlike actual probabilities. This is because some inputs to the Turing machine causes it to never halt, which means the probability mass allocated to those inputs is lost. By "lower semi-computable", it means there is a Turing machine that, given an input string

{\displaystyle x}

, can print out a sequence

{\displaystyle y_{1}

that converges to

{\displaystyle P(x)}

from below, but there is no such Turing machine that does the same from above.

Overview<br>[edit]

Algorithmic probability is the main ingredient of Solomonoff's theory of inductive inference, the theory of prediction based on observations; it was invented with the goal of using it for machine learning; given a sequence of symbols, which one will come next? Solomonoff's theory provides an answer that is optimal in a certain sense, although it is incomputable.

Four principal inspirations for Solomonoff's algorithmic probability were: Occam's razor, Epicurus' principle of multiple explanations, modern computing theory (e.g. use of a universal Turing machine) and Bayes’ rule for prediction.[5]

Occam's razor and Epicurus' principle are essentially two different non-mathematical approximations of the universal prior.

Occam's razor: among the theories that are consistent with the observed phenomena, one should select the simplest theory.[6]

Epicurus' principle of multiple explanations: if more than one theory is consistent with the observations, keep all such theories.[7]

At the heart of the universal prior is an abstract model of a computer, such as a universal Turing machine.[8] Any abstract computer will do, as long as it is Turing-complete, i.e. every computable function has at least one program that will compute its application on the abstract computer.

The abstract computer is used to give precise meaning to the phrase "simple explanation". In the formalism used, explanations, or theories of phenomena, are computer programs that generate observation strings when run on the abstract computer. Each computer program is assigned a weight corresponding to its length. The universal probability distribution is the probability distribution on all possible output strings with random input, assigning for each finite output prefix q the sum of the probabilities of the programs that...

probability universal displaystyle theory turing algorithmic

Related Articles