Poisson-Dirichlet distribution

Definition and first properties of the Poisson-Dirichlet distributions

In probability theory, the Poisson-Dirichlet distributions are probability distributions on the set of nonnegative, non-increasing sequences with sum 1, depending on two parameters $\alpha \in [0,1)$ and $\theta \in (-\alpha ,\infty )$. They can be defined as follows. Consider independent random variables $(Y_{n})_{n\geq 1}$ such that $Y_{n}$ follows the beta distribution with parameters $1-\alpha$ and $\theta +n\alpha$. Then the Poisson-Dirichlet distribution $PD(\alpha ,\theta )$ with parameters $\alpha$ and $\theta$ is the law of the random sequence obtained by arranging $Y_{1}$ and the products $Y_{n}\prod _{k=1}^{n-1}(1-Y_{k})$, for $n\geq 2$, in non-increasing order. This definition is due to Jim Pitman and Marc Yor. It generalizes Kingman's law, which corresponds to the particular case $\alpha =0$.
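As an illustration, the construction above can be simulated by truncating the sequence $(Y_{n})$ after finitely many terms. The following Python sketch is not taken from the cited sources: the function name `sample_pd` and the truncation level `n_sticks` are assumptions made for this example, and the output is only an approximation of a $PD(\alpha ,\theta )$ sample, since the terms beyond the truncation are dropped.

```python
import random

def sample_pd(alpha, theta, n_sticks=1000, rng=None):
    """Approximate draw from PD(alpha, theta) by truncated stick-breaking.

    Returns (weights, leftover): the first n_sticks terms sorted in
    non-increasing order, and the mass not yet assigned to any term.
    """
    if not (0 <= alpha < 1) or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    rng = rng or random.Random()
    remaining = 1.0          # product of (1 - Y_k) over the terms so far
    weights = []
    for n in range(1, n_sticks + 1):
        # Y_n ~ Beta(1 - alpha, theta + n * alpha), independent across n
        y = rng.betavariate(1 - alpha, theta + n * alpha)
        weights.append(remaining * y)   # Y_n * prod_{k<n} (1 - Y_k)
        remaining *= 1 - y
    weights.sort(reverse=True)          # rank the terms
    return weights, remaining
```

For example, `sample_pd(0.0, 1.0)` gives an approximate $PD(0,1)$ sample. Every omitted term is at most the returned leftover mass, so when that leftover is small the largest returned weights match those of the untruncated sequence.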

Number theory

Patrick Billingsley proved the following result: if $n$ is a uniform random integer in $\{2,3,\dots ,N\}$, if $k\geq 1$ is a fixed integer, and if $p_{1}\geq p_{2}\geq \dots \geq p_{k}$ are the $k$ largest prime divisors of $n$ (with $p_{j}$ arbitrarily defined if $n$ has fewer than $j$ prime factors), then the joint distribution of $(\log p_{1}/\log n,\log p_{2}/\log n,\dots ,\log p_{k}/\log n)$ converges to the law of the first $k$ elements of a $PD(0,1)$-distributed random sequence as $N$ goes to infinity.
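The vector appearing in this result can be computed directly for small integers. The Python sketch below is an illustration only: the function names are hypothetical, prime divisors are counted with multiplicity, and missing $p_{j}$ are set to 1 (one possible choice for the arbitrary definition mentioned above). Trial division is used for simplicity and is practical only for moderate $N$; the convergence in $N$ is on a logarithmic scale, so simulations at accessible sizes should be read with caution.

```python
import math
import random

def prime_factors_desc(n):
    """Prime factors of n >= 2 with multiplicity, largest first (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                 # what is left is a prime factor
        factors.append(n)
    return factors[::-1]

def log_ratios(n, k):
    """Vector (log p_1 / log n, ..., log p_k / log n) for an integer n >= 2."""
    factors = prime_factors_desc(n)[:k]
    factors += [1] * (k - len(factors))   # assumption: pad missing p_j with 1
    return [math.log(p) / math.log(n) for p in factors]

def sample_log_ratios(N, k, rng=None):
    """Same vector for n drawn uniformly from {2, ..., N}."""
    rng = rng or random.Random()
    return log_ratios(rng.randint(2, N), k)
```

Repeated calls to `sample_log_ratios(N, k)` give an empirical distribution that can be compared with the first $k$ coordinates of simulated $PD(0,1)$ sequences.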

Random permutations and Ewens's sampling formula

The Poisson-Dirichlet distribution with parameters $\alpha =0$ and $\theta =1$ is also the limiting distribution, for $N$ going to infinity, of the sequence $(\ell _{1}/N,\ell _{2}/N,\ell _{3}/N,\dots )$, where $\ell _{j}$ is the length of the $j^{\operatorname {th} }$ largest cycle of a uniformly distributed random permutation in the symmetric group ${\mathfrak {S}}_{N}$. If, for $\theta >0$, one replaces the uniform distribution by the distribution $\mathbb {P} _{N,\theta }$ on ${\mathfrak {S}}_{N}$ such that $\mathbb {P} _{N,\theta }(\sigma )={\frac {\theta ^{n(\sigma )}}{\theta (\theta +1)\dots (\theta +N-1)}}$, where $n(\sigma )$ is the number of cycles of the permutation $\sigma$, then the limiting distribution is the Poisson-Dirichlet distribution with parameters $\alpha =0$ and $\theta$. The probability distribution $\mathbb {P} _{N,\theta }$ is called Ewens's distribution. It comes from Ewens's sampling formula, first introduced by Warren Ewens in population genetics to describe the probabilities associated with counts of how many different alleles are observed a given number of times in a sample.
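The cycle lengths of a permutation drawn from Ewens's distribution can be simulated without building the permutation itself. The Python sketch below uses a sequential construction usually called the Chinese restaurant process, which is a swapped-in technique not described in this article: with $i$ elements already placed, the next element opens a new cycle with probability $\theta /(\theta +i)$, and otherwise joins an existing cycle with probability proportional to its length. The function name `ewens_cycle_lengths` is an assumption made for this example.

```python
import random

def ewens_cycle_lengths(N, theta, rng=None):
    """Cycle lengths (largest first) of a random permutation of N elements
    under Ewens's distribution with parameter theta > 0, simulated with the
    Chinese restaurant process."""
    if N < 1 or theta <= 0:
        raise ValueError("need N >= 1 and theta > 0")
    rng = rng or random.Random()
    lengths = []
    for i in range(N):                       # i elements placed so far
        if rng.random() < theta / (theta + i):
            lengths.append(1)                # open a new cycle
        else:
            # join an existing cycle, chosen proportionally to its length
            j = rng.choices(range(len(lengths)), weights=lengths)[0]
            lengths[j] += 1
    lengths.sort(reverse=True)
    return lengths
```

Dividing the output by $N$, for instance `[l / N for l in ewens_cycle_lengths(N, theta)]`, gives the finite-$N$ sequence $(\ell _{1}/N,\ell _{2}/N,\dots )$ considered above; taking `theta = 1.0` corresponds to the uniform permutation case.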

References

Pitman, Jim; Yor, Marc (1997). "The two-parameter Poisson–Dirichlet distribution derived from a stable subordinator". Annals of Probability. 25 (2): 855–900. CiteSeerX 10.1.1.69.1273. doi:10.1214/aop/1024404422. MR 1434129. Zbl 0880.60076.
Bourgade, Paul. "Lois de Poisson–Dirichlet". Master's thesis.
Kingman, J. F. C. (1975). "Random discrete distributions". J. Roy. Statist. Soc. Ser. B. 37: 1–22.
Billingsley, P. (1972). "On the distribution of large prime divisors". Periodica Mathematica Hungarica. 2: 283–289.
Ewens, Warren (1972). "The sampling theory of selectively neutral alleles". Theoretical Population Biology. 3: 87–112.
