Poisson-Dirichlet distribution

Definition and first properties of the Poisson-Dirichlet distributions

In probability theory, the Poisson-Dirichlet distributions are probability distributions on the set of nonnegative, non-increasing sequences with sum 1, depending on two parameters $\alpha \in [0,1)$ and $\theta \in (-\alpha ,\infty )$. They can be defined as follows. Consider independent random variables $(Y_{n})_{n\geq 1}$ such that $Y_{n}$ follows the beta distribution with parameters $1-\alpha$ and $\theta +n\alpha$. Then the Poisson-Dirichlet distribution $PD(\alpha ,\theta )$ with parameters $\alpha$ and $\theta$ is the law of the random sequence obtained by arranging $Y_{1}$ and the products $Y_{n}\prod _{k=1}^{n-1}(1-Y_{k})$, $n\geq 2$, in non-increasing order. This definition is due to Jim Pitman and Marc Yor. It generalizes Kingman's law, which corresponds to the particular case $\alpha =0$.
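The stick-breaking construction above can be simulated directly. The following is a minimal sketch using only the Python standard library, not a reference implementation: the function name `sample_pd` and the truncation parameter `n_sticks` are assumptions made for illustration. Because the sequence is infinite, the sketch stops after `n_sticks` terms, so the result is only an approximate draw from PD(α, θ); the mass that has not yet been broken off is returned separately so the truncation error is visible.

```python
import random


def sample_pd(alpha, theta, n_sticks=1000, rng=None):
    """Approximate draw from PD(alpha, theta) by truncated stick-breaking.

    Returns (weights, remaining), where `weights` holds the first
    `n_sticks` stick-breaking terms sorted in non-increasing order and
    `remaining` is the mass left over after truncation.
    """
    if not (0 <= alpha < 1) or theta <= -alpha:
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    if rng is None:
        rng = random.Random()

    remaining = 1.0  # product of (1 - Y_k) over the sticks broken so far
    weights = []
    for n in range(1, n_sticks + 1):
        # Y_n ~ Beta(1 - alpha, theta + n * alpha), independent across n
        y = rng.betavariate(1 - alpha, theta + n * alpha)
        weights.append(remaining * y)  # Y_n * prod_{k<n} (1 - Y_k)
        remaining *= 1 - y

    weights.sort(reverse=True)  # arrange in non-increasing order
    return weights, remaining


if __name__ == "__main__":
    w, rest = sample_pd(alpha=0.0, theta=1.0, n_sticks=200)
    print("largest five weights:", [round(x, 4) for x in w[:5]])
    print("mass lost to truncation:", rest)
```

The leftover mass tends to shrink quickly when α = 0 and more slowly when α is close to 1, so a larger `n_sticks` may be needed in the latter case.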

Number theory

Patrick Billingsley proved the following result: if $n$ is a uniform random integer in $\{2,3,\dots ,N\}$, if $k\geq 1$ is a fixed integer, and if $p_{1}\geq p_{2}\geq \dots \geq p_{k}$ are the $k$ largest prime divisors of $n$ (with $p_{j}$ arbitrarily defined if $n$ has fewer than $j$ prime factors), then the joint distribution of $(\log p_{1}/\log n,\log p_{2}/\log n,\dots ,\log p_{k}/\log n)$ converges to the law of the first $k$ elements of a $PD(0,1)$ distributed random sequence as $N$ goes to infinity.
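The vector appearing in this result can be computed for a concrete integer with a few lines of code. The sketch below is illustrative only: the helper names `prime_factors` and `largest_log_ratios` are hypothetical, prime divisors are counted with multiplicity, and a missing prime divisor contributes the value 0, which is one possible choice for the "arbitrarily defined" case. Trial division is used for simplicity, so it is only practical for moderate values of N.

```python
import math
import random


def prime_factors(n):
    """Prime factors of n >= 2, with multiplicity, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:  # whatever is left is a prime larger than sqrt(original n)
        factors.append(n)
    return factors


def largest_log_ratios(n, k):
    """Return (log p_1 / log n, ..., log p_k / log n) for the k largest
    prime divisors of n, padded with 0.0 (an assumed convention) when n
    has fewer than k prime factors."""
    primes = sorted(prime_factors(n), reverse=True)[:k]
    ratios = [math.log(p) / math.log(n) for p in primes]
    return ratios + [0.0] * (k - len(ratios))


if __name__ == "__main__":
    N, k = 10**6, 3
    rng = random.Random()
    for _ in range(5):
        n = rng.randint(2, N)  # uniform on {2, ..., N}
        print(n, [round(r, 3) for r in largest_log_ratios(n, k)])
```

Averaging many such vectors and comparing them with simulated PD(0, 1) sequences gives an empirical view of the convergence, which is slow because it takes place on a logarithmic scale.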

Random permutations and Ewens's sampling formula

The Poisson-Dirichlet distribution with parameters $\alpha =0$ and $\theta =1$ is also the limiting distribution, as $N$ goes to infinity, of the sequence $(\ell _{1}/N,\ell _{2}/N,\ell _{3}/N,\dots )$, where $\ell _{j}$ is the length of the $j^{\text{th}}$ largest cycle of a uniformly distributed random permutation of $N$ elements. If, for $\theta >0$, one replaces the uniform distribution by the distribution $\mathbb {P} _{N,\theta }$ on the symmetric group ${\mathfrak {S}}_{N}$ such that $\mathbb {P} _{N,\theta }(\sigma )={\frac {\theta ^{n(\sigma )}}{\theta (\theta +1)\dots (\theta +N-1)}}$, where $n(\sigma )$ is the number of cycles of the permutation $\sigma$, then the limit is the Poisson-Dirichlet distribution with parameters $\alpha =0$ and $\theta$. The probability distribution $\mathbb {P} _{N,\theta }$ is called Ewens's distribution. It comes from Ewens's sampling formula, first introduced by Warren Ewens in population genetics to describe the probabilities associated with counts of how many different alleles are observed a given number of times in a sample.
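Cycle lengths under Ewens's distribution can be simulated without building the permutation explicitly. The sketch below swaps in a standard sequential scheme (often called the Chinese restaurant process) rather than sampling from the formula directly: when i elements have already been placed, the next element starts a new cycle with probability θ/(θ + i) and otherwise joins an existing cycle with probability proportional to its length. The function name `ewens_cycle_lengths` is an assumption made for illustration.

```python
import random


def ewens_cycle_lengths(N, theta, rng=None):
    """Cycle lengths, in non-increasing order, of a random permutation of
    N elements drawn from Ewens's distribution with parameter theta > 0."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if rng is None:
        rng = random.Random()

    lengths = []
    for i in range(N):  # i elements have been placed so far
        if rng.random() < theta / (theta + i):
            lengths.append(1)  # the new element opens its own cycle
        else:
            # Pick one of the i placed elements uniformly at random and
            # join its cycle; this selects a cycle with probability
            # proportional to its current length.
            r = rng.randrange(i)
            acc = 0
            for j, length in enumerate(lengths):
                acc += length
                if r < acc:
                    lengths[j] += 1
                    break

    lengths.sort(reverse=True)
    return lengths


if __name__ == "__main__":
    N = 10_000
    lengths = ewens_cycle_lengths(N, theta=1.0)
    # Normalized lengths (l_1/N, l_2/N, ...) as in the text
    print([round(length / N, 4) for length in lengths[:5]])
```

With `theta=1.0` the scheme reproduces the cycle structure of a uniformly distributed permutation, so for large N the normalized output should resemble the leading terms of a PD(0, 1) sequence.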

References

  1. Pitman, Jim; Yor, Marc (1997). "The two-parameter Poisson–Dirichlet distribution derived from a stable subordinator". Annals of Probability. 25 (2): 855–900. CiteSeerX 10.1.1.69.1273. doi:10.1214/aop/1024404422. MR 1434129. Zbl 0880.60076.
  2. Bourgade, Paul. "Lois de Poisson–Dirichlet". Master's thesis.
  3. Kingman, J. F. C. (1975). "Random discrete distributions". J. Roy. Statist. Soc. Ser. B. 37: 1–22.
  4. Billingsley, P. (1972). "On the distribution of large prime divisors". Periodica Mathematica. 2: 283–289.
  5. Ewens, Warren (1972). "The sampling theory of selectively neutral alleles". Theoretical Population Biology. 3: 87–112.