Dirichlet distribution (course Emilion)

Uploaded by bocau
Length: 95 pages
Chapter 1
Dirichlet distribution
The Dirichlet distribution is widely used in various fields: biology ([11]), astronomy ([24]), text mining ([4]), ...
It can be seen as a random distribution on a finite set. The Dirichlet distribution is a very popular prior in Bayesian statistics because, for multinomial observations, the posterior distribution is also a Dirichlet distribution. In this chapter we give a complete presentation of this interesting law: representation by Gamma distributions, limit distribution in a contamination model (the Polya urn scheme), ...

1.1 Random probability vectors

Consider a partition of a nonempty finite set $E$ with cardinality $\sharp E = n \in \mathbb{N}^*$ into $d$ nonempty disjoint subsets. To such a partition corresponds a partition of the integer $n$, say $c_1, \dots, c_d$, that is, a finite family of positive integers such that $c_1 + \dots + c_d = n$. Thus, if $p_j = \frac{c_j}{n}$, we have $p_1 + \dots + p_d = 1$.

In biology, for example, $p_j$ can represent the proportion of the $j$-th species in a population.

So we are led to introduce the following $(d-1)$-dimensional simplex:

$$\triangle_{d-1} = \Big\{(p_1, \dots, p_d) : p_j \ge 0, \ \sum_{j=1}^{d} p_j = 1\Big\}.$$

When n tends to infinity, this leads to the following notion:
Definition 1.1.1. A mass-partition is any infinite numerical sequence
$$p = (p_1, p_2, \dots)$$
such that $p_1 \ge p_2 \ge \dots$ and $\sum_{j=1}^{\infty} p_j = 1$.

The space of mass-partitions is denoted by

$$\nabla_{\infty} = \Big\{(p_1, p_2, \dots) : p_1 \ge p_2 \ge \dots ; \ p_j \ge 0, \ j \ge 1, \ \sum_{j=1}^{\infty} p_j = 1\Big\}.$$

Lemma 1.1.1. (Bertoin [28] page 63) Let $x_1, \dots, x_{d-1}$ be $d-1$ i.i.d. random variables uniformly distributed on $[0, 1]$ and let $x_{(1)} < \dots < x_{(d-1)}$ denote their order statistics. Then the random vector
$$(x_{(1)}, \ x_{(2)} - x_{(1)}, \ \dots, \ x_{(d-1)} - x_{(d-2)}, \ 1 - x_{(d-1)})$$
is uniformly distributed on $\triangle_{d-1}$.
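
The construction in Lemma 1.1.1 can be tried out numerically. The following is only a minimal sketch, assuming Python's standard `random` module; the function name `uniform_simplex_by_spacings` is a hypothetical choice for illustration and does not come from the course notes. It cuts [0, 1] at d − 1 sorted uniform points and returns the d successive gaps.

```python
import random

def uniform_simplex_by_spacings(d, rng):
    """Return the d spacings of d-1 sorted uniform points on [0, 1].

    Hypothetical helper: by Lemma 1.1.1 the returned vector should be
    uniformly distributed on the simplex of probability vectors of length d.
    """
    cuts = sorted(rng.random() for _ in range(d - 1))  # order statistics
    bounds = [0.0] + cuts + [1.0]                      # add the endpoints
    return [bounds[i + 1] - bounds[i] for i in range(d)]

rng = random.Random(0)  # fixed seed only for reproducibility
p = uniform_simplex_by_spacings(4, rng)
print(p, sum(p))
```

Each call produces a vector with nonnegative entries summing to 1 (up to floating-point rounding), that is, a point of the simplex.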

1.2 Polya urn (Blackwell and MacQueen) [3]

We consider an urn containing balls of $d$ colors, the colors being numbered from 1 to $d$. Initially, there is exactly one ball of each color in the urn. We draw a ball, observe its color, and put it back in the urn together with another ball of the same color. Thus at time $n$ there are $n + d$ balls in the urn, and we have added $n = N_1 + \dots + N_d$ balls, of which $N_j$ are of color $j$.
We are going to show that the distribution of $\left(\frac{N_1}{n}, \frac{N_2}{n}, \dots, \frac{N_d}{n}\right)$ converges to a limit distribution.
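
The urn scheme described above can be simulated directly. This is a minimal sketch, assuming Python's standard `random` module; the function name `polya_urn` and the parameter values are hypothetical choices for illustration, not part of the course notes. It starts with one ball per color, draws a color with probability proportional to its current count, adds one ball of that color, and reports the added counts $N_1, \dots, N_d$.

```python
import random

def polya_urn(d, n, rng):
    """Simulate n draws from the urn started with one ball of each of d colors.

    Hypothetical helper: returns [N_1, ..., N_d], the number of balls
    added for each color, so the entries sum to n.
    """
    balls = [1] * d  # current number of balls of each color
    for _ in range(n):
        # Draw a color with probability proportional to its ball count.
        j = rng.choices(range(d), weights=balls)[0]
        balls[j] += 1  # put the ball back with one more of the same color
    return [b - 1 for b in balls]  # subtract the initial ball of each color

rng = random.Random(0)  # fixed seed only for reproducibility
n = 1000
counts = polya_urn(d=3, n=n, rng=rng)
print([c / n for c in counts])  # empirical proportions N_j / n
```

Running the simulation several times with different seeds gives visibly different proportion vectors, which is consistent with the limit being a random vector rather than a constant.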

1.2.1 Markov chain
Proposition 1.2.1.
$$\lim_{n \to \infty} \left(\frac{N_1}{n}, \dots, \frac{N_d}{n}\right) \overset{d}{=} (Z_1, Z_2, \dots, Z_d)$$
where $(Z_1, Z_2, \dots, Z_d)$ has a uniform distribution on the simplex △d...