As in Dirichlet clustering, the Dirichlet process can be represented in any of the following ways:
- Chinese Restaurant Process
- Stick Breaking Process
- Pólya Urn Model
For instance, consider the Chinese Restaurant Process. It works as follows:
- Initially the restaurant is empty
- The first person to enter (Alice) sits down at a table (selects a group).
- The second person to enter (Bob) sits down at a table.
- Which table does he sit at?
- He sits down at a new table with probability α/(1+α).
- He sits at the existing table with Alice (meaning he joins the existing group) with probability 1/(1+α).
- The (n+1)-st person sits down at a new table with probability α/(n+α), and at table k with probability n_k/(n+α), where n_k is the number of people currently sitting at table k.
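The seating rule above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names `seating_probabilities` and `simulate_crp`, the `seed` parameter, and the requirement that `alpha` be positive are assumptions made for the example. Each arrival is seated by a random draw weighted by the probabilities in the list, which is the usual reading of "with probability p".

```python
import random


def seating_probabilities(table_counts, alpha):
    """Return [P(table 1), ..., P(table K), P(new table)] for the next arrival.

    table_counts[k] is n_k, the number of people already at table k.
    Assumes alpha > 0, so an empty restaurant gives [1.0].
    """
    n = sum(table_counts)
    probs = [n_k / (n + alpha) for n_k in table_counts]
    probs.append(alpha / (n + alpha))  # last entry: open a new table
    return probs


def simulate_crp(num_customers, alpha, seed=None):
    """Seat customers one by one; return (assignments, table_counts)."""
    rng = random.Random(seed)
    table_counts = []
    assignments = []
    for _ in range(num_customers):
        probs = seating_probabilities(table_counts, alpha)
        # Weighted random draw over existing tables plus one new table.
        choice = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if choice == len(table_counts):
            table_counts.append(1)      # a new table is opened
        else:
            table_counts[choice] += 1   # join an existing table
        assignments.append(choice)
    return assignments, table_counts


if __name__ == "__main__":
    assignments, counts = simulate_crp(num_customers=10, alpha=1.0, seed=0)
    print(assignments)
    print(counts)
```

Running it with different seeds gives different seatings, while larger values of `alpha` tend to open more tables.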
My question is this. Initially, the first person joins, say, G1 (i.e. group 1). The second person then joins either
- a new group, G2, with probability α/(1+α) = P(N), or
- the existing group, G1, with probability 1/(1+α) = P(E).
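To make these two values concrete, here is a small worked calculation. The value α = 1/2 is an assumption chosen purely for illustration, and the variable names are hypothetical.

```python
from fractions import Fraction

alpha = Fraction(1, 2)  # assumed concentration parameter, for illustration only
n = 1                   # one person (Alice) is already seated

p_new = alpha / (n + alpha)    # P(N) = α/(1+α)
p_existing = 1 / (n + alpha)   # P(E) = 1/(1+α)

print(p_new, p_existing)       # prints: 1/3 2/3
```

The two values sum to 1, as they must, since the second person has only these two options.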
Now if I calculate the probabilities for the new entry, I'll have values for both, i.e. P(N) and P(E). Then:
- How do I decide which group, G1 or G2, the new entry joins?
- Is it decided by comparing the values of the two probabilities?
That is:
If (P(N) > P(E)), then the _new entry_ joins G2,
and
if (P(E) > P(N)), then the _new entry_ joins G1?