What is the significance of the stationary distribution?

Posted 2019-09-12 15:10

Question:

Let X_n be a Markov chain whose transition matrix P is not regular.

Say we have a stationary distribution (pi_0, ..., pi_n) and P(X_0 = i) = 0.2; does this tell us anything?

To be more clear:

I ask because Karlin says that when a stationary distribution is not a limiting distribution, P(X_n = i) depends on the initial distribution. What exactly does this mean?

Answer 1:

The question in your title requires a lengthy answer; for that I'd just have to point you to references on Markov chains and ergodic theory. However, your specific question:

"...when a stationary dist is not a limiting dist, P(X_n = i) is dependent on the initial distribution. What does this exactly mean?"

can be answered with a simple example. Consider a Markov chain with two states, A and B, and transition matrix

P = [0.4, 0.6]
    [0.6, 0.4]

If I told you that you are currently in state A at time t, and then asked what state you will be in next (at time t+1), you would left-multiply P by the row vector [1, 0] and interpret the result [0.4, 0.6] as meaning you're 40% sure you'll still be in state A and 60% sure you'll end up in state B.
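To make that computation concrete, here is a minimal sketch in Python with NumPy (my choice of language for illustration; the matrix and state order are just the ones from the example above):

```python
import numpy as np

# Transition matrix from the example above:
# row = current state, column = next state (state order: A, B)
P = np.array([[0.4, 0.6],
              [0.6, 0.4]])

# "Currently in state A" as a row-vector distribution: all mass on A
current = np.array([1.0, 0.0])

# One step of the chain: left-multiply P by the current distribution
next_dist = current @ P
print(next_dist)  # [0.4 0.6] -> 40% sure of A, 60% sure of B
```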

Now what if I told you that you were in state A at time t and asked what state you'll be in at time t+999? In all that time, there is so much randomness in how you have bounced around the states that you really wouldn't be able to "follow the chain closely" from the fact that you started at state A. Basically, that information was "mixed around" by the Markov chain until it didn't really matter that you started at A. Just ask yourself: how would your opinion on your state at time t+999 change if I told you that you started in state B rather than state A? You aren't able to conjure up a difference in opinion; that is the invariance.

Mathematically, the transition matrix from time t to time t+999 is P^999. Every row of this matrix will be (essentially) identical, so left-multiplying it by any probability distribution ([x, y] where x + y = 1) gives the same answer. For this chain, that "limiting" distribution is [0.5, 0.5], meaning that after 999 timesteps you would be 50% sure you're in state A and 50% sure you're in state B, regardless of the fact that 999 timesteps ago you started in A. The stationary distribution is the left eigenvector of P corresponding to the eigenvalue 1, normalized so its entries sum to 1. It is called the limiting distribution of P if all the rows of P^t converge to it as t -> inf.
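Here is a small NumPy sketch of that claim: raise P to the 999th power and check the rows, then recover the stationary distribution as the normalized left eigenvector for eigenvalue 1 (the eigenvector route is just one standard way to compute it):

```python
import numpy as np

P = np.array([[0.4, 0.6],
              [0.6, 0.4]])

# 999-step transition matrix: both rows converge to the limiting distribution
P999 = np.linalg.matrix_power(P, 999)
print(P999)  # both rows are ~[0.5 0.5]

# The starting distribution no longer matters:
# any [x, y] with x + y = 1 gives the same answer
print(np.array([1.0, 0.0]) @ P999)  # ~[0.5 0.5]
print(np.array([0.3, 0.7]) @ P999)  # ~[0.5 0.5]

# Stationary distribution: left eigenvector of P for eigenvalue 1,
# normalized to sum to 1 (left eigenvectors of P = right eigenvectors of P.T)
eigvals, eigvecs = np.linalg.eig(P.T)
v = eigvecs[:, np.argmin(np.abs(eigvals - 1.0))].real
pi = v / v.sum()
print(pi)  # [0.5 0.5]
```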

What about a Markov chain that doesn't have a limiting distribution? Consider

P = [0, 1]
    [1, 0]

This Markov chain has a "flip flopping" periodic nature. If at time t you are in state A, you are 100% sure you'll be in state B at time t+1, and vice versa. So if I told you that you are in state A at time t, you would know that at time t+999 you will be in state B, since 999 is odd. Alternatively, if I told you that you are in state B at time t, then at time t+999 you would expect to be in state A. Notice the dependence on the initial condition: this Markov chain is sensitive to that starting information. It doesn't "mix away". Mathematically, P^t does not converge as t -> inf.
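The same experiment for the periodic chain shows the dependence on the starting state directly (again a NumPy sketch of my own, not from the original answer):

```python
import numpy as np

# The periodic "flip flopping" chain
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# P^t alternates between P (t odd) and the identity (t even), so it never converges
print(np.linalg.matrix_power(P, 999))   # [[0 1], [1 0]] -> odd step: states swapped
print(np.linalg.matrix_power(P, 1000))  # [[1 0], [0 1]] -> even step: back where you started

# The distribution at t+999 depends entirely on where you started
print(np.array([1.0, 0.0]) @ np.linalg.matrix_power(P, 999))  # started in A -> surely in B
print(np.array([0.0, 1.0]) @ np.linalg.matrix_power(P, 999))  # started in B -> surely in A
```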

Try playing with these matrices in code!