Problem 2

Define $h(a) = P(\tau_{\phi} < \tau_{\dagger} | X_0=a)$

To show: $P(X_{n+1}=b|X_n=a\ and\ \tau_{\phi} < \tau_{\dagger}) = \frac{h(b)}{h(a)}P_{ab}$

LHS: $P(X_{n+1}=b|X_n=a\ and\ \tau_{\phi} < \tau_{\dagger})$

Incorrect version

LHS: ~~$$\begin{align} P(X_{1}=b|X_0=a\ and\ \tau_{\phi} < \tau_{\dagger}) &= P(X_{1}=b|X_0=a,\ \tau_{\phi} < \tau_{\dagger}|X_0=a)\\ &= P(\tau_{\phi} < \tau_{\dagger}|X_0=a) \times P(X_1=b|X_0=a) \quad \text{if $a \notin \{\phi,\dagger\}$}\\ &= h(a) \times P_{ab} \\ &\neq RHS?? \end{align}$$~~

Corrected Version

Consider

$A=\{X_{n+1}=b\}$, $B=\{X_n=a\}$ and $C=\{\tau_{\phi} < \tau_{\dagger}\}$, so that $P(A|B,C)=P(X_{n+1}=b|X_n=a\ and\ \tau_{\phi} < \tau_{\dagger})$. Then $P(A|B,C)= \frac{P(A,B,C)}{P(B,C)}=\frac{P(A,C|B)P(B)}{P(C|B)P(B)}=\frac{P(A,C|B)}{P(C|B)} =\frac{P(A|B) \times P(C|A,B)}{P(C|B)}$

Thus,

$$ LHS=P(X_{n+1}=b|X_n=a\ and\ \tau_{\phi} < \tau_{\dagger}) = \frac{P(X_{n+1}=b,\tau_{\phi} < \tau_{\dagger}|X_n=a)} {P(\tau_{\phi} < \tau_{\dagger}|X_n=a)} $$

Now, since $a \notin \{\phi,\dagger\}$, we have $n < \min(\tau_{\phi},\tau_{\dagger})$, and hence by the chain rule:

$$ \begin{align} P(X_{n+1}=b|X_n=a\ and\ \tau_{\phi} < \tau_{\dagger}) &= \frac{P(X_{n+1}=b,\tau_{\phi} < \tau_{\dagger}|X_n=a)} {P(\tau_{\phi} < \tau_{\dagger}|X_n=a)} \\ &= \frac{P(X_{n+1}=b|X_n=a)\times P(\tau_{\phi} < \tau_{\dagger}|X_n=a,X_{n+1}=b)}{P(\tau_{\phi} < \tau_{\dagger}|X_n=a)} \end{align} $$

Using the Markov property and time homogeneity, $P(\tau_{\phi} < \tau_{\dagger}|X_n=a,X_{n+1}=b)=P(\tau_{\phi} < \tau_{\dagger}|X_0=b)=h(b)$, and hence:

$$ \begin{align} P(X_{n+1}=b|X_n=a\ and\ \tau_{\phi} < \tau_{\dagger}) &= \frac{P(X_{n+1}=b|X_n=a)\times P(\tau_{\phi} < \tau_{\dagger}|X_n=a,X_{n+1}=b)}{P(\tau_{\phi} < \tau_{\dagger}|X_n=a)}\\ &=\frac{h(b)\times P_{ab}}{h(a)}\\ &= RHS \end{align} $$
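As a quick numerical sanity check of this identity (my own addition, on a toy chain rather than the one from the problem), the sketch below uses a small birth–death chain with absorbing states $0$ (playing the role of $\phi$) and $3$ (playing the role of $\dagger$), computes $h$ by solving $h(a)=\sum_b P_{ab}h(b)$ with $h(0)=1$, $h(3)=0$, and compares a simulated estimate of $P(X_1=b|X_0=a,\ \tau_{\phi}<\tau_{\dagger})$ with $\frac{h(b)}{h(a)}P_{ab}$. All numbers in it are illustrative assumptions.

import numpy as np

np.random.seed(0)
# Toy chain on {0,1,2,3}; 0 plays the role of phi and 3 the role of dagger (both absorbing).
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.4, 0.1, 0.5, 0.0],
              [0.0, 0.3, 0.2, 0.5],
              [0.0, 0.0, 0.0, 1.0]])

# h(a) = P(hit 0 before 3 | X_0 = a): solve h = P h on the interior with h(0)=1, h(3)=0.
h = np.zeros(4)
h[0] = 1.0
h[1:3] = np.linalg.solve(np.eye(2) - P[1:3, 1:3], P[1:3, 0])

a, b = 1, 2
num, den = 0, 0
for _ in range(50000):
    x, first = a, None
    while x not in (0, 3):
        x = np.random.choice(4, p=P[x])
        if first is None:
            first = x
    if x == 0:              # keep only the paths with tau_phi < tau_dagger
        den += 1
        if first == b:
            num += 1

print('simulated P(X_1=b | X_0=a, tau_phi < tau_dagger) = {:.4f}'.format(num / float(den)))
print('predicted h(b)/h(a) * P_ab                       = {:.4f}'.format(h[b] / h[a] * P[a, b]))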

Problem 3

If the current state is $X_t=A$, then

$P(X_{t+1}=A) = 0.5$, and with probability $0.5$ the chain moves to $A\cup\{b\}\setminus\{a\}$, where $a$ is chosen uniformly from $A$ and $b$ uniformly from $A^c$.

Observation 1: $X_t$ is irreducible. The construction allows the chain to reach every state from every other state: any $k$-subset can be turned into any other $k$-subset by a sequence of single-element swaps, each of which has positive probability.

Example: let $n=10$, $k=5$ and $A=\{1,2,3,4,5\}$, and take $a=3$, $b=6$.

Then we have: $P(X_{t+1}=\{1,2,3,4,5\}) = 0.5$ and, for each $x \in \{6,7,8,9,10\}$, $P(X_{t+1}=\{1,2,4,5,x\})=0.5\cdot\frac{1}{5}\cdot\frac{1}{5}$ (probability $\frac{1}{5}$ of selecting $a=3$ to remove, times $\frac{1}{5}$ of selecting $x$ to add).
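A quick way to check these numbers is to enumerate one step of the chain directly. The sketch below is my own illustration, assuming the move rule described above (stay with probability 0.5, otherwise swap a uniformly chosen $a \in A$ for a uniformly chosen $b \notin A$); it tabulates the full one-step distribution from $A=\{1,2,3,4,5\}$ with $n=10$, $k=5$.

from itertools import product

n, k = 10, 5
A = frozenset({1, 2, 3, 4, 5})
complement = sorted(set(range(1, n + 1)) - A)

probs = {A: 0.5}                                   # stay put with probability 1/2
for a, b in product(sorted(A), complement):
    B = frozenset(A - {a} | {b})                   # swap a out, b in
    probs[B] = probs.get(B, 0.0) + 0.5 * (1.0 / k) * (1.0 / (n - k))

print('P(stay at A)            = {:.4f}'.format(probs[A]))
print('P(move to {{1,2,4,5,6}}) = {:.4f}'.format(probs[frozenset({1, 2, 4, 5, 6})]))
print('total probability       = {:.4f}'.format(sum(probs.values())))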

Observation 2: For $X$ to be aperiodic, it is important to have $X_{t+1}=X_t$ with probability 0.5 (any non-zero probability would do). Otherwise the diagonal of the transition probability matrix would be zero, and in such cases it is possible for the chain to be periodic. An example (without taking into account the actual transition probabilities) is: for state space $\{1,2,3,4\}$, $$ P = \begin{bmatrix} 0 & 0.5 & 0 & 0.5\\ 0.5 & 0 & 0.5 & 0\\ 0 & 0.5 & 0 & 0.5\\ 0.5 & 0 & 0.5 & 0\\ \end{bmatrix} $$

Here $P_{ii}=0$ while $(P^2)_{ii}>0$, and returns to any state are possible only after an even number of steps, so the chain has period 2. For example, $P(X_n=2|X_0=1)= 0$ for $n=2k$ and $0.5$ for $n=2k-1$, where $k=1,2,3,\dots$
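To make the period-2 claim concrete, here is a small check of my own on this $P$: the diagonal of $P^n$ is zero for odd $n$ and strictly positive for even $n$, so returns to any state are possible only in an even number of steps.

import numpy as np

P = np.array([[0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])

for n in range(1, 5):
    Pn = np.linalg.matrix_power(P, n)
    print('n = {}: diag(P^n) = {}'.format(n, np.diag(Pn)))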

About the uniform stationary distribution

From Observations 1 and 2 we know that the Markov chain is irreducible and aperiodic. There is another observation:

Observation 3: The transition probability matrix $P$ is symmetric.

$P_{ii} = 0.5$

$P_{ij} = 0.5 \times \underbrace{\frac{1}{k}}_\text{probability of selecting the element to remove from $i$} \times \underbrace{\frac{1}{n-k}}_\text{probability of selecting the element to add from outside $i$}$ for every $j \neq i$ that differs from $i$ by a single swap (and $P_{ij}=0$ for all other $j$)

and hence $P_{ij} = P_{ji}$ $\implies$ $P=P^T$. A symmetric $P$ is doubly stochastic, and the uniform distribution satisfies detailed balance $\frac{1}{\binom{n}{k}}P_{ij} = \frac{1}{\binom{n}{k}}P_{ji}$, so $P$ is reversible with respect to the uniform distribution $\implies$ $\pi$ is uniform.
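As a numerical double-check of Observations 1, 2 and 3 (my own sketch, for the small case $n=5$, $k=2$), the code below builds the full transition matrix over all 2-subsets of $\{1,\dots,5\}$ under the move rule above, and verifies that it is stochastic, symmetric, and that the uniform distribution is stationary.

import numpy as np
from itertools import combinations

n, k = 5, 2
states = [frozenset(c) for c in combinations(range(1, n + 1), k)]
index = {s: i for i, s in enumerate(states)}

P = np.zeros((len(states), len(states)))
for s in states:
    i = index[s]
    P[i, i] += 0.5                                # stay with probability 1/2
    for a in s:
        for b in set(range(1, n + 1)) - s:
            j = index[frozenset(s - {a} | {b})]   # swap a out, b in
            P[i, j] += 0.5 / (k * (n - k))

print('rows sum to 1:         {}'.format(np.allclose(P.sum(axis=1), 1.0)))
print('P is symmetric:        {}'.format(np.allclose(P, P.T)))
uniform = np.ones(len(states)) / len(states)
print('uniform is stationary: {}'.format(np.allclose(uniform.dot(P), uniform)))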

Problem 4

Part (4a)

In [1]:
%matplotlib inline
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1)
D = np.random.rand(100,100)
## This is not symmetric, so we make it symmetric
D = (D+D.T)/2
print (D)
[[ 0.417022    0.5234847   0.47514525 ...,  0.41077488  0.47791837
   0.52720653]
 [ 0.5234847   0.5270581   0.72129764 ...,  0.17310173  0.36455267
   0.90610878]
 [ 0.47514525  0.72129764  0.91560635 ...,  0.29658343  0.8926068
   0.43092462]
 ..., 
 [ 0.41077488  0.17310173  0.29658343 ...,  0.83527618  0.36756992
   0.06630671]
 [ 0.47791837  0.36455267  0.8926068  ...,  0.36756992  0.50837092
   0.09638852]
 [ 0.52720653  0.90610878  0.43092462 ...,  0.06630671  0.09638852
   0.76279378]]
In [2]:
import math
N_steps = 10000

def L(sigma):
    # Total length of the (open) path sigma under the distance matrix D.
    s=0
    for i in range(0, len(sigma)-1):
        s+=D[sigma[i], sigma[i+1]]
    return s

def propose(sigma):
    # Propose a new path by reversing the segment sigma[j:k], with the two
    # indices drawn uniformly (with replacement, so j may equal k and give a no-op move).
    r = np.random.choice(len(sigma), 2)
    rs = np.sort(r)
    j,k=rs[0],rs[1]
    x = sigma[j:k]
    x = x[::-1]
    x0 = sigma[:j]
    x1 = sigma[k:]
    y = np.concatenate((x0,x,x1))
    return y


def pi(sigma,T):
    # Unnormalised target density exp(-L(sigma)/T).
    return math.exp(-L(sigma)/T)

def metropolis(sigma,T,L_0):
    # One Metropolis step: accept the proposal with probability min(1, exp(-(L_n-L_0)/T)).
    sigma_n = propose(sigma)
    L_n = L(sigma_n)
    pi_ab = math.exp(-(L_n-L_0)/T)
    q = min(1, pi_ab)
    b = np.random.uniform(size=1)
    if (b<q):
        return sigma_n
    else:
        return sigma
    
In [3]:
## Note: np.random.choice(100, 100) samples with replacement, so sigma_0 is not a
## permutation of the 100 cities; np.random.permutation(100) would give a true tour.
sigma_0 = np.random.choice(100,100)
L_0 = L(sigma_0)
print sigma_0
[80 49 18 11 67 21 88 85 81 86 52 39 52 13  9 98 78 46 26 63 86  2 96 45 13
 67 37 36 54 63 65 58 49 48 59 26  2 26 44 29 34 72 62 52 75 72 95  0 51 39
 60 24 95 80 34 36 55 31 66 80 56 23 20 56 59 27 27 89 80 34 58 74 70 81  7
 35 29 74 13 99 43 60 27 97 16 55 25 19 45 80 48 73 22 31 20 16 17 59 28 70]
In [4]:
T = [0.05,10]
def plotter(t):
    # Run the Metropolis chain for N_steps at temperature t and plot the history of L.
    L_history = []
    sigma_history = []
    sigma_0 = np.random.choice(100,100)
    L_0 = L(sigma_0)
    L_history.append(L_0)
    sigma_history.append(sigma_0)
    for i in range(1, N_steps):
        sigma_t = metropolis(sigma_history[i-1],t,L_history[i-1])
        L_1 = L(sigma_t)
        L_history.append(L_1)
        sigma_history.append(sigma_t)
    plt.figure(0)

    plt.hist(L_history, 20)
    plt.xlabel('Length')
    plt.ylabel('Frequency')
    plt.title('Frequency of L')
    plt.figure(1)

    plt.plot(range(1, N_steps+1),L_history)
    plt.ylim(min(L_history), max(L_history))
    plt.xlabel('N_steps')
    plt.ylabel('L')
    plt.title('Variation of L with N_steps')
    return L_history

T = 0.05

In [5]:
L_t0=plotter(T[0])

T=10

In [6]:
L_t1= plotter(T[1])

Correlation plots

In [17]:
def autocorr(x):
    # Normalised autocorrelation of the sequence x for lags 0 .. len(x)//2.
    xunbiased = x-np.mean(x)
    xnorm = np.sum(xunbiased**2)
    acor = np.correlate(xunbiased, xunbiased, "same")/xnorm
    acor = acor[len(acor)//2:]   # keep non-negative lags; integer division avoids float indexing
    return acor

cov_t0 = autocorr(L_t0)
cov_t1 = autocorr(L_t1)
In [18]:
plt.plot(cov_t0)
plt.ylabel('Autocorrelation')
plt.xlabel('Lag')
plt.title('Autocorrelation of L_i for T=0.05')
Out[18]:
<matplotlib.text.Text at 0x7f7e0bb43cd0>
In [19]:
plt.plot(cov_t1)
plt.ylabel('Autocorrelation')
plt.xlabel('Lag')
plt.title('Autocorrelation of L_i for T=10')
Out[19]:
<matplotlib.text.Text at 0x7f7e09ad3c90>

Result

The autocorrelation remains high even at large lags for both temperature values. I expected the higher $T$ to yield lower autocorrelation.
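To put a number on "high", one rough option (my own sketch, reusing `cov_t0` and `cov_t1` from the cells above) is the integrated autocorrelation time, truncating the sum at the first lag where the estimated autocorrelation drops below zero.

def integrated_autocorr_time(acor):
    # acor[0] is 1 by construction; sum the positive-lag terms until the estimate turns negative.
    tau = 1.0
    for rho in acor[1:]:
        if rho < 0:
            break
        tau += 2.0 * rho
    return tau

print('tau (T=0.05): {:.1f}'.format(integrated_autocorr_time(cov_t0)))
print('tau (T=10):   {:.1f}'.format(integrated_autocorr_time(cov_t1)))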

Problem 1

Let the state space be $S = \{\phi, \alpha, \beta, \alpha+\beta, \text{pol}, \dagger\}$. Definitions:

  1. $\tau_a = \min\{ n \geq 0: X_n=a\}$

  2. $N = \sum_{k=0}^{\tau_{\phi}}I_{X_k=\dagger}$

  3. $u(a) = E[N|X_0=a] \forall a \in S $

Conditioning on the first step, for $a \notin \{\phi, \dagger\}$: $u(a) = \sum_{b} P(X_1=b|X_0=a)\,E[N|X_0=b]$, which gives $u(a) = P_{a\dagger} + \sum_{b \neq \phi,\dagger} P_{ab}\,u(b)$.

And hence $u$ solves the following set of equations:

$u=(I-P_{-})^{-1}v$, where $v=(0,0,0,1)^T$ in this case (the one-step probabilities of jumping to $\dagger$ from $\alpha$, $\beta$, $\alpha+\beta$ and pol), and $P_{-}$ is $P$ with the first and last rows and columns (the $\phi$ and $\dagger$ states) removed.

In [10]:
k_a=0.2
k_b=0.2
k_p=0.5
# Row/column order: phi, alpha, beta, alpha+beta, pol, dagger
P = np.matrix([[1-k_a-k_b, k_a ,k_b, 0, 0, 0],
               [k_a, 1-k_a-k_b, 0, k_b, 0, 0],
               [k_b, 0, 1-k_a-k_b, k_a, 0, 0],
               [0, k_b, k_a, 1-k_a-k_b-k_p, k_p, 0],
               [0, 0, 0, 0, 0, 1],
               [0, 0, 0, 1, 0, 0]])
In [11]:
Q=P[1:5,1:5]   # drop the phi (first) and dagger (last) rows and columns
iq = np.eye(4)-Q
iqi = np.linalg.inv(iq)
print(iq)
print(iqi)
[[ 0.4  0.  -0.2  0. ]
 [ 0.   0.4 -0.2  0. ]
 [-0.2 -0.2  0.9 -0.5]
 [ 0.   0.   0.   1. ]]
[[ 2.85714286  0.35714286  0.71428571  0.35714286]
 [ 0.35714286  2.85714286  0.71428571  0.35714286]
 [ 0.71428571  0.71428571  1.42857143  0.71428571]
 [ 0.          0.          0.          1.        ]]
In [12]:
print 'U={}'.format(iqi[:,-1])
u=iqi[:,-1]   # last column of (I-Q)^{-1}, i.e. (I-Q)^{-1}v with v = (0,0,0,1)^T
U=[[ 0.35714286]
 [ 0.35714286]
 [ 0.71428571]
 [ 1.        ]]
In [13]:
PP = {}
states = ['phi', 'alpha', 'beta', 'ab', 'pol', 'd']

PP['phi']= [1-k_a-k_b, k_a ,k_b, 0, 0, 0]
PP['alpha'] = [k_a, 1-k_a-k_b, 0, k_b, 0, 0]
PP['beta'] = [k_b, 0, 1-k_a-k_b, k_a, 0, 0]
PP['ab']= [0, k_b, k_a, 1-k_a-k_b-k_p, k_p, 0]
PP['pol']=  [0, 0, 0, 0, 0, 1]
PP['d']= [0, 0, 0, 1, 0, 0]

def h(x):
    # Simulate 99 runs of the chain starting from state x; each run is stopped the
    # first time the chain enters 'd' (or after 10000 steps).
    # Returns (sum of visited state indices divided by 1000 -- not used below,
    #          average number of steps taken to first reach 'd').
    s=0
    ht=0
    cc=0
    for j in range(1,100):
        new_state=x
        for i in range(1,10000):
            old_state=new_state
            probs = PP[old_state]
            z=np.random.choice(6, 1, p=probs)
            new_state = states[z[0]]
            s+=z[0]
            if new_state=='d':
                ht+=i
                cc+=1
                break
            else:
                continue

    return s/1000, ht/cc

$\alpha$

In [14]:
print('Simulation: {}\t Calculation: {}'.format(h('alpha')[1],u[0]))
Simulation: 18.2323232323	 Calculation: [[ 0.35714286]]

$\beta$

In [15]:
print('Simulation: {}\t Calculation: {}'.format(h('beta')[1],u[1]))
Simulation: 17.0	 Calculation: [[ 0.35714286]]

$\alpha+\beta$

In [16]:
print('Simulation: {}\t Calculation: {}'.format(h('ab')[1],u[2]))
Simulation: 8.92929292929	 Calculation: [[ 0.71428571]]

pol

In [17]:
print('Simulation: {}\t Calculation: {}'.format(h('pol')[1],u[3]))
Simulation: 1.0	 Calculation: [[ 1.]]

Result

The simulation and calculation do not agree in magnitude: the simulation as implemented reports the average number of steps needed to reach $\dagger$, which is not the quantity $u$ computed above. The $\alpha$ and $\beta$ results are at least consistent with each other in both the simulation and the calculation, as expected from the symmetry between the two states.
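One way to check the calculated column directly (my own sketch, under the assumption that the quantity produced by the linear system above is the probability of reaching $\dagger$ before $\phi$, since the $\dagger$ row and column were removed from $Q$) is to simulate paths from each start state until they first hit either $\phi$ or $\dagger$ and record which came first, reusing `PP`, `states` and `u` from above:

def hit_dagger_before_phi(start, n_runs=10000):
    # Fraction of runs started in `start` that reach 'd' before 'phi'.
    hits = 0
    for _ in range(n_runs):
        state = start
        while state not in ('phi', 'd'):
            z = np.random.choice(6, p=PP[state])
            state = states[z]
        if state == 'd':
            hits += 1
    return hits / float(n_runs)

for name, calc in zip(['alpha', 'beta', 'ab', 'pol'], np.asarray(u).ravel()):
    print('{}: simulated {:.3f}, calculated {:.3f}'.format(name, hit_dagger_before_phi(name), calc))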
