Hidden Markov Model(1: Joint, Marginal Probability of HMM)

H_erb Salt 2020. 11. 16. 10:58

Main Questions on HMM

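- Given the topology of the Bayesian network, i.e. the HMM, denoted M

$\pi$: the initial-state probabilities, the parameter that defines the first latent state

a: the transition probabilities, i.e. the probability of moving from one state to the next

b: the emission probabilities, i.e. the probability that an observation is generated from a particular state

X: the observations we actually have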

- 1. Evaluation question

- Given $\pi, a, b ,X$

- Find $P(X|M, \pi, a, b)$

- How likely is X to be observed under the trained model?

(Assuming X was generated by the HMM structure, how likely is this observation?)

-> The probability that the observed variable X is actually observed, given the model and its parameters

- 2. Decoding question

- Given $\pi, a, b ,X$

- Find $argmax_Z P(Z|X, M, \pi, a, b)$

- What would be the most probable sequence of latent states?

(Assuming the HMM structure and knowing both the observations in the dataset and all the parameters, find the sequence of latent states that best explains the observed data)

- 3. Learning question

- Given X

- Find $argmax_{\pi, a, b}P(X|M, \pi, a, b)$

- What would be the underlying parameters of the HMM given the observations?

(With the parameters unknown, find the $\pi, a, b$ that maximize the probability of the given data)

- Decoding questions and learning questions are very similar to

- Supervised and unsupervised learning

- Anyway, we often need to find $\pi, a, b$ prior to the supervised learning with X

Example

- At Kangwon Land (a casino), a dice game is played with either a loaded die (L) or a fair die (F)

- $P(L \to L) = P(Z_{t+1}=L|Z_t=L) \Rightarrow a$

- $P(X_t=1|Z_t=L) \Rightarrow b$ (estimated with either MLE or MAP)

- $\pi, a, b$ can be obtained when X and M are given
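As a rough illustration of how these quantities could be estimated, the sketch below assumes that both the rolls X and the die used at each roll Z are recorded (a labeled sequence), and estimates $\pi, a, b$ by counting, which is the MLE. The function name `estimate_hmm_params`, the toy sequences, and the 0-based encoding (F=0, L=1, face f stored as f-1) are assumptions made for illustration and are not part of the original example.

```python
import numpy as np

def estimate_hmm_params(X, Z, n_states, n_symbols):
    """Count-based MLE of (pi, a, b) from ONE labeled sequence.

    X: observation indices (0-based), Z: state indices (0-based).
    """
    pi = np.zeros(n_states)
    a = np.zeros((n_states, n_states))
    b = np.zeros((n_states, n_symbols))

    # With a single sequence, all initial mass goes to the first observed state.
    pi[Z[0]] = 1.0
    # Count transitions z_t -> z_{t+1}.
    for prev, nxt in zip(Z[:-1], Z[1:]):
        a[prev, nxt] += 1
    # Count emissions: state z_t produced observation x_t.
    for z, x in zip(Z, X):
        b[z, x] += 1

    # Normalize each row into a probability distribution.
    # np.maximum(..., 1) avoids dividing by zero for states that never occur.
    a /= np.maximum(a.sum(axis=1, keepdims=True), 1)
    b /= np.maximum(b.sum(axis=1, keepdims=True), 1)
    return pi, a, b

# Hypothetical labeled data: state 0 = fair (F), state 1 = loaded (L).
Z = [0, 0, 1, 1, 1, 0]
X = [2, 4, 5, 5, 0, 3]          # faces 3, 5, 6, 6, 1, 4
pi, a, b = estimate_hmm_params(X, Z, n_states=2, n_symbols=6)
print(a)      # estimated transition matrix
print(b[1])   # estimated emission distribution of the loaded die
```

With only one sequence the estimate of $\pi$ is degenerate; with several labeled sequences one would average over their first states. A MAP estimate would, under a Dirichlet prior, add pseudo-counts to the tables before normalizing.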

Joint Probability

Let's assume that we have a training dataset X and Z

- A hidden Markov model is itself a perfectly good Bayesian network model

- Can we compute the joint probability P(X, Z)?

Yes, easily, by virtue of the network structure

- After all, it is a Bayesian network, so

Factorize
$P(X, Z)=P(x_1,\cdots, x_T, z_1, \cdots, z_T)$
$=P(z_1)P(x_1|z_1)P(z_2|z_1)P(x_2|z_2)P(z_3|z_2)P(x_3|z_3) \cdots P(z_T|z_{T-1})P(x_T|z_T)$
Nothing but a combination of initial, transition, and emission probabilities
$= \pi_{idx(z_1=1)} \, b_{idx(z_1=1),idx(x_1=1)} \, a_{idx(z_1=1),idx(z_2=1)} \, b_{idx(z_2=1),idx(x_2=1)} \cdots$
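The factorization can be turned into a few lines of code. The following is a minimal sketch, assuming a two-state dice HMM with made-up parameter values (state 0 = fair, state 1 = loaded), the index convention `a[i, k]` = P(next state k | current state i) and `b[k, x]` = P(observation x | state k), and a hypothetical helper name `joint_prob`.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
pi = np.array([0.5, 0.5])                        # initial state probabilities
a = np.array([[0.9, 0.1],                        # a[i, k]: transition i -> k
              [0.2, 0.8]])
b = np.array([[1/6, 1/6, 1/6, 1/6, 1/6, 1/6],    # fair die
              [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])   # loaded die favors face 6

def joint_prob(X, Z, pi, a, b):
    """P(X, Z) = pi[z_1] b[z_1, x_1] * prod_t a[z_{t-1}, z_t] b[z_t, x_t]."""
    p = pi[Z[0]] * b[Z[0], X[0]]
    for t in range(1, len(X)):
        p *= a[Z[t - 1], Z[t]] * b[Z[t], X[t]]
    return p

X = [5, 5, 0]   # faces 6, 6, 1 (0-based indices)
Z = [1, 1, 0]   # loaded, loaded, fair
p = joint_prob(X, Z, pi, a, b)
print(p)        # (0.5 * 0.5) * (0.8 * 0.5) * (0.2 * 1/6) = 1/300
```

For long sequences the product underflows quickly, so a practical version would usually sum log-probabilities instead of multiplying probabilities.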

Marginal Probability

- Eventually, we only want to use X and marginalize Z

- In HMM, $P(X|\pi, a, b) = \sum_ZP(X,Z|\pi,a,b)$

- Need to avoid repetitive computation

compute each necessary term only once
Let's work with the formula $P(A,B,C)=P(A)P(B|A)P(C|A,B)$ (the chain rule, which holds without any assumptions)
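One way to see where the repetition comes from is to apply this chain rule to $P(x_1, \cdots, x_t, z_t^k=1)$ after reintroducing the previous state $z_{t-1}$. This is a sketch of the standard derivation, assuming the usual HMM conditional independences ($z_t$ depends only on $z_{t-1}$, and $x_t$ only on $z_t$); $T$ denotes the sequence length and is notation added here.

$P(x_1, \cdots, x_t, z_t^k=1) = \sum_i P(x_1, \cdots, x_{t-1}, z_{t-1}^i=1, z_t^k=1, x_t)$
$= \sum_i P(x_1, \cdots, x_{t-1}, z_{t-1}^i=1) \, P(z_t^k=1 | x_1, \cdots, x_{t-1}, z_{t-1}^i=1) \, P(x_t | x_1, \cdots, x_{t-1}, z_{t-1}^i=1, z_t^k=1)$
$= \sum_i P(x_1, \cdots, x_{t-1}, z_{t-1}^i=1) \, P(z_t^k=1 | z_{t-1}^i=1) \, P(x_t | z_t^k=1)$
$= b_{k,x_t} \sum_i P(x_1, \cdots, x_{t-1}, z_{t-1}^i=1) \, a_{i,k}$

The sum on the right contains the same kind of quantity one time step earlier. The base case is $P(x_1, z_1^k=1) = \pi_k b_{k,x_1}$, and the marginal follows as $P(X|\pi,a,b) = \sum_k P(x_1, \cdots, x_T, z_T^k=1)$.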

- now, we see a repeating structure of terms

- $P(x_1, \cdots, x_t, z_t^k =1)= \alpha_t^k=b_{k,x_t}\sum_i \alpha_{t-1}^ia_{i,k}$
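The recursion for $\alpha_t^k$ can be checked numerically. The sketch below fills the $\alpha$ table one time step at a time and compares the resulting marginal with a brute-force sum over every state sequence. The parameter values, the function names `forward` and `brute_force_marginal`, and the base case $\alpha_1^k = \pi_k b_{k,x_1}$ written out in the code are assumptions added for illustration (state 0 = fair, state 1 = loaded, faces stored 0-based).

```python
import itertools
import numpy as np

# Hypothetical parameters, chosen only for illustration.
pi = np.array([0.5, 0.5])
a = np.array([[0.9, 0.1],                        # a[i, k]: transition i -> k
              [0.2, 0.8]])
b = np.array([[1/6, 1/6, 1/6, 1/6, 1/6, 1/6],    # b[k, x]: emission of x in state k
              [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])

def forward(X, pi, a, b):
    """alpha[t, k] = P(x_1..x_t, z_t = k), with t stored 0-based."""
    alpha = np.zeros((len(X), len(pi)))
    alpha[0] = pi * b[:, X[0]]                       # base case
    for t in range(1, len(X)):
        # alpha[t-1] @ a computes sum_i alpha[t-1, i] * a[i, k] for every k.
        alpha[t] = b[:, X[t]] * (alpha[t - 1] @ a)
    return alpha

def brute_force_marginal(X, pi, a, b):
    """Sum P(X, Z) over all K^T state sequences; only feasible for tiny T."""
    total = 0.0
    for Z in itertools.product(range(len(pi)), repeat=len(X)):
        p = pi[Z[0]] * b[Z[0], X[0]]
        for t in range(1, len(X)):
            p *= a[Z[t - 1], Z[t]] * b[Z[t], X[t]]
        total += p
    return total

X = [5, 5, 0, 2]                       # faces 6, 6, 1, 3
alpha = forward(X, pi, a, b)
p_forward = alpha[-1].sum()            # P(X) = sum_k alpha[T, k]
p_brute = brute_force_marginal(X, pi, a, b)
print(p_forward, p_brute)              # the two values should agree
```

The brute-force sum visits every one of the $K^T$ state sequences, while the table is filled with on the order of $K^2 T$ multiplications, which is the saving that computing each term only once buys.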
