- Learn sampling-based inference
- Understand the concept of the Metropolis-Hastings algorithm
- Know the mechanism of Gibbs sampling
-> In machine learning, the method used most often for inference is Gibbs sampling, a special case of the Metropolis-Hastings algorithm (see the sketch below)
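
As a preview, here is a minimal Metropolis-Hastings sketch in Python (the target, step size, and the name `metropolis_hastings` are illustrative assumptions, not from the lecture). Gibbs sampling is the special case where each proposal is a full conditional distribution, so the acceptance probability is always 1:

```python
import numpy as np

def metropolis_hastings(p_tilde, x0=0.0, n_samples=10000, step=1.0, seed=0):
    # p_tilde: unnormalized target density; Gaussian random-walk proposal
    rng = np.random.default_rng(seed)
    x, samples = x0, []
    for _ in range(n_samples):
        x_new = x + rng.normal(0.0, step)  # symmetric proposal q(x'|x)
        # Accept with probability min(1, p(x')/p(x)); the proposal terms
        # cancel because the random walk is symmetric
        if rng.uniform() < min(1.0, p_tilde(x_new) / p_tilde(x)):
            x = x_new
        samples.append(x)
    return np.array(samples)

# e.g. sample from an unnormalized standard normal target
draws = metropolis_hastings(lambda x: np.exp(-0.5 * x * x))
```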
- Detour: EM Algorithm
(The EM algorithm comes up and gets mentioned many times over the course of these lectures. Reinforcement learning also trains models in a very similar way, so it is clearly an important concept.)
- Finds the maximum likelihood solutions for models with latent variables
- $P(X|\theta) = \sum_Z P(X,Z|\theta)$ -> $\ln P(X|\theta) = \ln\left[\sum_Z P(X,Z|\theta)\right]$
- Initialize $\theta^0$ to an arbitrary point
- Loop until the likelihood converges
- Expectation step
- $q^{t+1}(z) = argmax_q Q(\theta^t, q) = argmax_q L(\theta^t, q) = argmin_q KL(q||P(Z|X,\theta^t))$
(the argmax/argmin equivalence holds because $\ln P(X|\theta^t) = L(q, \theta^t) + KL(q||P(Z|X,\theta^t))$ and the left side does not depend on $q$)
- -> $q^{t+1}(z) = P(Z|X,\theta^t)$ -> Assign Z by $P(Z|X,\theta^t)$ (previously we assigned it by optimization; now let's try doing this with a sampling-based approach)
- Maximization step
- $\theta^{t+1} = argmax_{\theta}Q(\theta, q^{t+1}) = argmax_{\theta}L(\theta, q^{t+1})$
- -> With Z fixed, there are no unobserved variables
- Same optimization as in ordinary MLE (a concrete sketch of the full loop follows below)
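
To make the loop concrete, here is a minimal EM sketch for a two-component 1-D Gaussian mixture (the mixture model and all names such as `em_gmm` are illustrative assumptions; the lecture states the algorithm in the abstract):

```python
import numpy as np

def em_gmm(x, n_iter=100, tol=1e-6):
    # Initialize theta^0 = (pi, mu, sigma) to an arbitrary point
    pi = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])
    sigma = np.array([x.std(), x.std()])
    prev_ll = -np.inf

    for _ in range(n_iter):
        # Expectation step: q^{t+1}(z) = P(Z | X, theta^t),
        # i.e. the posterior "responsibility" of each component
        dens = np.stack([
            pi[k] * np.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                  / (sigma[k] * np.sqrt(2 * np.pi))
            for k in range(2)
        ], axis=1)                                  # shape (N, 2)
        q = dens / dens.sum(axis=1, keepdims=True)

        # Maximization step: with q fixed there are no unobserved
        # variables, so this is an ordinary weighted MLE update
        Nk = q.sum(axis=0)
        pi = Nk / len(x)
        mu = (q * x[:, None]).sum(axis=0) / Nk
        sigma = np.sqrt((q * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)

        # Loop until the likelihood converges
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, sigma

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 300)])
print(em_gmm(data))  # recovers weights ~[0.4, 0.6], means ~[-2, 3]
```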
- Markov Chain
- Each node has a probability distribution of states
- i.e.) The probability that a state is the current state of a system
- Concrete observation of a system: [1 0 0] -> the system is at the first state
- Stochastic observation of a system: [0.7 0.2 0.1] -> the system is likely at the first state
- The node has a vector of state probability distribution
- Each link suggests a probabilistic state transition
- If the system is at the first state, the probability distribution of the next state is [0.3, 0.4, 0.3]
- The link has a matrix of state transition probability distribution
- Properties of Markov Chain
(Some of these properties are used in the MCMC procedure; see the sketch below)
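
A minimal sketch of the node/link picture above (the 3x3 transition matrix `T` is made up for illustration; row 0 matches the [0.3, 0.4, 0.3] example):

```python
import numpy as np

# Row i of T is the distribution of the next state given current state i
T = np.array([[0.3, 0.4, 0.3],
              [0.2, 0.5, 0.3],
              [0.4, 0.1, 0.5]])

p = np.array([0.7, 0.2, 0.1])  # stochastic observation of the system
print(p @ T)                   # state distribution after one transition

# Repeated transitions converge to a stationary distribution satisfying
# p = pT -- one of the properties that the MCMC procedure relies on
for _ in range(100):
    p = p @ T
print(p)
```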