hidden markov model (decoding)

application:
speech recognition
word segmentation
1. hidden states (weather): Sunny, Cloudy, Rainy;
2. observed states (seaweed dampness): Dry, Dryish, Damp, Soggy;
3. initial state probabilities: Sunny (0.63), Cloudy (0.17), Rainy (0.20);
4. state transition matrix:

                     today weather
                     sunny    cloudy   rainy
yesterday   sunny    0.5      0.375    0.125
weather     cloudy   0.25     0.125    0.625
            rainy    0.25     0.375    0.375
5. confusion matrix:

                     observed states
                     dry      dryish   damp     soggy
hidden      sunny    0.6      0.2      0.15     0.05
states      cloudy   0.25     0.25     0.25     0.25
            rainy    0.05     0.1      0.35     0.5

Finding the most probable sequence of hidden states

brute force
enumerate every possible hidden-state sequence (3^3 = 27 for three observations) and keep the one that maximizes the joint probability with (dry,damp,soggy):
best = argmax over (s1,s2,s3) of pi(s1)·p(dry|s1)·p(s2|s1)·p(damp|s2)·p(s3|s2)·p(soggy|s3)
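A minimal brute-force sketch in Python, using only the numbers from the tables above (the variable names pi/A/B are my own):

from itertools import product

# enumerate all 3**3 = 27 hidden-state sequences for (dry, damp, soggy)
# and keep the one with the highest joint probability
pi = {"sunny": 0.63, "cloudy": 0.17, "rainy": 0.20}
A = {"sunny":  {"sunny": 0.5,  "cloudy": 0.375, "rainy": 0.125},
     "cloudy": {"sunny": 0.25, "cloudy": 0.125, "rainy": 0.625},
     "rainy":  {"sunny": 0.25, "cloudy": 0.375, "rainy": 0.375}}
B = {"sunny":  {"dry": 0.6,  "dryish": 0.2,  "damp": 0.15, "soggy": 0.05},
     "cloudy": {"dry": 0.25, "dryish": 0.25, "damp": 0.25, "soggy": 0.25},
     "rainy":  {"dry": 0.05, "dryish": 0.1,  "damp": 0.35, "soggy": 0.5}}
obs = ["dry", "damp", "soggy"]

def joint(states):
    """p(states, obs) = pi(s1)·B(o1|s1) · prod over t of A(st|st-1)·B(ot|st)"""
    p = pi[states[0]] * B[states[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[states[t - 1]][states[t]] * B[states[t]][obs[t]]
    return p

best = max(product(pi, repeat=len(obs)), key=joint)
print(best, joint(best))

Replacing max with a sum over the same 27 sequences gives the evaluation probability p((dry,damp,soggy)|hmm) used in the evaluation section below.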

viterbi (dynamic programming)
calculate the best partial path at each phase, and record the partial best path we walked through
phase 1 (dry):
p(sunny) = initial p(sunny) * p(dry|sunny) = 0.63 * 0.6 = 0.378
p(cloudy) = initial p(cloudy) * p(dry|cloudy) = 0.17 * 0.25 = 0.0425
p(rainy) = initial p(rainy) * p(dry|rainy) = 0.20 * 0.05 = 0.01
phase 2 (damp):
p(sunny) = max(p(previous phase sunny)*p(sunny|sunny), p(previous phase cloudy)*p(sunny|cloudy), p(previous phase rainy)*p(sunny|rainy)) * p(damp|sunny) = max(0.378*0.5, 0.0425*0.25, 0.01*0.25)*0.15 = 0.378*0.5*0.15 = 0.02835   prev-state: sunny
phase 3 (soggy):
same as phase 2
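A minimal Viterbi sketch under the same model (delta/psi are my own names: delta[t][s] is the best partial-path probability ending in state s at phase t, and psi records the argmax predecessor so the path can be read back):

pi = {"sunny": 0.63, "cloudy": 0.17, "rainy": 0.20}
A = {"sunny":  {"sunny": 0.5,  "cloudy": 0.375, "rainy": 0.125},
     "cloudy": {"sunny": 0.25, "cloudy": 0.125, "rainy": 0.625},
     "rainy":  {"sunny": 0.25, "cloudy": 0.375, "rainy": 0.375}}
B = {"sunny":  {"dry": 0.6,  "dryish": 0.2,  "damp": 0.15, "soggy": 0.05},
     "cloudy": {"dry": 0.25, "dryish": 0.25, "damp": 0.25, "soggy": 0.25},
     "rainy":  {"dry": 0.05, "dryish": 0.1,  "damp": 0.35, "soggy": 0.5}}

def viterbi(obs):
    states = list(pi)
    delta = [{s: pi[s] * B[s][obs[0]] for s in states}]   # phase 1
    psi = [{}]                                            # back-pointers
    for t in range(1, len(obs)):
        delta.append({})
        psi.append({})
        for s in states:
            prev = max(states, key=lambda r: delta[t - 1][r] * A[r][s])
            delta[t][s] = delta[t - 1][prev] * A[prev][s] * B[s][obs[t]]
            psi[t][s] = prev
    last = max(states, key=lambda s: delta[-1][s])        # best final state
    path = [last]
    for t in range(len(obs) - 1, 0, -1):                  # walk the back-pointers
        path.append(psi[t][path[-1]])
    return list(reversed(path)), delta[-1][last]

print(viterbi(["dry", "damp", "soggy"]))

Phase 1 of the loop reproduces the numbers above (0.378, 0.0425, 0.01), and phase 2 gives 0.02835 for sunny.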

hidden markov model (evaluation)

application:
testing whether a known model is accurate
1. hidden states (weather): Sunny, Cloudy, Rainy;
2. observed states (seaweed dampness): Dry, Dryish, Damp, Soggy;
3. initial state probabilities: Sunny (0.63), Cloudy (0.17), Rainy (0.20);
4. state transition matrix:

                     today weather
                     sunny    cloudy   rainy
yesterday   sunny    0.5      0.375    0.125
weather     cloudy   0.25     0.125    0.625
            rainy    0.25     0.375    0.375
5. confusion matrix:

                     observed states
                     dry      dryish   damp     soggy
hidden      sunny    0.6      0.2      0.15     0.05
states      cloudy   0.25     0.25     0.25     0.25
            rainy    0.05     0.1      0.35     0.5

calculate the probability of the observed sequence (dry,damp,soggy) under the hidden Markov model we are using

[trellis: three phases for the observations dry, damp, soggy; at each phase the hidden state can be sunny, cloudy, or rainy]
1. brute force
sum the joint probability over every possible hidden-state sequence (27 in total):
p((dry,damp,soggy)|hmm) = sum over (s1,s2,s3) of pi(s1)·p(dry|s1)·p(s2|s1)·p(damp|s2)·p(s3|s2)·p(soggy|s3)
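The same enumeration as the brute-force decoding sketch earlier, but summing instead of maximizing (again a minimal sketch built from the table numbers above):

from itertools import product

pi = {"sunny": 0.63, "cloudy": 0.17, "rainy": 0.20}
A = {"sunny":  {"sunny": 0.5,  "cloudy": 0.375, "rainy": 0.125},
     "cloudy": {"sunny": 0.25, "cloudy": 0.125, "rainy": 0.625},
     "rainy":  {"sunny": 0.25, "cloudy": 0.375, "rainy": 0.375}}
B = {"sunny":  {"dry": 0.6,  "dryish": 0.2,  "damp": 0.15, "soggy": 0.05},
     "cloudy": {"dry": 0.25, "dryish": 0.25, "damp": 0.25, "soggy": 0.25},
     "rainy":  {"dry": 0.05, "dryish": 0.1,  "damp": 0.35, "soggy": 0.5}}
obs = ["dry", "damp", "soggy"]

total = 0.0
for seq in product(pi, repeat=len(obs)):            # all 27 state sequences
    p = pi[seq[0]] * B[seq[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[seq[t - 1]][seq[t]] * B[seq[t]][obs[t]]
    total += p
print(total)                                        # p((dry,damp,soggy)|hmm)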
2. forward (dynamic programming)
at each phase, calculate the probability of each hidden state by summing over all transitions from the previous phase, and use those sums as the new state probabilities
phase 1:
p(sunny) = initial p(sunny) * p(dry|sunny) = 0.63 * 0.6 = 0.378
p(cloudy) = initial p(cloudy) * p(dry|cloudy) = 0.17 * 0.25 = 0.0425
p(rainy) = initial p(rainy) * p(dry|rainy) = 0.20 * 0.05 = 0.01
phase 2:
p(sunny) = (p(previous phase sunny)*p(sunny|sunny) + p(previous phase cloudy)*p(sunny|cloudy) + p(previous phase rainy)*p(sunny|rainy)) * p(damp|sunny) = (0.378*0.5 + 0.0425*0.25 + 0.01*0.25)*0.15 ≈ 0.03032
phase 3:
same calculation as phase 2; p((dry,damp,soggy)|hmm) is the sum of the three state probabilities at the final phase
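A minimal forward-algorithm sketch under the same model (alpha is my own name; alpha[s] sums, rather than maximizes, over every way of reaching state s at the current phase):

pi = {"sunny": 0.63, "cloudy": 0.17, "rainy": 0.20}
A = {"sunny":  {"sunny": 0.5,  "cloudy": 0.375, "rainy": 0.125},
     "cloudy": {"sunny": 0.25, "cloudy": 0.125, "rainy": 0.625},
     "rainy":  {"sunny": 0.25, "cloudy": 0.375, "rainy": 0.375}}
B = {"sunny":  {"dry": 0.6,  "dryish": 0.2,  "damp": 0.15, "soggy": 0.05},
     "cloudy": {"dry": 0.25, "dryish": 0.25, "damp": 0.25, "soggy": 0.25},
     "rainy":  {"dry": 0.05, "dryish": 0.1,  "damp": 0.35, "soggy": 0.5}}

def forward(obs):
    states = list(pi)
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}               # phase 1
    for t in range(1, len(obs)):
        alpha = {s: sum(alpha[r] * A[r][s] for r in states) * B[s][obs[t]]
                 for s in states}                                   # phase t+1
    return sum(alpha.values())                                      # p(obs|hmm)

print(forward(["dry", "damp", "soggy"]))

This returns the same number as the brute-force sum, but in time proportional to T·N² instead of N^T.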

jieba

cut(cut_all=False, HMM=True)
1. split the sentence into blocks (by spaces or punctuation)
2. load the dictionary
3. generate a directed acyclic graph (DAG) of all candidate words, e.g.
0: [0] 1: [1,2,4] 2: [2] 3: [3,4] 4: [4] 5: [5]

4. find the max-probability path through the DAG based on word frequency (iterate the DAG and find the max-weight path from each node)
5. if the matched token is not a single character, return it as a word; otherwise store it in a buffer and run finalseg (HMM) on the buffered characters — a usage sketch follows
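A minimal usage sketch with the flags named above (the sample sentence is an arbitrary example):

import jieba

# cut_all=False is precise mode; HMM=True lets finalseg use the HMM
# to recognise words that are missing from the dictionary
sentence = "我来到北京清华大学"   # arbitrary example sentence
print("/".join(jieba.cut(sentence, cut_all=False, HMM=True)))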

Stanford Word Segmenter

1. download from here: https://nlp.stanford.edu/software/segmenter.shtml

2. place it in the nltk tokenize folder.

3. also add the word segmenter interface (already included in the latest nltk version):
http://www.nltk.org/_modules/nltk/tokenize/stanford_segmenter.html
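A minimal usage sketch through the nltk interface linked above; the constructor paths below are assumptions that must point at the files from the segmenter download, and the exact argument names vary between nltk versions:

from nltk.tokenize.stanford_segmenter import StanfordSegmenter

segmenter = StanfordSegmenter(
    path_to_jar="stanford-segmenter.jar",            # assumed locations --
    path_to_sihan_corpora_dict="./data",             # adjust to wherever
    path_to_model="./data/pku.gz",                   # the download was
    path_to_dict="./data/dict-chris6.ser.gz",        # unpacked
)
print(segmenter.segment("這是斯坦福中文分詞器測試"))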

result:

位於 意大利 西西里島 , 同時 是 歐洲 最高 活火山 的 埃特納 火山 ( M o u n t E t n a ) 突然 爆發 , 波及 英國廣播公司 B B C 的 團隊 及 一些 遊客 。 他們 全部 及 時獲 拯救 隊送離 現場,無 人命 傷亡 , 但 有 大約 8 人 受傷 。 目擊者 形容 , 當時 人人 都 在逃 跑 , 但 蒸氣 令 人 眼前 一 片 白霧 , 根本 無法 望到 前方 , 幸好 有 車輛 接 載他們 離開 現場

hidden markov model

reference: http://www.52nlp.cn/category/hidden-markov-model

Nothing lasts forever, except for change. People are interested in figuring out the patterns behind changes.

what’s a pattern?

  1. deterministic patterns (like traffic lights: every state is certain and one state follows the other, so they are easy to predict)
    [figure: traffic lights animation]
  2. non-deterministic patterns (uncertain; conditional probability is involved)
    [figure: weather example animation]
    2.1 first order markov process
    two basic assumptions:
    1. homogeneous Markov assumption: the hidden state at any time t depends only on the state at time t-1, and is independent of all other states and observations, and of t itself:
    P(states[t] | states[t-1],observed[t-1],…,states[1],observed[1]) = P(states[t] | states[t-1]), t = 1,2,…,T

2. observation independence assumption: the observation at any time depends only on the hidden state at that same time, and is independent of all other observations and states:
P(observed[t] | states[T],observed[T],…,states[1],observed[1]) = P(observed[t] | states[t]), t = 1,2,…,T

state transition matrix (how states transition)
[figure: weather transition matrix]
when we have a pattern to describe changes, we still need an initial state: the pi vector [(sun,1.0),(cloud,0.0),(rain,0.0)]
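For example, the probability of the weather sequence sun -> cloud -> rain under this pi vector, reusing the transition numbers from the seaweed example elsewhere in these notes (a worked sketch):

pi = {"sun": 1.0, "cloud": 0.0, "rain": 0.0}
A = {"sun":   {"sun": 0.5,  "cloud": 0.375, "rain": 0.125},
     "cloud": {"sun": 0.25, "cloud": 0.125, "rain": 0.625},
     "rain":  {"sun": 0.25, "cloud": 0.375, "rain": 0.375}}

seq = ["sun", "cloud", "rain"]
p = pi[seq[0]]                        # initial state probability
for prev, cur in zip(seq, seq[1:]):   # chain rule with the Markov assumption
    p *= A[prev][cur]
print(p)                              # 1.0 * 0.375 * 0.625 = 0.234375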

2.2 hidden markov model
sometimes the pattern we want is not described sufficiently, so we try to learn about the hidden states by observing the observable states (observable states ===> hidden states)

so we need:
#observed states
#hidden states
#pi vector (initial hidden state probabilities)
#state transition matrix (the probability of the current state given the previous state)
#confusion matrix (the probability of an observed state given the hidden state)
for example,
1. hidden states (weather): Sunny, Cloudy, Rainy;
2. observed states (seaweed dampness): Dry, Dryish, Damp, Soggy;
3. initial state probabilities: Sunny (0.63), Cloudy (0.17), Rainy (0.20);
4. state transition matrix:

                     today weather
                     sunny    cloudy   rainy
yesterday   sunny    0.5      0.375    0.125
weather     cloudy   0.25     0.125    0.625
            rainy    0.25     0.375    0.375
5. confusion matrix:

                     observed states
                     dry      dryish   damp     soggy
hidden      sunny    0.6      0.2      0.15     0.05
states      cloudy   0.25     0.25     0.25     0.25
            rainy    0.05     0.1      0.35     0.5

Application

1. evaluation
use the forward algorithm to compute the probability of an observation sequence given a hidden Markov model (HMM), and thereby choose the most suitable HMM ====> given a model λ=(A,B,π) and an observation sequence O=(o1,o2,…,oT), compute the probability P(O|λ) of the observation sequence under the model λ

2. decoding
given a model λ=(A,B,π) and an observation sequence O=(o1,o2,…,oT), find the state sequence S=(s1,s2,…,sT) that maximizes the conditional probability P(S|O); that is, given the observation sequence, find the most likely corresponding state sequence (the Viterbi algorithm)

3. learning
given an observation sequence (drawn from a known set) together with the associated set of hidden states, estimate the most suitable HMM, i.e. determine the (π,A,B) triple that best describes the observed sequence

mmseg

just a conceptual overview

1. at each position, find all chunks of three consecutive words with maximum length (complex maximum matching)

2. resolve ambiguity with four rules: greatest total chunk length, largest average word length, smallest variance of word lengths, and largest sum of degree of morphemic freedom of one-character words (see the sketch below)

reference: http://technology.chtsai.org/mmseg/
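The cited page defines the algorithm precisely; the following is only a toy sketch of complex maximum matching with rules 1-3 (rule 4 needs character-frequency data, so it is omitted here). The mini-lexicon and maximum word length are hypothetical, chosen just to reproduce the classic 研究生命起源 example:

# toy mmseg sketch: complex maximum matching + ambiguity rules 1-3
LEXICON = {"研", "究", "生", "命", "起", "源",
           "研究", "研究生", "生命", "起源"}   # hypothetical mini-lexicon
MAX_WORD_LEN = 3

def candidate_words(text, start):
    """All lexicon words starting at `start` (single chars always allowed)."""
    words = []
    for end in range(start + 1, min(start + MAX_WORD_LEN, len(text)) + 1):
        w = text[start:end]
        if end == start + 1 or w in LEXICON:
            words.append(w)
    return words

def chunks(text, start, depth=3):
    """Enumerate all chunks of up to three consecutive words from `start`."""
    if depth == 0 or start >= len(text):
        return [[]]
    return [[w] + rest
            for w in candidate_words(text, start)
            for rest in chunks(text, start + len(w), depth - 1)]

def best_chunk(text, start):
    """Rule 1: max total length; rule 2: max average word length;
    rule 3: min variance of word lengths (rule 4 omitted)."""
    def score(chunk):
        lengths = [len(w) for w in chunk]
        avg = sum(lengths) / len(lengths)
        var = sum((l - avg) ** 2 for l in lengths) / len(lengths)
        return (sum(lengths), avg, -var)
    return max(chunks(text, start), key=score)

def mmseg(text):
    words, pos = [], 0
    while pos < len(text):
        first = best_chunk(text, pos)[0]   # commit only the chunk's first word
        words.append(first)
        pos += len(first)
    return words

print(mmseg("研究生命起源"))   # -> ['研究', '生命', '起源']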
