Training word-based models: 1) replace the phoneset definition with the list of words; 2) in the dictionary, map each word to itself.
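The two changes can be sketched in a few lines of Python. This is only an illustration: the function name `build_word_model_files` is hypothetical, and the file layout (one unit per line in the phoneset, `WORD PRONUNCIATION` per line in the dictionary) is an assumption about the trainer's input format rather than something stated here.

```python
def build_word_model_files(words):
    """Build phoneset and dictionary text for word-based training.

    Both return values are plain text with one entry per line
    (assumed layout, see the note above).
    """
    # Drop blanks and duplicates while keeping first-seen order.
    vocab = list(dict.fromkeys(w.strip() for w in words if w.strip()))
    # 1) The phoneset is simply the word list.
    phoneset = "".join(f"{w}\n" for w in vocab)
    # 2) Each dictionary entry maps a word to itself.
    dictionary = "".join(f"{w} {w}\n" for w in vocab)
    return phoneset, dictionary


phoneset_text, dictionary_text = build_word_model_files(["ONE", "TWO", "ONE", "THREE"])
print(phoneset_text, end="")
print(dictionary_text, end="")
```

Each word then behaves as a single modelling unit, so the "pronunciation" of ONE is just the unit ONE.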
When training phoneme-based models instead, make sure every tied state has enough training examples: about 5 to 10 each.
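One way to apply this rule of thumb is to flag states that fall below the threshold. The sketch below assumes you already have a count of training examples per tied state; how those counts are obtained is not covered here, and the function name and the sample numbers are made up for illustration.

```python
def undertrained_states(example_counts, min_examples=5):
    """Return the ids of tied states with fewer than min_examples examples, sorted."""
    return sorted(s for s, n in example_counts.items() if n < min_examples)


# Hypothetical counts: tied-state id -> number of training examples.
counts = {0: 12, 1: 3, 2: 7, 3: 0}
flagged = undertrained_states(counts)
print(flagged)
```

Raising `min_examples` to 10 applies the stricter end of the 5 to 10 range.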
Training semi-continuous models: each HMM has 5 states, so 10,000 triphone models require the following:
5 states/triphone = 50,000 states
For a 4-stream feature set, each state has 4*256 = 1024 mixture weights, i.e. 1024 floating-point numbers per state (4 is the number of feature streams, 256 the number of mixture weights per stream)
= 205 MB buffer for 50,000 states (assuming 4 bytes per float)
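The figure of roughly 205 MB can be checked with a short calculation. This sketch assumes 4-byte (single-precision) floats; the float size is not stated above, but 4 bytes is what makes the total come out. The function and parameter names are illustrative.

```python
BYTES_PER_FLOAT = 4  # assumption: single-precision floats


def semi_continuous_buffer_bytes(n_triphones, states_per_hmm=5,
                                 n_streams=4, weights_per_stream=256):
    """Bytes needed to hold the mixture weights of every state."""
    n_states = n_triphones * states_per_hmm            # 10,000 * 5 = 50,000
    floats_per_state = n_streams * weights_per_stream  # 4 * 256 = 1024
    return n_states * floats_per_state * BYTES_PER_FLOAT


sc_bytes = semi_continuous_buffer_bytes(10_000)
print(f"{sc_bytes / 1e6:.1f} MB")  # prints "204.8 MB"
```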
Training continuous models, for 10,000 triphones:
5 states/triphone = 50,000 states
39 means (assuming a 39-component feature vector) and 39 variances per state = 79 floating-point numbers per state (as quoted; 39 + 39 alone gives 78, and the source does not say what the extra value is)
= 15.8 MB buffer for 50,000 states (assuming 4 bytes per float)
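The 15.8 MB figure can be reproduced as follows. The sketch takes the 79 floats per state exactly as quoted in the text and assumes 4-byte floats, which is not stated but is what the total implies. The function and parameter names are illustrative.

```python
BYTES_PER_FLOAT = 4  # assumption: single-precision floats


def continuous_buffer_bytes(n_triphones, states_per_hmm=5, floats_per_state=79):
    """Bytes needed to hold the per-state Gaussian parameters."""
    n_states = n_triphones * states_per_hmm  # 10,000 * 5 = 50,000
    return n_states * floats_per_state * BYTES_PER_FLOAT


cd_bytes = continuous_buffer_bytes(10_000)
print(f"{cd_bytes / 1e6:.1f} MB")  # prints "15.8 MB"
```

With these numbers the continuous model needs about 13 times less buffer space than the 4-stream, 256-weight semi-continuous setup.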
1. initialize()
2. createInitialLists()
3. recognize(nframes)
4. recognize():
Original article: http://www.cnblogs.com/lijieqiong/p/5151709.html