In sphinx4 and Sphincs, the acoustic model consists of a set of left-to-right Hidden Markov Models (HMMs), with one HMM per unit. The units typically represent phones in a triphone context. The following diagram illustrates how the HMMs of the acoustic model are defined in sphinx4:
[Figure: AM.gif]
In the drawing, any object shown as a "stack of cards" represents a shared pool of object instances. For example, there is a shared pool of Senones that are referred to by SenoneSequences. The set of shared pools allows the sphinx4 HMMs to support concepts known as "state tying" and "parameter tying." With state and parameter tying, different HMMs can refer to the same states and parameters instead of each holding its own copy. There are at least two reasons for tying: the primary reason is to obtain sufficiently trained models, since tied states pool their training data, and the second reason is to reduce the number of calculations during the search.
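The shared-pool idea can be illustrated with a small, self-contained sketch. All names here (TiedSenone, SenonePool, TiedHmm, TyingSketch) are hypothetical and are not the sphinx4 classes; the toy score and the per-frame cache are assumptions made only to show how two HMMs that refer to the same pooled state can reuse one calculation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of state tying; these are NOT the sphinx4 classes.
final class TiedSenone {
    final int id;
    int scoreCalls = 0;            // how many times the score was actually computed
    private double[] cachedFrame;  // frame object for which cachedScore is valid
    private double cachedScore;

    TiedSenone(int id) { this.id = id; }

    // Toy score (an assumption): negative squared distance of the frame from the id.
    double score(double[] frame) {
        if (frame != cachedFrame) { // identity check: same frame object, reuse result
            scoreCalls++;
            double sum = 0.0;
            for (double x : frame) {
                double d = x - id;
                sum -= d * d;
            }
            cachedFrame = frame;
            cachedScore = sum;
        }
        return cachedScore;
    }
}

// The "stack of cards": one instance per senone id, shared by every HMM.
final class SenonePool {
    private final Map<Integer, TiedSenone> pool = new HashMap<>();

    TiedSenone get(int id) { return pool.computeIfAbsent(id, TiedSenone::new); }

    int size() { return pool.size(); }
}

// An HMM holds references into the pool rather than private copies of its states.
final class TiedHmm {
    final String unit;
    final List<TiedSenone> states = new ArrayList<>();

    TiedHmm(String unit, SenonePool pool, int... senoneIds) {
        this.unit = unit;
        for (int id : senoneIds) {
            states.add(pool.get(id));
        }
    }
}

final class TyingSketch {
    public static void main(String[] args) {
        SenonePool pool = new SenonePool();
        // Two triphone-like units that differ only in their first state.
        TiedHmm a = new TiedHmm("AH(B,T)", pool, 1, 2, 3);
        TiedHmm b = new TiedHmm("AH(P,T)", pool, 4, 2, 3);

        double[] frame = {0.5, 1.5};
        for (TiedSenone s : a.states) s.score(frame);
        for (TiedSenone s : b.states) s.score(frame);

        System.out.println("distinct senones: " + pool.size());
        System.out.println("senone 2 computed " + pool.get(2).scoreCalls + " time(s)");
    }
}
```

In this sketch the two HMMs have six state slots between them but only four distinct pooled states, and each shared state is scored once per frame no matter how many HMMs refer to it.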
Each state of an HMM in sphinx4 is called a "Senone." Senones are based on probability density functions (pdfs). As shown in the illustration, the pdfs are continuous Gaussian Mixtures. The exact type of pdf, however, does not have to be a Gaussian Mixture; the pdf merely needs to be able to take a Feature from the front end and return a score. As illustrated by the MixtureComponent, each GaussianMixture
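The requirement that a pdf only needs to take a feature and return a score can be sketched as a one-method interface with a Gaussian mixture as one possible implementation. The names below (FeatureScorer, DiagonalGaussian, MixtureScorer) are hypothetical and not the sphinx4 API. The sketch also assumes diagonal covariances, natural-log scores, and a feature represented as a plain double array, none of which is stated in the text above.

```java
// Hypothetical sketch; names and signatures are illustrative, not the sphinx4 API.
interface FeatureScorer {
    // Takes one feature vector and returns a log-domain score.
    double logScore(double[] feature);
}

// One mixture component: a Gaussian with a diagonal covariance (an assumption).
final class DiagonalGaussian {
    private final double[] mean;
    private final double[] variance;
    private final double logNorm; // -0.5 * sum_i log(2 * pi * variance[i])

    DiagonalGaussian(double[] mean, double[] variance) {
        this.mean = mean.clone();
        this.variance = variance.clone();
        double acc = 0.0;
        for (double v : variance) {
            acc += Math.log(2.0 * Math.PI * v);
        }
        this.logNorm = -0.5 * acc;
    }

    double logDensity(double[] x) {
        double acc = 0.0;
        for (int i = 0; i < mean.length; i++) {
            double d = x[i] - mean[i];
            acc += d * d / variance[i];
        }
        return logNorm - 0.5 * acc;
    }
}

// A weighted sum of components, evaluated in the log domain.
final class MixtureScorer implements FeatureScorer {
    private final double[] logWeights;
    private final DiagonalGaussian[] components;

    MixtureScorer(double[] weights, DiagonalGaussian[] components) {
        this.logWeights = new double[weights.length];
        for (int k = 0; k < weights.length; k++) {
            this.logWeights[k] = Math.log(weights[k]);
        }
        this.components = components.clone();
    }

    @Override
    public double logScore(double[] feature) {
        // log(sum_k w_k * N_k(x)) via log-sum-exp to avoid underflow.
        double[] terms = new double[components.length];
        double max = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < components.length; k++) {
            terms[k] = logWeights[k] + components[k].logDensity(feature);
            max = Math.max(max, terms[k]);
        }
        double sum = 0.0;
        for (double t : terms) {
            sum += Math.exp(t - max);
        }
        return max + Math.log(sum);
    }
}

final class ScorerSketch {
    public static void main(String[] args) {
        DiagonalGaussian g1 = new DiagonalGaussian(new double[] {0.0}, new double[] {1.0});
        DiagonalGaussian g2 = new DiagonalGaussian(new double[] {3.0}, new double[] {1.0});
        FeatureScorer mixture =
            new MixtureScorer(new double[] {0.5, 0.5}, new DiagonalGaussian[] {g1, g2});

        // Any other pdf fits the same slot as long as it maps a feature to a score.
        FeatureScorer flat = feature -> 0.0;

        double[] feature = {0.2};
        System.out.println("mixture score: " + mixture.logScore(feature));
        System.out.println("flat score:    " + flat.logScore(feature));
    }
}
```

Because the caller depends only on the interface, a different pdf type could replace the mixture without changes to the code that consumes the scores.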

Last edited Oct 12, 2014 at 5:17 PM by DxN, version 3