In the previous sections, we used a syllable as an acoustic model. In this section, we shall decompose a syllable into phones and use each phone as an acoustic model. These phones are called monophones since they are context-independent, that is, each one is modeled without regard to the phones around it. If we have more training data and want to distinguish phone models in more detail, we can use so-called biphones, which are right-context dependent (RCD for short): each model also depends on the phone that follows it.
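To make the difference between monophones and right-context-dependent biphones concrete, here is a minimal sketch that relabels a phone sequence. The function name `to_right_biphones` and the phone sequence `s a n` are hypothetical and only for illustration; they are not taken from the pam or mlf files of this tutorial. The `p+r` naming follows the HTK convention for a phone `p` with right context `r`.

```python
def to_right_biphones(phones):
    """Relabel a monophone sequence as right-context-dependent biphones.

    Each phone p followed by a phone r becomes "p+r" (HTK-style naming).
    The last phone has no right neighbor, so it stays a plain monophone.
    """
    labels = []
    for i, phone in enumerate(phones):
        if i + 1 < len(phones):
            labels.append(f"{phone}+{phones[i + 1]}")
        else:
            labels.append(phone)
    return labels


# Hypothetical phone sequence for a single syllable (illustration only).
monophones = ["s", "a", "n"]
print(to_right_biphones(monophones))  # ['s+a', 'a+n', 'n']
```

Since every distinct pair of adjacent phones becomes its own model, the number of models grows quickly, which is why biphones usually call for more training data than monophones.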
The new pam file for using monophone acoustic models is digitMonophone.pam, as shown next:
In fact, we only need to replace digitSyl.pam with digitMonophone.pam; we can then follow the same training and test procedures covered in the previous sections to get the results, as shown in the following example:
The generated list of monophones is shown next:
The corresponding mlf file is shown next:
The following example uses 26-dimensional MFCC_E_D_Z:
The following example uses 39-dimensional MFCC_E_D_A_Z:
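As a side note on where the dimensions 26 and 39 come from, the following minimal sketch computes the feature vector size from an HTK-style parameter kind string. The helper `feature_dim` is hypothetical, and the default of 12 cepstral coefficients is an assumption (it matches HTK's default NUMCEPS and is consistent with the two dimensions quoted above). Only the qualifiers `_E`, `_0`, `_D`, `_A`, and `_Z` are handled.

```python
def feature_dim(target_kind, num_ceps=12):
    """Return the feature vector size for an HTK-style parameter kind.

    Assumption: num_ceps static cepstral coefficients (12 by default).
    _E adds log energy, _0 adds the 0th cepstral coefficient,
    _D appends delta coefficients, _A appends acceleration coefficients.
    _Z (cepstral mean subtraction) does not change the size.
    """
    qualifiers = set(target_kind.split("_")[1:])

    static = num_ceps
    if "E" in qualifiers:
        static += 1
    if "0" in qualifiers:
        static += 1

    # One copy of the static block, plus one block per derivative order.
    blocks = 1
    if "D" in qualifiers:
        blocks += 1
    if "A" in qualifiers:
        blocks += 1

    return static * blocks


print(feature_dim("MFCC_E_D_Z"))    # 26
print(feature_dim("MFCC_E_D_A_Z"))  # 39
```

In other words, 12 MFCCs plus log energy give 13 static features; adding deltas doubles this to 26, and adding acceleration coefficients brings it to 39.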