Testing

Now let's see how good our recognizer is. For testing we need most of the objects that are also needed for training, so we can use basically the same startup procedure as before; we can even omit the creation of the path and the HMM object. However, we do need some additional objects for the search, plus a vocabulary file and a language model. We already created the vocabulary file together with the dictionary in step 1.

The language model file is a bit more complicated. We will not discuss every line of a script that produces a language model for a given database; you can use the ready-to-run script from the scripts thread. The script there is rather simple: it uses an array lm, where lm(cnt) is the total count of all words, lm(1,v) contains the occurrence frequency of word v, and lm(2,v,w) contains the frequency of the bigram v w. After the lmUpdate procedure has been called, lm(p,1,v) contains the unigram probability of word v, and lm(p,2,v,w) contains the bigram probability of the bigram v w.
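To make that array layout concrete, here is a minimal sketch of the counting and normalization idea in plain Tcl. It only illustrates the data structure described above and is not the scripts-thread version; in particular it does no smoothing, so unseen bigrams simply get no entry:

# count one sentence (given as a list of words) into the lm array
proc lmAccu { words } {
    global lm
    if { ![info exists lm(cnt)] } { set lm(cnt) 0 }
    set prev ""
    foreach w $words {
        incr lm(cnt)
        if { ![info exists lm(1,$w)] } { set lm(1,$w) 0 }
        incr lm(1,$w)
        if { $prev != "" } {
            if { ![info exists lm(2,$prev,$w)] } { set lm(2,$prev,$w) 0 }
            incr lm(2,$prev,$w)
        }
        set prev $w
    }
}

# turn the raw counts into (unsmoothed) probabilities
proc lmUpdate { } {
    global lm
    # the split-on-comma parsing assumes words contain no commas
    foreach key [array names lm 1,*] {
        set v [lindex [split $key ,] 1]
        set lm(p,1,$v) [expr { double($lm(1,$v)) / $lm(cnt) }]
        foreach key2 [array names lm 2,$v,*] {
            set w [lindex [split $key2 ,] 2]
            set lm(p,2,$v,$w) [expr { double($lm(2,$v,$w)) / $lm(1,$v) }]
        }
    }
}

lmAccu would be called once per training transcription, and lmUpdate once at the end, before writing the probabilities to the language model file.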

You can find the complete test script in the scripts thread. On this page we will explain each step of it.

Once you have created a vocabulary file and a language model file, you can build the search objects:

# score with the complete Gaussian mixtures (no approximation)
sns setScoreFct base

# model sets used by the decoder to score the models during the search
PHMMSet phmms tpt [tpt.item(0) configure -name]
LCMSet lcms phmms
RCMSet rcms phmms

# search vocabulary, language model, and the mapping between them
SVocab voc dict
[LingKS lm NGramLM] load langmod2
SVMap svmap voc lm

voc read vocab
svmap map base

# mark $ and SIL as filler words
voc:$ configure -fTag 1
voc:SIL configure -fTag 1

svmap configure -unkString +UNINTELLIGIBLE+
svmap configure -lz 16 -wordPen 0 -filPen 0

# build the search network and the search pass object, and set the beams
STree stree svmap lcms rcms
LTree ltree stree
SPass spass stree ltree
spass configure -morphBeam 30 -stateBeam 50 -wordBeam 35

Here we set up all the necessary objects and do the necessary configuration for the decoder. First we tell the senone set that it should use a scoring function that evaluates the complete Gaussian mixture for every model. PHMMSet, LCMSet, and RCMSet are objects needed by the decoder to calculate the scores for the models during decoding. SVocab and SVMap define the search vocabulary and the mapping of the words in the search vocabulary onto words in the language model.

The linguistic knowledge source object lm is specified to be a statistical n-gram language model, loaded from the local file "langmod2". The -lz option of the svmap defines the weighting of the language model against the acoustic model: the higher this value, the more emphasis we put on the language model. For now you will just have to trust that 16 is a reasonable value; the same is true for the -wordPen and -filPen options. (A simple way to find good values yourself is to sweep over several candidates, as sketched at the end of this page.)

Finally we can create the actual search objects, which are built on top of the previously created objects svmap, lcms, and rcms. The configure command sets the pruning beams of the search object to reasonable values, without which the search wouldn't run. There are many other parameters, but for simplicity we will ignore them for the time being and just leave them at their default values.
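If you want to see what else can be tuned, most Janus objects report their options when configure is called without arguments; treat this as the usual convention rather than a guarantee for your installation. Widening the beams trades decoding speed for fewer pruning errors; the values below are purely illustrative, not tuned:

# print the current settings of the decoder object
spass configure

# wider beams: slower decoding, but fewer search (pruning) errors
spass configure -morphBeam 60 -stateBeam 100 -wordBeam 70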

Now we can start defining our search procedure:

proc testOne { utt } {
    # look up the utterance information in the database,
    # just as in the training scripts
    set uttinfo [db get $utt]
    makeArray infoArray $uttinfo
    # compute the features for this utterance
    fs eval $uttinfo
    # run the decoder and extract the best hypothesis
    spass run
    set hypo [spass.stab trace]
    puts "$utt: $hypo"
}

The $uttinfo variable is obtained in the same way as in the training scripts. The actual decoding is triggered by the run method of the spass object, which performs the search pass over the current utterance. The puts command then displays the recognized hypothesis.
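If you also want to keep the hypotheses for later scoring, a small variant can append them to a file. The procedure name testOneToFile and the file handle argument are made up for this sketch:

# like testOne, but also appends the hypothesis to an open file
proc testOneToFile { utt hypoFp } {
    set uttinfo [db get $utt]
    makeArray infoArray $uttinfo
    fs eval $uttinfo
    spass run
    set hypo [spass.stab trace]
    puts "$utt: $hypo"
    puts $hypoFp "$utt: $hypo"
}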

To test on the entire test set, we'd run the following loop:

set fp [open ../step1/testIDs r]
while { [gets $fp utt] != -1 } {
    puts "testing utterance $utt"
    testOne $utt
}
close $fp
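As promised above, here is a sketch of how one could look for a good language-model weight. The list of candidate values is an assumption; in practice you would score the hypotheses of each run against the reference transcriptions and keep the value with the lowest error rate:

# re-run the test loop for several language-model weights
foreach lz { 8 12 16 20 24 } {
    puts "decoding with -lz $lz"
    svmap configure -lz $lz
    set fp [open ../step1/testIDs r]
    while { [gets $fp utt] != -1 } {
        testOne $utt
    }
    close $fp
}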