Computing a first LDA

We will not discuss on this page what an LDA is good for and how it works; that is part of the theory and the discussion thread. Here we will only talk about how to do it with Janus. The full script for this step is here.

Start up a system as you did in the previous two steps. Then create an LDA object:

LDA lda fs FEAT+ 143 
foreach ds [dss:] { lda add $ds ; lda map [dss index $ds] -class $ds }

You have now established an LDA object named lda which uses the feature FEAT+ from the feature set fs. You have to provide the dimensionality of the feature for two reasons: first, the feature does not necessarily have to be known to the feature set already; second, it offers some safety by avoiding mix-ups of incompatible features.

After the LDA object is created, we have to tell it (using the add method) the classes that are to be discriminated. In our case we choose each codebook to be a class. Since at the moment we have exactly the same distributions as codebooks, we can just as well use the distribution names as classes. This has another advantage, which will be explained later. The lda map method tells the lda object which acoustic model index found in the path structure during training belongs to which class. As we did it here, we have assigned every distribution's index to the class of its distribution.
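As an illustration, assume a distribution named SIL whose index in the distribution set happens to be 42 (both the name and the index are made up here); for that distribution, one pass through the loop body above amounts to:

lda add SIL                ;# declare SIL as a class to be discriminated
lda map 42 -class SIL      ;# model index 42 in a path belongs to class SIL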

The training (accumulation of training data) for the LDA object is a loop over all training utterances:

foreach utt [db] {
    puts $utt                               ;# progress output
    set uttInfo [db get $utt]               ;# get the utterance's database entry
    makeArray arr $uttInfo                  ;# make its fields available as arr(...)
    fs eval $uttInfo                        ;# compute the features for this utterance
    hmm make $arr(TEXT) -optWord SIL        ;# build the utterance HMM with optional silence
    path bload ../step4/labels/$utt         ;# load the precomputed path (label file)
    path map hmm -senoneSet sns -stream 0   ;# remap its indices to the current models
    lda accu path                           ;# accumulate LDA statistics from the path
}
This loop should remind you of the way you wrote the label files: its body is similar to the Tcl Viterbi procedure. The most prominent difference is that we are not calling the viterbi method; instead, we are loading the already computed paths from the label files with the bload (binary-load) method of the path object. The path map command will fill the path's contents with the distribution indices as they would be used by the current HMM object. This is necessary because Janus does not rely on the indices of distributions, senones, etc. remaining the same across different systems or environments, so the stored path does not necessarily contain the distribution indices that we are using now. After the map method has finished, we can be sure that if the path object states that at frame f the distribution d was aligned, then d is indeed the index under which our current distribution set holds that distribution.

Now we can also see why it was an advantage to use distributions as classes for the LDA object: if we had used codebooks, we would have to do yet another mapping indirection.

Finally, the third prominent difference from the Viterbi alignment is the additional command lda accu path, which hands the just loaded path object to the lda object and tells it to extract all the training data it needs from it.
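In standard LDA notation (the exact accumulation conventions inside Janus may differ), what is being collected here are the statistics needed to form a total scatter matrix T and a within-class scatter matrix W:

T = \sum_x (x - \mu)(x - \mu)^T
W = \sum_c \sum_{x \in c} (x - \mu_c)(x - \mu_c)^T

where x runs over the accumulated feature vectors, \mu is their global mean, and \mu_c is the mean of class c.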

When the loop is finished, we tell the lda object to update its parameters:

lda update 
This is like telling the codebook set to update its vectors according to its accumulators. After the update is finished, the lda object offers two subobjects: a total scatter matrix lda.matrixT and a within-class scatter matrix lda.matrixW. The simdiag command can compute an eigenvalue matrix and an LDA transformation matrix out of them:
DMatrix A 
DMatrix K 
A simdiag K lda.matrixT lda.matrixW 
[FMatrix B] DMatrix A 
The last line of the above is simply a conversion from a matrix that uses double-precision floating-point numbers to one that uses single-precision numbers. Now we can save the LDA matrix to a file:
B bsave ldaMatrix 
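For orientation, in the standard formulation of simultaneous diagonalization (an assumption about simdiag's exact conventions), A is chosen such that

A W A^T = I   and   A T A^T = K

with K diagonal; equivalently, the rows of A solve the generalized eigenvalue problem T v = \lambda W v, and the diagonal of K holds the eigenvalues. This is why the diagonal of K is what we inspect in the plausibility checks below.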

Some Plausibility Checks

You can have a look at the eigenvalue matrix K by just typing its name. You should see that only the diagonal items have values significantly different from zero, and you should see that the eigenvalues are decreasing, i.e. that K(i,i) > K(j,j) for i < j. If this is not the case, then something must have gone wrong.
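If you want to check the decrease programmatically, the following sketch uses plain Tcl, under the assumption (verify it for your Janus version) that evaluating [K] returns the matrix as a Tcl list of row lists:

set i 0
set prev Inf
foreach row [K] {
    set ev [lindex $row $i]        ;# diagonal element K(i,i)
    if {$ev > $prev} { puts "eigenvalues not decreasing at row $i" }
    set prev $ev
    incr i
}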

When the LDA matrix is saved in a file, we should also save some other useful information. The LDA object also stores the occurrence counts for each class:

set fp [open ldaCounts w]
foreach cl [lda:] { puts $fp "{ $cl [lda:$cl.mean configure -count] }" }
close $fp
This way, we now have a file named ldaCounts which has an entry for each class (i.e. for each acoustic model), telling us how often it occurred in the training data. We can look at it to detect anomalies: if we find that only a few models have eaten up most of the counts, while many models have been seen very rarely or never, then something must have gone wrong, and we had better check our scripts. We will also use the counts file in a later development step.
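One quick way to spot such anomalies is to sort the file by count. The following sketch is plain Tcl (no Janus calls) and relies only on the line format written above; showing the ten rarest classes is an arbitrary choice:

set fp [open ldaCounts r]
set entries {}
while {[gets $fp line] >= 0} {
    lappend entries [lindex $line 0]   ;# strip the outer braces
}
close $fp
foreach e [lrange [lsort -real -index 1 $entries] 0 9] {
    puts [format "%-24s %10.1f" [lindex $e 0] [lindex $e 1]]
}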