Dominance based post-processing in ASR system
[ 1 ] Instytut Informatyki (II), Wydział Informatyki i Zarządzania, Politechnika Poznańska | [ P ] employee
2003
paper
english
EN
This paper presents different approaches to post-processing of N-best recognition hypotheses. When the N-best solutions are obtained from an HMM-based recognizer, segmental scores can be computed for each of them. Statistical modeling of a segment involves two models, one for correct and one for incorrect labeling; such a pair of models is built for each phoneme in the dictionary. The segmental score for a chosen feature (such as phoneme duration) is calculated as the total log likelihood normalized by the number of segments. In the described method there may be many segmental scores (coming from different features, possibly based on differently defined 'segments'). In that case, they have to be combined into one confidence score.
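The normalized segmental score described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the correct and incorrect models are combined, so the sketch assumes a per-segment log-likelihood ratio, assumes single Gaussians over phoneme duration, and uses hypothetical names (`segmental_score`, `correct_models`, `incorrect_models`).

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian (assumed model family)."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def segmental_score(segments, correct_models, incorrect_models):
    """Average per-segment log-likelihood ratio for one hypothesis.

    segments: list of (phoneme, duration) pairs from the alignment.
    correct_models / incorrect_models: dict phoneme -> (mean, std);
    hypothetical per-phoneme models for correct and incorrect labeling.
    """
    if not segments:
        raise ValueError("hypothesis has no segments")
    total = 0.0
    for phoneme, duration in segments:
        mean_c, std_c = correct_models[phoneme]
        mean_i, std_i = incorrect_models[phoneme]
        # Assumption: the two models enter as a log-likelihood ratio.
        total += gaussian_logpdf(duration, mean_c, std_c)
        total -= gaussian_logpdf(duration, mean_i, std_i)
    # Normalize by the number of segments, as the abstract describes.
    return total / len(segments)
```

Dividing by the number of segments keeps scores comparable between hypotheses that contain different numbers of phonemes.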
In this paper the examined features are phoneme duration and Markovian likelihood. The effectiveness of each segmental model is presented, and two methods of combining confidence scores are compared: statistical modeling and the dominance-based rough-set approach. It is shown that when all integrated scores are likelihoods (so that the "preference direction" is defined), the dominance-based approach can yield better results than Gaussian-mixture statistical modeling.
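The role of the "preference direction" can be illustrated with a small sketch of the dominance relation itself. It assumes that every score is a likelihood-type value where higher is better, and it only filters N-best hypotheses down to the non-dominated ones; it does not reproduce the rough-set rule induction that the dominance-based approach builds on this relation. The function names and the example scores are hypothetical.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b on every criterion
    and strictly better on at least one (higher is assumed to be better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated(hypotheses):
    """Return the names of hypotheses not dominated by any other one.

    hypotheses: dict name -> tuple of scores, e.g.
    (duration score, Markovian log likelihood).
    """
    return [
        name
        for name, scores in hypotheses.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in hypotheses.items()
            if other != name
        )
    ]


# Made-up scores for three hypotheses, for illustration only.
n_best = {
    "h1": (-1.0, -2.0),
    "h2": (-1.5, -2.5),  # worse than h1 on both criteria
    "h3": (-0.5, -3.0),  # better duration score, worse likelihood
}
print(non_dominated(n_best))  # ['h1', 'h3']
```

Because the comparison uses only the ordering of each score, no assumption about the joint distribution of the scores is needed, which is the contrast with Gaussian-mixture modeling drawn in the abstract.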
41 - 46