This thesis investigates automatic natural language understanding for spoken language systems. The proposed parsing method is general and flexible enough to be ported easily to different applications, domains and human languages.
Spoken language systems support unconstrained human-machine communication. They combine primary component technologies (such as speech recognition, natural language understanding and dialog processing) to understand the meaning of an input utterance. Natural language generation and/or speech synthesis are additionally required to build end-to-end systems that accomplish a given task.
Today’s state-of-the-art rule-based approaches to natural language understanding perform well in limited applications for specific languages. However, developing an understanding component manually from specific rules is costly, since each application and language requires its own adaptation or, in the worst case, a completely new implementation. To address this cost, this work uses statistical modeling techniques in place of the commonly used hand-written rules that convert the speech recognizer output into a semantic representation. The statistical models are derived from the automatic analysis of large corpora of utterances paired with their corresponding semantic representations. Porting the semantic analyzer to a different application or language therefore only requires training the component on application- and language-specific data sets, rather than translating and adapting a rule-based grammar by hand.
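As a rough illustration of the data-driven approach described above, the following minimal sketch trains a first-order hidden Markov model on word/label pairs and decodes new utterances with the Viterbi algorithm. It is not the model developed in the thesis: the HMM formulation, the add-one smoothing, the label inventory (null, from, origin, to, dest), the toy utterances and the function names train and decode are all assumptions made for illustration. The point is only that the same code can serve two languages once it is trained on the respective labeled data.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-utterance state


def train(corpus):
    """Count label transitions and word emissions.

    corpus: list of utterances, each a list of (word, label) pairs.
    """
    trans = defaultdict(Counter)  # previous label -> next-label counts
    emit = defaultdict(Counter)   # label -> word counts
    vocab = set()
    for utterance in corpus:
        prev = START
        for word, label in utterance:
            trans[prev][label] += 1
            emit[label][word] += 1
            vocab.add(word)
            prev = label
    return {
        "trans": trans,
        "emit": emit,
        "labels": sorted(emit),
        "vocab_size": len(vocab) + 1,  # +1 slot for unseen words
    }


def _logp(counter, key, n_outcomes):
    # Add-one smoothing keeps unseen events at a small non-zero probability.
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def decode(model, words):
    """Viterbi search for the most probable label sequence."""
    if not words:
        return []
    labels = model["labels"]
    trans, emit = model["trans"], model["emit"]
    n_labels, n_words = len(labels), model["vocab_size"]
    # best[label] = (log-probability, path) of the best path ending in label
    best = {
        lab: (_logp(trans[START], lab, n_labels)
              + _logp(emit[lab], words[0], n_words), [lab])
        for lab in labels
    }
    for word in words[1:]:
        new_best = {}
        for lab in labels:
            score, path = max(
                ((best[p][0] + _logp(trans[p], lab, n_labels), best[p][1])
                 for p in labels),
                key=lambda item: item[0],
            )
            new_best[lab] = (score + _logp(emit[lab], word, n_words),
                             path + [lab])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


def _pairs(words, labels):
    return list(zip(words.split(), labels.split()))


# Toy labeled data (invented for this sketch, not taken from ATIS or MASK).
english = [
    _pairs("flights from boston to denver", "null from origin to dest"),
    _pairs("show flights from dallas to boston",
           "null null from origin to dest"),
]
french = [
    _pairs("trains de paris à lyon", "null from origin to dest"),
    _pairs("horaires trains de lille à paris",
           "null null from origin to dest"),
]

# The same functions are reused; only the training data changes.
print(decode(train(english), "flights from denver to dallas".split()))
print(decode(train(french), "trains de lyon à lille".split()))
```

In this sketch, "porting" amounts to calling train on a different labeled corpus; no part of train or decode is language-specific.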
A stochastic method for natural language understanding was developed and applied to three tasks: the American ATIS (Air Travel Information Services) application, the French MASK (Multimodal-Multimedia Automated Service Kiosk) application and the English Spontaneous Speech Task (ESST). The ATIS and MASK tasks deal with information retrieval for air and train travel respectively, both instances of human-machine interaction. ESST deals with human-to-human interaction in which two people negotiate to schedule a meeting.
For ATIS, the corpora were semantically labeled by a rule-based component that was originally developed for French at the Laboratoire d’Informatique pour la Mécanique et les Sciences de l’Ingénieur (France) and ported to English during the course of this thesis. For MASK, the semantic labels were obtained by integrating the stochastic component into the labeling process using bootstrapping and manual correction. For ESST, the model parameters were trained on a corpus of semantic tree-based representations produced by the natural language understanding component of JANUS, a spontaneous speech-to-speech translation system developed in part at the University of Karlsruhe (Germany) and at Carnegie Mellon University (United States).
In direct comparison, the stochastic data-driven parser outperforms the rule-based method in terms of semantic accuracy and robustness. Furthermore, the semantic analyzer can be flexibly ported to new tasks, domains and languages. The strength of the method is that the same software can be used regardless of application and language; only the stochastic models are trained on the specific data sets. The human effort in component development and porting is therefore limited to data labeling, which is much simpler than designing, maintaining and extending grammar rules.
3 Fiches. DHS 2569. DHS. Mikroedition.
€ 50,1 (DM 98,00).
ISBN 3-8267-2569-7
Table of Contents
Introduction
Stochastically-based Case Frame Analysis
Conclusion
Bibliography