Multimodal Dialog Systems / Human-Machine-Interaction
Overview of the perceptual components in human-robot interaction
In everyday life, the number of services that machines provide to humans keeps increasing. How useful a technology proves depends largely on whether it can be controlled simply and effectively. Natural multimodal dialog based on speech is one of the most promising approaches.
Our research in multimodal human-machine interaction focuses on humanoid robots. We develop more humanlike interactions by integrating different modalities for input as well as output. Speech plays the dominant role, as it does in human-to-human interaction. But other cues such as gestures and body pose carry important information as well; for example, a user can combine speech and a pointing gesture to express an intention. Our goal is to make use of these additional channels as much as possible, and thereby to make interaction more reliable and more appealing. Another way to make the interaction more appealing is to process the user's input incrementally to reduce latency. This is based on the hypothesis that for a human judge the behavior is more important than the physical appearance.
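To illustrate how speech and a pointing gesture can jointly express an intention, the following is a minimal sketch and not the group's actual system. It assumes a simplified 2D scene; the names `SceneObject` and `resolve_reference` are hypothetical. It also assumes a trivial fusion rule: speech decides when it names exactly one object, otherwise the object closest in angle to the pointing ray is chosen.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class SceneObject:
    # Hypothetical scene representation: an object name and a 2D position.
    name: str
    x: float
    y: float


def resolve_reference(utterance, objects, pointing_origin=None, pointing_direction=None):
    """Pick the object the user most likely refers to.

    Assumed fusion rule (illustrative only):
    1. If the utterance names exactly one known object, speech decides.
    2. Otherwise, choose the candidate with the smallest angle to the pointing ray.
    3. Without a usable gesture, return None so the system can ask for clarification.
    """
    words = set(re.findall(r"\w+", utterance.lower()))
    named = [obj for obj in objects if obj.name.lower() in words]
    if len(named) == 1:
        return named[0]

    # Speech was ambiguous ("that") or matched several objects.
    candidates = named or list(objects)
    if not candidates or pointing_origin is None or pointing_direction is None:
        return None

    ox, oy = pointing_origin
    dx, dy = pointing_direction
    direction_length = math.hypot(dx, dy)
    if direction_length == 0:
        return None

    def angle_to(obj):
        # Angle between the pointing ray and the vector from the hand to the object.
        vx, vy = obj.x - ox, obj.y - oy
        distance = math.hypot(vx, vy)
        if distance == 0:
            return 0.0
        cosine = (vx * dx + vy * dy) / (distance * direction_length)
        return math.acos(max(-1.0, min(1.0, cosine)))

    return min(candidates, key=angle_to)
```

Returning `None` when neither channel is informative is a deliberate choice in this sketch: it leaves room for the dialog manager to start a clarification sequence instead of guessing.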
Knowledge representation and extraction of information from multimodal input. We analyze how the robot's knowledge of the world and of its own past actions can be represented in a way that facilitates expanding and maintaining it, as well as (re)creating it from input or transforming it into output.
In many if not most cases, the interaction with the user consists of a sequence of steps. We investigate the control of this sequence, known as dialog management. What are good ways of creating new sequences for new problems or finding optimal sequences for known problems? The autonomy of the system is of decisive importance. The system should be able to use the user's corrections to correct itself, and to determine when communication is needed and initiate it. This encompasses the verification of previously gathered information, e.g. clarification or error recovery sequences in ongoing interactions, or new interactions that resolve irregularities found in the knowledge base. Moreover, we investigate how we can equip humanoid robots with an episodic memory to enable conversation about past events and experiences. Specific challenges here include selecting the relevant information to keep in memory while discarding unnecessary events, as well as correctly retrieving information in response to a user query. The system must generate behavior that does not violate fundamental principles of communication, i.e. information should be ordered by importance and packed densely enough to be effective while remaining easy to take in.
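The two episodic-memory challenges named above, selecting what to keep and retrieving it for a user query, can be sketched as follows. This is an illustrative toy under stated assumptions, not the method used in our research: the classes `Episode` and `EpisodicMemory`, the `salience` score, and the threshold `min_salience` are hypothetical. Retrieval is approximated by simple word overlap, with ties broken in favor of the most recent event.

```python
import re
from dataclasses import dataclass, field


def _words(text):
    # Lowercased word set; punctuation is ignored.
    return set(re.findall(r"\w+", text.lower()))


@dataclass
class Episode:
    time: int          # Hypothetical discrete time stamp.
    description: str   # Textual summary of the event.
    salience: float    # Assumed relevance score in [0, 1].


@dataclass
class EpisodicMemory:
    min_salience: float = 0.5
    episodes: list = field(default_factory=list)

    def observe(self, episode):
        """Selection: keep only events that are salient enough. Returns True if stored."""
        if episode.salience < self.min_salience:
            return False
        self.episodes.append(episode)
        return True

    def recall(self, query, top_k=1):
        """Retrieval: rank stored episodes by word overlap with the query, newest first on ties."""
        query_words = _words(query)
        scored = []
        for episode in self.episodes:
            overlap = len(query_words & _words(episode.description))
            if overlap > 0:
                scored.append((overlap, episode.time, episode))
        scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
        return [episode for _, _, episode in scored[:top_k]]
```

A real system would likely replace the fixed threshold and the word overlap with learned relevance and semantic retrieval; the sketch only shows where those two decisions sit.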