Refine
Year of publication
- 2013 (4) (remove)
Document Type
- Conference Proceeding (3)
- Part of a Book (1)
Has Fulltext
- no (4)
Is part of the Bibliography
- no (4)
Institute
In this work methods are described, which are used for an individual adaption of a dialog system. Anyway, an automatic real-time capable visual user attention estimation for a face to face human machine interaction is described. Furthermore, an emotion estimation is presented, which combines a visual and an acoustic method. Both, the attention estimation and the visual emotion estimation based on Active Appearance Models (AAMs). Certainly, for the attention estimation Multilayer Perceptrons (MLPs) are used to map the Active Appearance Parameters (AAM-Parameters) onto the current head pose. Afterwards, the chronology of the head poses is classified as attention or inattention. In the visual emotion estimation the AAM-Parameter will be classified by a Support-Vector-Machine (SVM). The acoustic emotion estimation also use a SVM to classifies emotion related audio signal features into the 5 basis emotions (neutral, happy, sad, anger, surprise). Afterward, a Bayes network is used to combine the results of the visual and the acoustic estimation in the decision level. The visual attention estimation as well as the emotion estimation will be used in service robotic to allow a more natural and human like dialog. Furthermore, the human head pose is very efficient interpreted as head nodding or shaking by the use of adaptive statistical moments. Especially, the head movement of many demented people are restricted, so they often only use their eyes to look around. For that reason, this work examine a simple gaze estimation with the help of an ordinary webcam. Moreover, a full body user re-identification method is described, which allows an individual state estimation of several people for hight dynamic situations. In this work an appearance based method is described, which allows a fast people re-identification over a short time span to allow the usage of individual parameter.
In this paper, we describe a method to model human clothes for a later recognition by the use of RGB- and SWIR-cameras. A basic model is estimated during people detection and tracking. This model will be refined if the recognition is triggered. For the refining, several saliency maps are used to extract individual features. These individual features are located separately for any human body parts. The body parts are estimated by the use of a silhouette extraction combined with a skeleton estimation. In this way, the model describes the human clothes in a compact manner which allows the use of a simple and fast comparison method for people recognition. Such models can be used in security and service applications.