Refine
Year of publication
Document Type
- Conference Proceeding (63) (remove)
Is part of the Bibliography
- no (63)
Keywords
We present a pipeline for recognizing dynamic freehand gestures on mobile devices based on extracting depth information coming from a single Time-of-Flight sensor. Hand gestures are recorded with a mobile 3D sensor, transformed frame by frame into an appropriate 3D descriptor and fed into a deep LSTM network for recognition purposes. LSTM being a recurrent neural model, it is uniquely suited for classifying explicitly time-dependent data such as hand gestures. For training and testing purposes, we create a small database of four hand gesture classes, each comprising 40 × 150 3D frames. We conduct experiments concerning execution speed on a mobile device, generalization capability as a function of network topology, and classification ability ‘ahead of time’, i.e., when the gesture is not yet completed. Recognition rates are high (>95%) and maintainable in real-time as a single classification step requires less than 1 ms computation time, introducing freehand gestures for mobile systems.
We present a system for 3D hand gesture recognition based on low-cost time-of-flight(ToF) sensors intended for outdoor use in automotive human-machine interaction. As signal quality is impaired compared to Kinect-type sensors, we study several ways to improve performance when a large number of gesture classes is involved. Our system fuses data coming from two ToF sensors which is used to build up a large database and subsequently train a multilayer perceptron (MLP). We demonstrate that we are able to reliably classify a set of ten hand gestures in real-time and describe the setup of the system, the utilised methods as well as possible application scenarios.
In this contribution we present a novel approach to transform data from time-of-flight (ToF) sensors to be interpretable by Convolutional Neural Networks (CNNs). As ToF data tends to be overly noisy depending on various factors such as illumination, reflection coefficient and distance, the need for a robust algorithmic approach becomes evident. By spanning a three-dimensional grid of fixed size around each point cloud we are able to transform three-dimensional input to become processable by CNNs. This simple and effective neighborhood-preserving methodology demonstrates that CNNs are indeed able to extract the relevant information and learn a set of filters, enabling them to differentiate a complex set of ten different gestures obtained from 20 different individuals and containing 600.000 samples overall. Our 20-fold cross-validation shows the generalization performance of the network, achieving an accuracy of up to 98.5% on validation sets comprising 20.000 data samples. The real-time applicability of our system is demonstrated via an interactive validation on an infotainment system running with up to 40fps on an iPad in the vehicle interior.
Pedestrian movement analysis at airports - videobased analysis across multiple camera systems
(2013)
Analyse dynamischer Szenen
(1999)
In diesem Artikel wird die Analyse dynamischer Szenen im Rahmen einer flexiblen Architektur zur Lösung von Fahrerassistenzaufgaben in Kraftfahrzeugen vorgestellt. Die Lösung unterschiedlicher Aufgaben mit verwandten Ansätzen bedingt einen hohen Grad an Modularität und Flexibilität. Nur so können die gestellten Aufgaben mit den vorhandenen Algorithmen optimal gelöst werden. In der vorgestellten Architektur wird eine objektbezogene Analyse von Sensordaten, eine verhaltensbasierte Szeneninterpretation und eine Verhaltensplanung durchgeführt. Eine globale Wissensbasis, auf der jedes einzelne Modul arbeitet, beinhaltet die Beschreibung physikalischer Zusammenhänge, Verhaltensregeln für den Straßenverkehr, sowie Objekt- und Szenenwissen.
Externes Wissen (z.B. GPS – Global Positioning System) kann ebenfalls in die Wissensbasis eingebunden werden. Als Anwendungsbeispiel der Verhaltensplanung ist ein intelligenter Tempomat realisiert.
Touch versus mid-air gesture interfaces in road scenarios-measuring driver performance degradation
(2016)
We present a study aimed at comparing the degradation of the driver's performance during touch gesture vs mid-air gesture use for infotainment system control. To this end, 17 participants were asked to perform the Lane Change Test. This requires each participant to steer a vehicle in a simulated driving environment while interacting with an infotainment system via touch and mid-air gestures. The decrease in performance is measured as the deviation from an optimal baseline. This study concludes comparable deviations from the baseline for the secondary task of infotainment interaction for both interaction variants. This is significant as all participants are experienced in touch interaction, however have had no experience at all with mid-air gesture interaction, favoring mid-air gestures for the long-term scenario.