Neuromorphic Vision: Asynchronous and Sparse Processing of Event Camera Data
Abstract
Modern systems such as edge devices, autonomous vehicles and robots operate under tight constraints on processing power and battery energy, making computational efficiency a critical bottleneck. This demand is compounded by the need for low latency: in autonomous robotics, swift response times are imperative to prevent accidents and to ensure seamless human-robot collaboration, while surveillance systems require real-time threat assessment. Consequently, minimizing both computation and latency has emerged as a paramount concern across applications ranging from autonomous robots to surveillance. In such contexts, fast and lightweight perception systems are essential for timely and safe responses.
Traditional perception solutions often rely on conventional cameras that capture dense intensity frames at a fixed rate, leading to computationally intensive and slow vision algorithms. While these approaches remain effective in many applications, their limitations become evident in scenarios where rapid, resource-constrained perception is crucial. Event-based cameras, also known as neuromorphic sensors, have attracted interest for over 15 years due to their unique ability to asynchronously capture sparse data triggered by brightness changes in the scene or by ego-motion. This asynchronous nature results in significantly reduced computational overhead and minimal latency, enabling ultra-fast response times and high efficiency in dynamic scenes.
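To make this contrast concrete, the sketch below (an illustration we provide, not part of the thesis's contributions) compares one second of dense frame data with a typical asynchronous event stream, where each event is a tuple (x, y, t, p) recording a brightness change of polarity p at pixel (x, y) and timestamp t. The sensor resolution, frame rate and event rate are illustrative assumptions.

```python
import numpy as np

# Conventional camera: dense frames at a fixed rate. At 30 fps on a
# 640x480 sensor, every pixel is read out every frame, whether or not
# anything in the scene changed.
frames = np.zeros((30, 480, 640), dtype=np.uint8)   # one second of video
print(frames.size)                                  # 9,216,000 pixel values

# Event camera: an asynchronous stream of (x, y, t, p) tuples, emitted
# only where and when the brightness changes. A moderately dynamic scene
# might yield ~100k events per second (illustrative figure); static
# regions contribute nothing.
event_dtype = np.dtype([("x", np.uint16), ("y", np.uint16),
                        ("t", np.int64),  ("p", np.int8)])
events = np.zeros(100_000, dtype=event_dtype)       # one second of events
print(events.size)                                  # 100,000 events
```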
Nevertheless, a significant challenge in event camera perception arises from the inherent difference between event data and conventional camera data. Dense, synchronous algorithms designed for standard cameras cannot be directly applied to sparse, asynchronous event streams. Hence, to unlock the full potential of this transformative sensing technology, there is a pressing need for specialized algorithms tailored to the unique characteristics of event cameras. This thesis tackles this challenge by proposing event camera-centric algorithms for i) task-agnostic pre-processing and ii) task-specific processing of event camera data. Our algorithms empower intelligent systems to perceive the world with computational efficiency and low latency.
First, we present novel task-agnostic pre-processing techniques that enhance the compatibility between event camera data and well-established frame-based vision algorithms, opening up new possibilities for seamlessly integrating event cameras with conventional vision pipelines. These approaches centre on novel data representations designed to honour the asynchronous and sparse nature of event data; by exploiting sparsity and asynchronous operations, they drastically reduce computational overhead. We demonstrate the efficacy of these representations across a wide spectrum of event vision tasks, including object recognition, action recognition, anomaly detection and gesture recognition.
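The representations proposed in the thesis are its contribution and are not reproduced here; as a generic illustration of the underlying idea of converting events into a tensor that a conventional network can consume, the sketch below builds a simple two-channel, per-polarity event-count image, a standard baseline representation. The helper name and the synthetic event window are our assumptions.

```python
import numpy as np

event_dtype = np.dtype([("x", np.uint16), ("y", np.uint16),
                        ("t", np.int64),  ("p", np.int8)])

def events_to_count_image(events, height, width):
    """Accumulate a window of events into a two-channel (one per
    polarity) count image that a standard CNN can consume."""
    img = np.zeros((2, height, width), dtype=np.float32)
    on = events["p"] > 0
    # np.add.at accumulates correctly when (y, x) indices repeat.
    np.add.at(img[0], (events["y"][on], events["x"][on]), 1.0)
    np.add.at(img[1], (events["y"][~on], events["x"][~on]), 1.0)
    return img

# Synthetic 10 ms window of events on a 480x640 sensor.
rng = np.random.default_rng(0)
ev = np.zeros(5000, dtype=event_dtype)
ev["x"] = rng.integers(0, 640, ev.size)
ev["y"] = rng.integers(0, 480, ev.size)
ev["t"] = np.sort(rng.integers(0, 10_000, ev.size))
ev["p"] = rng.choice([-1, 1], ev.size)
print(events_to_count_image(ev, 480, 640).shape)  # (2, 480, 640)
```

Such a representation trades the microsecond timing of individual events for compatibility with frame-based networks; the thesis's representations aim to retain sparsity and asynchrony instead.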
Second, we present task-specific, asynchronous, event-only processing algorithms, opening new avenues for denoising, noise modelling, human instance segmentation and road segmentation, which are key applications on resource-constrained platforms. Denoising and noise modelling improve data quality and system robustness by providing a better understanding of the noise present in the data; this knowledge can then be leveraged to design processing and analysis strategies that work with the noise rather than simply filtering it out. By focusing on human instance segmentation and road segmentation, we pave the way for more reliable and effective vision systems that meet the needs of modern applications in surveillance, autonomous driving and robotics. These techniques operate seamlessly in asynchronous, sparse space, enabling machines to perceive their environment with improved efficiency and speed, and they bring a synergy of statistical techniques and deep learning with event-based sensing.
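Again, the thesis's denoising and noise-modelling algorithms are not reproduced here; as a generic illustration of asynchronous, event-wise denoising, the sketch below implements the classic background-activity filter, which exploits the spatiotemporal correlation of genuine events: a signal event usually has a neighbouring pixel that fired recently, whereas shot noise is isolated. The window dt_us is an illustrative parameter.

```python
import numpy as np

def background_activity_filter(events, height, width, dt_us=5000):
    """Classic spatio-temporal correlation denoiser (a standard baseline,
    not the thesis's algorithm): keep an event only if one of its eight
    spatial neighbours fired within the last dt_us microseconds.
    Isolated events, typically shot noise, are discarded. The stream is
    processed one event at a time, preserving its asynchrony.
    """
    last_ts = np.full((height, width), -np.inf)  # last event time per pixel
    keep = np.zeros(len(events), dtype=bool)
    for i, e in enumerate(events):
        x, y, t = int(e["x"]), int(e["y"]), int(e["t"])
        y0, y1 = max(y - 1, 0), min(y + 2, height)
        x0, x1 = max(x - 1, 0), min(x + 2, width)
        neighbourhood = last_ts[y0:y1, x0:x1]
        # Support from any recent neighbour, excluding this pixel itself.
        recent = (t - neighbourhood) <= dt_us
        recent[y - y0, x - x0] = False
        keep[i] = recent.any()
        last_ts[y, x] = t
    return events[keep]
```

Because the filter touches only a 3x3 neighbourhood per event, its cost scales with the event rate rather than the sensor resolution, matching the efficiency argument made above.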