Bitrate Reduction Techniques for Low-Complexity Surveillance Video Coding
Abstract
High-resolution surveillance cameras are invaluable resources for effective crime prevention and forensic investigation. However, the growing communication bandwidth requirements of high-definition surveillance video severely limit the number of cameras that can be deployed. A higher bitrate also increases operating expenses through higher data communication and storage costs. Hence, it is essential to develop low-complexity algorithms that reduce the data rate of the compressed video stream without affecting image fidelity. In this thesis, a computer-vision-aided H.264 surveillance video encoder and four associated algorithms are proposed to reduce the bitrate. The proposed techniques are (I) speeded-up foreground segmentation, (II) skip decision, (III) reference frame selection and (IV) face Region-of-Interest (ROI) coding.
In the first part of the thesis, a modification to the adaptive Gaussian Mixture Model (GMM) based foreground segmentation algorithm is proposed to reduce computational complexity. This is achieved by replacing expensive floating-point computations with low-cost integer operations. To maintain accuracy, the GMM weight parameter is updated in floating point only periodically, using the value of an integer counter. Experiments show speedups in the range of 1.33 to 1.44 on standard video datasets where a large fraction of pixels are multimodal.
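The counter-based weight update can be illustrated with a minimal sketch. It assumes the standard per-frame GMM weight recursion w ← (1 − α)·w + α·m, where m is 1 when the component matches the pixel and 0 otherwise, and it replaces that recursion with one floating-point update every `period` frames. The function names, the `period` parameter and the specific periodic formula are assumptions made for illustration; the abstract does not state the exact update rule used in the thesis.

```python
def exact_weight(w0, matches, alpha):
    """Reference: one floating-point weight update per frame."""
    w = w0
    for m in matches:
        w = (1.0 - alpha) * w + alpha * m
    return w


def periodic_weight(w0, matches, alpha, period):
    """Sketch: count matches with an integer, update the weight once per period.

    Assumes len(matches) is a multiple of `period`; a trailing partial
    period would be left in the counter and not applied.
    """
    w = w0
    count = 0  # integer match counter, the only per-frame work
    for i, m in enumerate(matches, start=1):
        count += m
        if i % period == 0:
            decay = (1.0 - alpha) ** period
            # Treat the counted matches as evenly spread over the period.
            w = decay * w + (1.0 - decay) * (count / period)
            count = 0
    return w


if __name__ == "__main__":
    frames = [1, 1, 0, 1] * 8  # hypothetical match history for one component
    print(exact_weight(0.3, frames, alpha=0.01))
    print(periodic_weight(0.3, frames, alpha=0.01, period=8))
```

When the component matches in every frame or in no frame of a period, this periodic update agrees with the per-frame recursion; for mixed histories it is an approximation whose error grows with α and with the period length.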
In the second part, we propose a skip decision technique that uses a spatial sampler to select a subset of pixels. The sampled pixels are segmented using the speeded-up GMM algorithm. The storage pattern of the GMM parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. In the third part, a reference frame selection algorithm is proposed to maximize the number of background Macroblocks (MBs), i.e. MBs that contain background image content, in the Decoded Picture Buffer. This reduces the cost of coding uncovered background regions. Distortion over foreground pixels is measured to quantify the performance of the skip decision and reference frame selection techniques. Experimental results show bitrate savings of up to 94.5% over methods proposed in the literature on video surveillance datasets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence.
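The sampled skip decision can be sketched as follows. The sketch assumes 16x16 macroblocks, a regular sampling grid with a hypothetical `step` parameter, and a rule that marks an MB as Skip only when none of its sampled pixels is classified as foreground. The `is_foreground` callable stands in for the GMM classifier; the names and the decision rule are illustrative assumptions, not the thesis implementation.

```python
MB_SIZE = 16  # H.264 macroblock size in luma pixels


def skip_map(is_foreground, width, height, step=4):
    """Return a 2D list of booleans: True where the MB may be coded as Skip.

    Only every `step`-th pixel in each direction is classified, so the
    segmentation cost per MB drops from 256 pixels to (16 / step) ** 2.
    """
    rows = []
    for my in range(height // MB_SIZE):
        row = []
        for mx in range(width // MB_SIZE):
            has_foreground = any(
                is_foreground(x, y)
                for y in range(my * MB_SIZE, (my + 1) * MB_SIZE, step)
                for x in range(mx * MB_SIZE, (mx + 1) * MB_SIZE, step)
            )
            row.append(not has_foreground)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical 32x32 frame with a small moving object in the top-right MB.
    def moving_object(x, y):
        return 16 <= x < 24 and 0 <= y < 8

    for row in skip_map(moving_object, 32, 32):
        print(row)
```

A coarser `step` lowers the segmentation cost but can miss foreground objects smaller than the sampling interval, so the step size trades complexity against the risk of skipping an MB that contains foreground.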
In the final part of the thesis, face and shadow region detection is combined with the skip decision algorithm to perform ROI coding for pedestrian surveillance videos. Since person identification requires high-quality face images, MBs containing face image content are encoded with a low Quantization Parameter (QP) setting (i.e. high quality). Other regions of the body in the image are considered regions of reduced interest (RORI) and are encoded at low quality. The shadow regions are marked as Skip. Techniques that use only facial features to detect faces (e.g. the Viola-Jones face detector) are not robust in real-world scenarios. Hence, we propose to first detect pedestrians using deformable part models. The face region is determined using the deformed part locations. Detected pedestrians are tracked using an optical-flow-based tracker combined with a Kalman filter. The tracker improves accuracy and also avoids the need to run the object detector on already detected pedestrians. Shadow and skin detector scores are computed over superpixels. Bilattice-based logic inference is used to combine multiple likelihood scores and classify the superpixels as ROI, RORI or RONI. The coding mode and QP values of the MBs are determined using the superpixel labels. The proposed techniques provide a further reduction in bitrate of up to 50.2%.
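The mapping from superpixel labels to a coding mode and QP per MB can be sketched as below. The sketch assumes that an MB takes the most important label among the superpixels it overlaps, that RONI MBs are coded as Skip, and that ROI and RORI MBs receive a low and a high QP respectively. The QP values 24 and 36, the priority rule and the function name are hypothetical choices for illustration; the abstract does not give the actual values or rule.

```python
# Higher number means more important for identification.
PRIORITY = {"RONI": 0, "RORI": 1, "ROI": 2}


def mb_decision(labels_in_mb, qp_roi=24, qp_rori=36):
    """Pick a coding mode and QP for one MB from its superpixel labels.

    Returns ("SKIP", None) or ("CODE", qp). H.264 QP values range from
    0 to 51, and a lower QP means finer quantization (higher quality).
    """
    top = max(labels_in_mb, key=PRIORITY.__getitem__)
    if top == "ROI":
        return ("CODE", qp_roi)
    if top == "RORI":
        return ("CODE", qp_rori)
    return ("SKIP", None)


if __name__ == "__main__":
    print(mb_decision(["RONI", "ROI", "RORI"]))  # MB touching a face region
    print(mb_decision(["RONI", "RORI"]))         # MB on the body only
    print(mb_decision(["RONI"]))                 # background or shadow MB
```

Taking the highest-priority label keeps an MB at high quality whenever any part of a face falls inside it, at the cost of spending some extra bits on boundary MBs.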