Bitrate Reduction Techniques for Low-Complexity Surveillance Video Coding

Gorur, Pushkar

dc.contributor.advisor	Amrutur, Bharadwaj
dc.contributor.author	Gorur, Pushkar
dc.date.accessioned	2017-09-26T06:07:41Z
dc.date.accessioned	2018-07-31T04:48:56Z
dc.date.available	2017-09-26T06:07:41Z
dc.date.available	2018-07-31T04:48:56Z
dc.date.issued	2017-09-26
dc.date.submitted	2016
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/2681
dc.identifier.abstract	http://etd.iisc.ac.in/static/etd/abstracts/3502/G27564-Abs.pdf	en_US
dc.description.abstract	High-resolution surveillance video cameras are invaluable resources for effective crime prevention and forensic investigations. However, the increasing communication bandwidth requirements of high-definition surveillance videos severely limit the number of cameras that can be deployed. Higher bitrates also increase operating expenses through higher data communication and storage costs. Hence, it is essential to develop low-complexity algorithms that reduce the data rate of the compressed video stream without affecting image fidelity. In this thesis, a computer-vision-aided H.264 surveillance video encoder and four associated algorithms are proposed to reduce the bitrate. The proposed techniques are (I) speeded-up foreground segmentation, (II) skip decision, (III) reference frame selection and (IV) face Region-of-Interest (ROI) coding. In the first part of the thesis, a modification to the adaptive Gaussian Mixture Model (GMM) based foreground segmentation algorithm is proposed to reduce computational complexity. This is achieved by replacing expensive floating-point computations with low-cost integer operations. To maintain accuracy, we compute periodic floating-point updates for the GMM weight parameter using the value of an integer counter. Experiments show speedups in the range of 1.33 to 1.44 on standard video datasets where a large fraction of pixels are multimodal. In the second part, we propose a skip decision technique that uses a spatial sampler to sample pixels. The sampled pixels are segmented using the speeded-up GMM algorithm. The storage pattern of the GMM parameters in memory is also modified to improve cache performance. Skip selection is performed using the segmentation results of the sampled pixels. In the third part, a reference frame selection algorithm is proposed to maximize the number of background macroblocks (MBs), i.e. MBs that contain background image content, in the Decoded Picture Buffer. 
This reduces the cost of coding uncovered background regions. Distortion over foreground pixels is measured to quantify the performance of the skip decision and reference frame selection techniques. Experimental results show bitrate savings of up to 94.5% over methods proposed in the literature on video surveillance datasets. The proposed techniques also provide up to 74.5% reduction in compression complexity without increasing the distortion over the foreground regions in the video sequence. In the final part of the thesis, face and shadow region detection is combined with the skip decision algorithm to perform ROI coding for pedestrian surveillance videos. Since person identification requires high-quality face images, MBs containing face image content are encoded with a low Quantization Parameter (QP) setting (i.e. high quality). Other regions of the body in the image are considered Regions of Reduced Interest (RORI) and are encoded at low quality. The shadow regions are marked as Skip. Techniques that use only facial features to detect faces (e.g. the Viola-Jones face detector) are not robust in real-world scenarios. Hence, we propose to initially detect pedestrians using deformable part models. The face region is determined using the deformed part locations. Detected pedestrians are tracked using an optical-flow-based tracker combined with a Kalman filter. The tracker improves the accuracy and also avoids the need to run the object detector on already detected pedestrians. Shadow and skin detector scores are computed over superpixels. Bilattice-based logic inference is used to combine multiple likelihood scores and classify the superpixels as ROI, RORI or RONI (Region of Non-Interest). The coding mode and QP values of the MBs are determined using the superpixel labels. The proposed techniques provide a further reduction in bitrate of up to 50.2%.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	G27564	en_US
dc.subject	Bitrate Reduction	en_US
dc.subject	Surveillance Video Coding	en_US
dc.subject	Video Surveillance	en_US
dc.subject	Gaussian Mixture Model (GMM)	en_US
dc.subject	Pedestrian Surveillance Cameras	en_US
dc.subject	Region of Interest (ROI) Video Coding	en_US
dc.subject	Surveillance Video Cameras	en_US
dc.subject	Video Coding	en_US
dc.subject	Encoding	en_US
dc.subject	Computational Complexity Reduction	en_US
dc.subject	H.264 Surveillance Coding	en_US
dc.subject	Gaussian Mixture Model Algorithm	en_US
dc.subject.classification	Electrical Communication Engineering	en_US
dc.title	Bitrate Reduction Techniques for Low-Complexity Surveillance Video Coding	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.discipline	Faculty of Engineering	en_US