
dc.contributor.advisor: Ramakrishnan, K R
dc.contributor.author: Jain, Anurag
dc.date.accessioned: 2021-03-19T06:38:55Z
dc.date.available: 2021-03-19T06:38:55Z
dc.date.submitted: 2006
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/4990
dc.description.abstract: Determining perceptually irrelevant and redundant information from the human point of view is one of the fundamental problems limiting the performance of current video compression algorithms. The performance of the existing video compression standards rests on minimizing the cumulative sum of an objective distortion, namely the mean squared error (MSE), measured at each pixel. Recently, quite a few advances have been made in understanding human visual models and applying them to compact representation at very low bitrates. However, most of these approaches offer advantages only over a very limited range of input sequences, relying on predefined models for the analysis of static scenes, the human head, and the human body. The existing video compression standards typically aim to increase the spectral flatness measure of the residue signal by increasing the number of both spatial and temporal predictors. As the number of candidate predictors grows, the bits required to convey the choice of predictor to the decoder also grow. This mandates jointly optimizing the distortion and the required side information for a given quantization factor using special rate-distortion measures.

This thesis suggests an alternative solution that removes perceptual redundancy without increasing the number of predictors, using two approaches. The first is to increase the spectral flatness measure by removing perceptually irrelevant residual information. The second is to model the perceptually relevant residual information lost to quantization and to parameterize it so that it can be synthesized at the decoder. This revolves around two analysis and estimation problems: identifying the perceptually irrelevant quantization noise and removing it from the resulting source, and modeling the perceptually relevant quantization noise.

The first contribution of this dissertation is to classify regions as homogeneous / non-homogeneous and rigid / non-rigid based on different perceptual cues such as variance, edges, color, and motion. The quantization noise in each region is shaped differently to ensure minimal perceptual quality degradation. At very low bitrates, rigid regions with small residue errors result in AC coefficients of small magnitude. These coefficients, which typically get quantized to zero, are regenerated / synthesized at the decoder using statistical characteristics of the temporal predictors. The regions are coarsely segmented based on edge, color, and motion descriptors. Regions with rigid texture are optimized more for rate than for distortion, using higher values of the quantization parameter.

The second contribution of this dissertation is the identification and representation of non-rigid textured regions, such as grass and flowing water, with a dense motion vector field (DMVF) instead of the conventional motion-compensated signal. The analysis part consists of identifying such regions and classifying macroblocks into rigid and non-rigid homogeneous textures. The DMVF is computed only for the macroblocks classified as non-rigid textured regions. A replacement technique substitutes a block of texture pixels with a block of motion vectors, which are then differentially coded using causal neighbors and context adaptive binary arithmetic coding (CABAC). As part of the texture synthesis, the decoder simply decodes these motion vectors, regenerates the DMVF, and compensates each pixel individually using the regenerated DMVF. The remaining macroblocks, which are not classified as homogeneous texture (rigid or non-rigid), are coded using a conventional H.264 encoder. Although the underlying techniques are generic enough to be augmented onto any video standard, we specifically picked the H.264 video compression standard because it is the current state of the art. We compare the coding approaches using the NTIA model as an objective measure of subjective quality. Compared with the H.264 standard-compliant JM encoder developed by the Joint Video Team (JVT), our techniques achieve bit-rate savings of around 15%. (Illustrative sketches of the spectral flatness measure, the causal motion-vector prediction, and the per-pixel DMVF compensation follow this record.)

The chapters of this dissertation are organized as follows. Chapter 1 introduces the features of the H.264 standard and the improvements made over existing video standards such as MPEG-2 and H.263, and highlights some of the techniques published for reducing computational complexity to enable real-time implementation of encoders. Chapter 2 presents a literature survey of existing techniques that use perceptual criteria for video coding. Chapter 3 highlights some of the limitations of the schemes reported in the literature, followed by the contributions made in the present work to overcome these limitations. Experimental results are presented in Chapter 4, and the thesis is concluded in Chapter 5, which outlines some of the future work that could be carried out in this direction. [en_US]
dc.language.iso: en_US
dc.relation.ispartofseries: G21040;
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. [en_US]
dc.subject: Video Recording [en_US]
dc.subject: Data Compression [en_US]
dc.subject: Texture Analysis [en_US]
dc.subject: Perceptual Cues [en_US]
dc.subject: Dense Motion Vector Field (DMVF) [en_US]
dc.subject: Context Adaptive Binary Arithmetic Coding (CABAC) [en_US]
dc.subject: Content Based Texture Coding (CBTC) [en_US]
dc.subject.classification: Electrical Engineering [en_US]
dc.title: Content-Based Texture Analysis and Synthesis for Low Bit-Rate Video Coding Using Perceptual Models [en_US]
dc.type: Thesis [en_US]
dc.degree.name: MTech (Res) [en_US]
dc.degree.level: Masters [en_US]
dc.degree.grantor: Indian Institute of Science [en_US]
dc.degree.discipline: Engineering [en_US]
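The abstract refers to increasing the spectral flatness measure (SFM) of the residue signal. For reference (this definition is standard signal-processing background, not quoted from the thesis), the SFM is the ratio of the geometric mean to the arithmetic mean of the power spectrum P(k) of the residue; it approaches 1 for a noise-like, fully decorrelated residue and 0 for a strongly correlated one:

\[
\mathrm{SFM} \;=\; \frac{\left(\prod_{k=0}^{N-1} P(k)\right)^{1/N}}{\frac{1}{N}\sum_{k=0}^{N-1} P(k)},
\qquad 0 \le \mathrm{SFM} \le 1 .
\]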
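The abstract describes differential coding of the dense motion vector field using causal neighbors followed by CABAC. Below is a minimal sketch of the prediction step only, assuming a component-wise median of the left, top, and top-right neighbors (the predictor H.264 uses for motion vectors); the CABAC binarization and coding stage is omitted, and all function and variable names are illustrative rather than taken from the thesis.

import numpy as np

def mv_prediction_residuals(mv_field):
    # mv_field: array of shape (H, W, 2) holding one (dy, dx) vector per
    # pixel of a non-rigid textured region. Returns the prediction residuals;
    # the decoder reverses this by adding the same causal prediction back to
    # each decoded residual before regenerating the DMVF.
    H, W, _ = mv_field.shape
    residuals = np.zeros((H, W, 2), dtype=float)
    for y in range(H):
        for x in range(W):
            # Causal neighbors (already coded): left, top, top-right.
            left = mv_field[y, x - 1] if x > 0 else np.zeros(2)
            top = mv_field[y - 1, x] if y > 0 else np.zeros(2)
            top_right = mv_field[y - 1, x + 1] if (y > 0 and x + 1 < W) else top
            # Component-wise median prediction of the current vector.
            pred = np.median(np.stack([left, top, top_right]), axis=0)
            residuals[y, x] = mv_field[y, x] - pred
    return residuals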
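The abstract also describes decoder-side texture synthesis in which every pixel of a non-rigid textured macroblock is compensated individually from the regenerated DMVF. The sketch below shows that per-pixel compensation under a simplifying integer-pel, nearest-sample assumption; names are again hypothetical and not taken from the thesis.

import numpy as np

def synthesize_from_dmvf(reference, dmvf, block_y, block_x, block_size=16):
    # reference : 2-D array, previously decoded reference frame (luma).
    # dmvf      : array of shape (block_size, block_size, 2) with the decoded
    #             (dy, dx) motion vector for every pixel of the macroblock.
    # Returns the synthesized macroblock of shape (block_size, block_size).
    h, w = reference.shape
    block = np.empty((block_size, block_size), dtype=reference.dtype)
    for i in range(block_size):
        for j in range(block_size):
            dy, dx = dmvf[i, j]
            # Clip to the frame borders so every vector stays addressable.
            ry = int(np.clip(block_y + i + dy, 0, h - 1))
            rx = int(np.clip(block_x + j + dx, 0, w - 1))
            block[i, j] = reference[ry, rx]
    return block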

