dc.contributor.advisor Kashyap, Navin dc.contributor.author Shivkumar, K M dc.date.accessioned 2020-12-07T10:50:07Z dc.date.available 2020-12-07T10:50:07Z dc.date.submitted 2018 dc.identifier.uri https://etd.iisc.ac.in/handle/2005/4719 dc.description.abstract Let A be a finite alphabet and A be the set of all finite sequences of the elements of A. A word is any en_US member of A . A prefix code X is a set of words satisfying the prefix property, i.e., no word in the set is a prefix of any other word in the set. If X is defined as the collection of all concatenations of the words of X, then it can be seen that each of its elements can be expressed as a concatenation of the words of X in a unique manner. Any subset of A possessing this property is called a uniquely decodable code and the prefix codes constitute an important subclass of uniquely decodable codes. In our work, we look into the following questions involving prefix codes: i) We first study a parameter associated with prefix codes. For a discrete source with source distribution P, the problem of constructing a prefix code over the alphabet A with the minimum expected length is one of the earliest problems addressed in information theory. Let LD(P) (D = jAj) denote the minimum expected length of a prefix code for this source. This LD(P) is the parameter of our interest and can be seen as a function—call it the minimum expected length function LD—over the set Pn of all probability mass functions (PMF) of the form (p1; p2; : : : ; pn). It is well known that LD attains its maximum value at the uniform distribution Un = (1=n; 1=n; : : : ; 1=n). However, a characterization of all the PMFs at which this function attains a maximum value has not been carried out before, which we do in this work. This characterization also suggests the following problem: do the restrictions of LD over certain subsets of Pn attain maximum values in their respective domains? If so, what are the PMFs at which these maximum values are attained? We give a partial solution to this problem for the binary case D = 2. ii) We introduce the problem of finding a minimum expected length binary prefix code (hereafter known as an optimal code) among the prefix codes that satisfy the following constraint: all the possible concatenations of the codewords must satisfy the (d; k) runlength-limited (RLL) constraint, i.e., the number of zeros between any two successive ones in them is at least d and the length of any run of consecutive zeros is at most k. For certain (d; k) pairs, we show that this problem can be reduced to a well-studied problem of finding a prefix code with the minimum expected cost when each letter of the alphabet has a non-negative cost associated with it. Also, for these (d; k) pairs, we examine if the optimal codes satisfy a certain maximality property defined with respect to the prefix condition and the RLL constraint. iii) We then study a property of prefix codes: of it being synchronous or not. A prefix code is said to be synchronous if there exists a word x 2 A such that for all w 2 A , we have wx 2 X . Capocelli et al. (1988) have given an algorithm to determine if a given prefix code is synchronous or not, which has subsequently been improved. In our work, we devise an algorithm based on the notion of dangling suffixes, similar to the classical Sardinas-Patterson test for determining whether or not a given code is uniquely decodable. We show that our algorithm has a much better worst-case performance when compared to that 2 of the improved version of the algorithm of Capocelli et al. In this process, we also slightly improve upon the known necessary and sufficient condition for a prefix code to be synchronous. iv) Finally we look into a class of prefix codes called the bifix codes. A bifix code is a prefix code in which no codeword is a suffix of another codeword. For a finite sequence of non-decreasing natural numbers L = (l1; l2; : : : ; ln), there is no known efficient algorithm to determine the existence of a bifix code whose sequence of codeword lengths is the same as L (henceforth referred to as a bifix code for L). For a finite sequence L taking only two distinct integer values (called a two-level sequence), we show that the problem of deciding the existence of a bifix code for L can be converted to a problem of finding a particular subset of vertices from certain graphs derived from de Bruijn graphs. We then conjecture an efficient way of finding these subsets. Ahlswede’s conjecture (1996) is another problem which has led to a lot of work in the field of bifix codes. It states that if a sequence L has a Kraft sum ( P i 2􀀀li ) less than or equal to 3=4, then there exists a binary bifix code for L. This conjecture has been proved when L is a two-level sequence. We give an alternate proof of this by pointing out a new general way of constructing a bifix code for a two-level sequence L with Kraft sum less than or equal to 3=4. dc.language.iso en_US en_US dc.relation.ispartofseries ;G29592 dc.rights I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part en_US of this thesis or dissertation dc.subject uniquely decodable code en_US dc.subject bifix code en_US dc.subject probability mass functions en_US dc.title On Some Questions Involving Prefix Codes en_US dc.type Thesis en_US dc.degree.name PhD en_US dc.degree.level Doctoral en_US dc.degree.grantor Indian Institute of Science en_US dc.degree.discipline Engineering en_US
﻿