Erasure Codes for Distributed Storage: Tight Bounds and Matching Constructions

Balaji, S B

dc.contributor.advisor	Vijay Kumar, P
dc.contributor.author	Balaji, S B
dc.date.accessioned	2021-09-22T09:46:09Z
dc.date.available	2021-09-22T09:46:09Z
dc.date.submitted	2018
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/5330
dc.description.abstract	The reliable storage of Big Data across a spatially distributed network of nodes, calls for erasure-correcting codes that in addition to protecting against data loss, can also efficiently handle node repair. The need for node repair could arise on account of device failure, need for a maintenance reboot, or simply because the node is busy serving other demands. An important consideration here is the rate of the code, which is the ratio of the number of data symbols to the total amount of storage needed to reliably store these data symbols. In response, coding theorists have come up with two new classes of codes, known respectively as regenerating codes and Locally Recoverable Codes (LRC). While the focus of the thesis is on LRC, there are also contributions to the theory of regenerating codes. Contributions to LRC: A LRC is quite simply, a code where a given code symbol can be recovered by contacting at most r other code symbols, where the parameter r is much smaller than the dimension k of the code. A LRC with sequential recovery, is a code that can recover from an arbitrary set of trerasures in t steps in a sequential fashion. Each step recovers an erased symbol and makes use of at most r other code symbols comprising of unerased symbols as well as previously recovered symbols. In this thesis, a tight upper bound on the rate of LRC with sequential recovery is provided, for any value of the number t of erasures and any value of the locality parameter r ≥ 3. This bound proves an earlier conjecture due to Song, Cai and Yuen. While the bound is valid irrespective of the field over which the code is defined, a matching construction of binary codes that achieve the upper bound on rate is also presented. Contributions to Regenerating Codes: Regenerating codes aim to minimize the amount of data download needed to repair a failed node. Regenerating codes are linear codes that operate over a vector alphabet, i.e., each code symbol in a regenerating code is a vector of α symbols drawn from a field F. An important open question relates to the minimum possible value of α for a given storage overhead. Here we present tight lower bounds on α for the case when the codes belong to a certain class of codes called MSR codes as well as have the property of optimal access, i.e., symbols are accessed and transmitted as such without any computation by helper node for repair of a failed node. Contribution to availability Codes: A code in which each code symbol can be recovered in t different ways using respectively t pairwise disjoint set of code symbols with each set of size at most r is called a code with t-availability. The contributions of the thesis in the direction of t-availability codes include improved upper bounds on the minimum distance dmin of this class of codes, both with and without a constraint on the size q of the code-symbol alphabet. An improved upper bound on code rate R is also provided for a subclass of t-availability codes, termed as codes with strict availability. Among the class of t-availability codes, codes with strict availability typically have high rate. A complete characterization of optimal tradeoff between rate and fractional minimum distance for a special class of t-availability codes is also provided. There are additional results which are not mentioned above including results relating to a class of codes called maximum recoverable codes.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	;G29472
dc.rights	I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation	en_US
dc.subject	Big Data	en_US
dc.subject	erasure-correcting codes	en_US
dc.subject	node repair	en_US
dc.subject	Locally Recoverable Codes	en_US
dc.subject	Regenerating codes	en_US
dc.subject	code with t-availability	en_US
dc.subject.classification	Research Subject Categories::TECHNOLOGY::Information technology	en_US
dc.title	Erasure Codes for Distributed Storage: Tight Bounds and Matching Constructions	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.grantor	Indian Institute of Science	en_US
dc.degree.discipline	Engineering	en_US

Files in this item

Name:: G29472.pdf
Size:: 4.229Mb
Format:: PDF
Description:: Thesis full text

View/Open

This item appears in the following Collection(s)

Electrical Communication Engineering (ECE) [461]

Show simple item record