Erasure Codes for Distributed Storage: Tight Bounds and Matching Constructions
Abstract
The reliable storage of Big Data across a spatially distributed network of nodes, calls for erasure-correcting codes that in addition to protecting against data loss, can also efficiently handle node repair. The need for node repair could arise on account of device failure, need for a maintenance reboot, or simply because the node is busy serving other demands. An important consideration here is the rate of the code, which is the ratio of the number of data symbols to the total amount of storage needed to reliably store these data symbols.
In response, coding theorists have come up with two new classes of codes, known respectively as regenerating codes and Locally Recoverable Codes (LRC). While the focus of the thesis is on LRC, there are also contributions to the theory of regenerating codes.
Contributions to LRC: A LRC is quite simply, a code where a given code symbol can be recovered by contacting at most r other code symbols, where the parameter r is much smaller than the dimension k of the code. A LRC with sequential recovery, is a code that can recover from an arbitrary set of trerasures in t steps in a sequential fashion. Each step recovers an erased symbol and makes use of at most r other code symbols comprising of unerased symbols as well as previously recovered symbols. In this thesis, a tight upper bound on the rate of LRC with sequential recovery is provided, for any value of the number t of erasures and any value of the locality parameter r ≥ 3. This bound proves an earlier conjecture due to Song, Cai and Yuen. While the bound is valid irrespective of the field over which the code is defined, a matching construction of binary codes that achieve the upper bound on rate is also presented.
Contributions to Regenerating Codes: Regenerating codes aim to minimize the amount of data download needed to repair a failed node. Regenerating codes are linear codes that operate over a vector alphabet, i.e., each code symbol in a regenerating code is a vector of α
symbols drawn from a field F. An important open question relates to the minimum possible value of α for a given storage overhead. Here we present tight lower bounds on α for the case when the codes belong to a certain class of codes called MSR codes as well as have the property of optimal access, i.e., symbols are accessed and transmitted as such without any computation by helper node for repair of a failed node.
Contribution to availability Codes: A code in which each code symbol can be recovered in t different ways using respectively t pairwise disjoint set of code symbols with each set of size at most r is called a code with t-availability. The contributions of the thesis in the direction of t-availability codes include improved upper bounds on the minimum distance dmin of this class of codes, both with and without a constraint on the size q of the code-symbol alphabet. An improved upper bound on code rate R is also provided for a subclass of t-availability codes, termed as codes with strict availability. Among the class of t-availability codes, codes with strict availability typically have high rate. A complete characterization of optimal tradeoff between rate and fractional minimum distance for a special class of t-availability codes is also provided.
There are additional results which are not mentioned above including results relating to a class of codes called maximum recoverable codes.