Codes With Combined Locality and Regeneration Having Optimal Minimum Distance and Linear Field Size
Abstract
This thesis presents error correcting codes in the setting of distributed storage. The codes presented in this thesis are optimal with respect to the minimum distance bound and have a finite field size which grows linearly with the block length of the code.
Setting of Distributed Storage: In the current digital era, a very large amount of data is generated that needs to be reliably stored. This data is stored in distributed fashion, in a distributed storage system (DSS). To guarantee reliability, these DSSs need to speedily repair any node which is unavailable for use or has failed. In the past, triple replication was the standard used for ensuring reliability in a DSS. More recently, given the excessive overhead associated with triple replication, the use of erasure coding has become increasingly more frequent. In an $[n,k]$ erasure code, the file is first split into $k$ packets and $(n-k)$ redundant parity packets are then added, to form the erasure code consisting of $n$ packets or chunks. These $(n-k)$ parity packets are computed using a linear combination of the $k$ information packets. Erasure codes are excellent when it comes to data recovery but are inefficient when it comes to repairing a failed node. In a DSS the failure of a single node is the most common failure (98\% of the time), thus the repair of a failed node must be handled efficiently both to reduce the amount of traffic over the network associated with node repair as well as the number of helper nodes contacted during node repair. To overcome this hurdle, two new classes of erasure codes namely Regenerating Codes (RGC) and Locally Recoverable Codes (LRC) were introduced. RGCs aim to minimize the repair bandwidth (i.e., the amount of data needed for repairing the failed node) while LRCs aim to minimize the repair degree (i.e., the number of surviving or helper nodes contacted for repairing the failed node).
Regenerating Codes
In an RGC, the file of size $B$ over the finite field ${{F}}_q$ is encoded into $n\alpha$ symbols and stored across $n$ nodes where each node comprises of $\alpha$ symbols. RGCs satisfy two properties, the data-collection property and the node-repair property. The data-collection property requires the DSS to recover the original data corresponding to the file of size $B$, by contacting any $k$ out of $n$ nodes and downloading $\alpha$ symbols from each contacted node. The node-repair property requires the DSS to repair a failed node by contacting any $d$ nodes from the surviving $(n-1)$ nodes and downloading $\beta \leq \alpha$ symbols from each contacted node to reconstruct the failed node of size $\alpha$ while ensuring the data collection and node repair properties continue to hold. Based on the type of node repair, RGCs can be further sub divided into two categories, functional repair RGC (FR-RGC) and exact repair RGC (ER-RGC). In ER-RGC, the contents of the repaired node are the same as that of the failed node, while in FR-RGC, the contents of the repaired node need not be same as the failed node, the only condition being that the data collection and node repair properties continue to hold. In this thesis, we will restrict attention to the case of ER-RGC. For an $((n,k= ,d),(\alpha,\beta),B)$ RGC there exists a tradeoff between the sub-packetization level $\alpha$ of the RGC and the repair bandwidth $d\beta$ known as the storage repair bandwidth tradeoff. The two extreme points in the storage bandwidth tradeoff curve are of special importance as they correspond to minimum possible storage overhead and minimum possible repair bandwidth, respectively. The codes which minimize $\alp= ha$ are known as minimum storage regenerating (MSR) codes and the RGCs which minimize $d\beta$ are known as minimum bandwidth regenerating (MBR) codes. MSR codes can be shown to be example of maximum distance separable (MDS) codes having a vector code-symbol alpha= bet.
Locally Recoverable Codes
In this thesis we restrict our attention to Locally Recoverable Codes (LRC) over a finite field $\mathbb{F}_q$. Let $[n,k,d]$ denote a linear code having block length $n$ dimension $k$ and minimum distance $d$. In the case of an LRC, there is an additional parameter $r$, termed as the locality parameter $r$ and which indicates that an erased code symbol can be reconstructed by contacting a set of $r$ un-erased code symbols and $r$ is typically much smaller than the dimension $k$ of the code. In the context of distributed storage, one associates a data file with a collection of $k$ symbols over $\mathbb{F}_q$. To this set, an additional $(n-k)$ parity symbols are added and the resulting set of $n$ code symbols stored over $n$ nodes or storage units.
The principle behind an LRC is that the $(r+1)$ code symbols corresponding to an erased code symbol along with the $r$ other code sy= mbols that are used to recover from erasure, form a nontrivial code on their own. Thus one may view each local code as the code that is obtained by restricting the parent code to the subset of $(r+1)$ code symbols involved in recovery from erasure.
LRCs can be divided into two sub classes, information-symbol locality in which only the information nodes have locality $r$ and the more general case of all-symbol locality, in which any failed node has locality $r$.
Locally Regenerating Code: Given that RGCs minimize repair bandwidth and an LRC minimizes the repair degree, a natural question that arises is whether it is possible to construct an error correcting code that minimizes both repair bandwidth and repair degree. Locally Regenerating Codes (LRGCs) are precisely such a class of codes. In an LRGC, the local codes are themselves RGCs.
Contributions: The contributions of the thesis are two constructions of an LRGC, both of which are optimal in the sense of having largest possible minimum distance:
• an explicit construction of an LRGC in which the local codes are MBR codes derived using the well-known product-matrix construction. Interestingly, it turns out that the construction also involves an LRC, one that is obtained by employing the Tamo-Barg (TB) construction of an all-symbol LRC.
• an explicit construction of an LRGC for a restricted range of parameters in which the local codes are MSR codes corresponding to the MSR Clay code. Here again, the construction makes use of the TB construction of an all-symbol locality LRC. The resulting LRGC is optimal in two respects as it achieves with equality, upper bounds on both minimum distance and code rate.
We note that while existing optimal LRGC constructions in the literature require field size that is exponential in the block length $n$, the code families presented in this thesis, require only linear field size.