Coding Schemes For Distributed Subspace Computation, Distributed Storage And Local Correctability

Vadlamani, Lalitha

Date

2017-07-24

Abstract

In this thesis, three problems are considered and new coding schemes are devised for each of them. The first concerns distributed function computation, the second coding for distributed storage, and the third locally correctable codes. A common theme of the first two problems is distributed computation.

The first problem is motivated by the distributed function computation problem of Körner and Marton, where the goal is to compute the XOR of two binary sources at the receiver. For that problem, linear encoders have been shown to give better sum rates than the usual Slepian-Wolf scheme for some source distributions. We generalize this setting to more than two sources, with a receiver interested in computing multiple linear combinations of the sources. Consider m random variables, each taking values in a finite field, with a given joint probability distribution. The receiver is interested in the lossless computation of s linear combinations of the m random variables. By regarding the set of all linear combinations of the m random variables as a vector space V, this problem can be interpreted as a subspace-computation problem. For this problem, we develop three increasingly refined approaches, all based on linear encoders. The first two, termed the common code approach and the selected subspace approach, use a common matrix to encode all the sources. In the common code approach, the desired subspace W is computed at the receiver, whereas in the selected subspace approach a possibly larger subspace U containing W is computed. The subspace U that gives the minimum sum rate is identified from a decomposition of V into a chain of subspaces, which is determined by the joint probability distribution of the m random variables and a notion of normalized measure of entropy. The third approach is a nested code approach, in which the encoding matrices are nested and the same subspace U identified in the selected subspace approach is computed. We characterize the sum rates under all three approaches. The sum rate under the nested code approach is no larger than that under either the selected subspace approach or the Slepian-Wolf approach. For a large class of joint distributions and subspaces W, the nested code scheme is shown to improve upon the Slepian-Wolf scheme. Additionally, a class of source distributions and subspaces is identified for which the nested code approach is sum-rate optimal.

In the second problem, we consider a distributed storage network in which data is stored across failure-prone nodes. The goal is to store data reliably and efficiently: for a required level of reliability, it is of interest both to minimize storage overhead and to perform node repair efficiently. Conventionally, replication and maximum distance separable (MDS) codes are employed in such systems. Replication is very efficient in terms of node repair, but its storage overhead is high. MDS codes have low storage overhead, but even the repair of a single failed node requires contacting a large number of nodes and downloading all their data. We consider two recently proposed coding solutions that enable efficient repair of a single failed node. The first, regenerating codes, seeks to minimize the amount of data downloaded for node repair, while the second, codes with locality, seeks to minimize the number of helper nodes accessed. We extend these results in two directions. In the first, we introduce codes with locality in which the local codes have minimum distance greater than 2 and hence can recover a code symbol locally even in the presence of multiple erasures. These codes are termed codes with local erasure correction. We say that a code has information locality if there exists a set of message symbols, each of which is covered by local codes. A code is said to have all-symbol locality if all the code symbols are covered by local codes. An upper bound on the minimum distance of codes with information locality is presented, and codes that are optimal with respect to this bound are constructed. We also make a connection between codes with local erasure correction and concatenated codes. The second direction seeks to build codes that combine the advantages of codes with locality and regenerating codes. These codes, termed here codes with local regeneration, are codes with locality over a vector alphabet in which the local codes are themselves regenerating codes. There are two well-known classes of regenerating codes: minimum storage regenerating (MSR) codes and minimum bandwidth regenerating (MBR) codes. We derive two upper bounds on the minimum distance of vector-alphabet codes with locality, one for the case when the local codes are MSR codes and the other for the case when the local codes are MBR codes. We also provide several optimal constructions of both classes of codes, which achieve their respective minimum distance bounds with equality.

The third problem deals with locally correctable codes. A block code of length n is said to be locally correctable if there exists a randomized algorithm that can recover any one coordinate of the codeword by querying at most r coordinates, even in the presence of some fraction of errors. We study the local correctability of linear codes whose duals contain 4-designs. We also derive a bound relating r and the fraction of errors that can be tolerated when each instance of the randomized algorithm is t-error correcting rather than a simple parity computation.
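
To make the definition of local correctability concrete, the following Python sketch shows a local corrector based on simple parity computation with r = 2 queries. It uses the binary Hadamard code as an assumed textbook example; this is not one of the codes or algorithms from the thesis, and all function names are hypothetical. Coordinate x of a codeword holds the inner product of the message with x over GF(2), so by linearity the XOR of coordinates y and x XOR y equals coordinate x for every y.

```python
import random


def hadamard_encode(message_bits):
    """Hadamard codeword of length 2**k: coordinate x holds <m, x> mod 2."""
    k = len(message_bits)
    m = sum(bit << i for i, bit in enumerate(message_bits))
    return [bin(m & x).count("1") % 2 for x in range(2 ** k)]


def local_correct(word, x, y):
    """One instance of the corrector: a parity of the two queried
    coordinates y and x XOR y (coordinate x is read only if y is 0 or x)."""
    return word[y] ^ word[x ^ y]


def local_correct_random(word, x, rng, trials=51):
    """Randomized corrector: majority vote over independent random y.
    'trials' is an illustrative choice, assumed odd to avoid ties."""
    votes = sum(local_correct(word, x, rng.randrange(len(word)))
                for _ in range(trials))
    return int(2 * votes > trials)


if __name__ == "__main__":
    codeword = hadamard_encode([1, 0, 1, 1])
    received = list(codeword)
    received[5] ^= 1  # corrupt a single coordinate
    rng = random.Random(0)
    print(local_correct_random(received, 5, rng), codeword[5])
```

A single instance answers incorrectly only when exactly one of its two queried coordinates is corrupted. If a fraction δ of the coordinates is in error, a union bound over the two queries shows that each instance is correct with probability at least 1 - 2δ, which is why the majority vote helps for small δ.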

URI

https://etd.iisc.ac.in/handle/2005/2646

Collections

Electrical Communication Engineering (ECE)