Efficient Constructions of Streaming Codes
Abstract
Streaming codes are a class of erasure codes that operate on a stream of packets and enable recovery of dropped or erased packets under a decoding-delay constraint. The primary focus of this thesis is on providing constructions and performance bounds for streaming codes in some settings of practical interest. A secondary focus of this thesis is on coded distributed computation. In the streaming code literature, a sliding window (SW) channel model is often adopted, under which, within any SW of fixed time duration, the channel is permitted to introduce either a single erasure burst or, else, a set of arbitrary erasures. Erasure patterns conforming to this constraint are termed admissible erasure patterns. A streaming code operating on this channel is required to recover from all admissible erasure patterns and, furthermore, do so under a decoding delay constraint. Prior rate-optimal constructions of such streaming codes were mostly based on the diagonal embedding (DE) of codewords of a scalar block code within the packet stream and required a field size that was quadratic in the delay parameter. This thesis introduces a variant of DE called staggered diagonal embedding (SDE), in which codewords of scalar block code are embedded diagonally with carefully chosen gaps in the packet stream. The SDE approach helps to reduce the impact of burst erasures on the embedded scalar block code and, thereby, often permits the construction of rate-optimal streaming codes having linear field size. The maximum possible rate of streaming codes employing a generalization of SDE and that are based on the simple-to-implement class of MDS codes is also determined. With respect to the DE framework, a novel rate-optimal streaming code construction is presented, that covers all possible parameters and where the field size, though still quadratic, is the smallest among all currently-known explicit and general rate-optimal constructions. In a different setting, the thesis examines streaming codes where the SW channel model permits only arbitrary erasures to take place within each SW, and where this time, the goal is the minimization of the average decoding delay. Borrowing from principles of codes for distributed storage, the notion of locally recoverable streaming codes is introduced, which not only operates under the usual decoding delay constraint, it guarantees, in addition, a significantly smaller decoding delay in the more frequently-occurring case of single-packet erasure. The maximum possible rate of locally recoverable streaming codes is determined in this thesis by presenting rate-optimal constructions for all possible parameters. With the same general aim of minimizing average decoding delay, but in a slightly different direction, streaming codes operating over a symbol-erasure channel and that are optimal with respect to a metric called information-debt are investigated. A single path from source to destination will be unable to guarantee reliable erasure recovery in the presence of events such as deep fades. With this in mind, the thesis deals with multipath streaming codes, i.e., streaming codes designed for the setting when there are multiple source-destination paths, as is the case with, for example, 5G dual connectivity. Leveraging the techniques developed for single-path streaming codes, constructions and rate-bounds are derived here for the multipath setting.
The thesis also contains results on coded distributed computation. With the amount of data being generated increasing rapidly, large-scale distributed computing has become very relevant. In this thesis, two different coded distributed computation settings are investigated, namely coded MapReduce and secure distributed matrix multiplication. Coded MapReduce is a technique that trades an increase in computation for a reduced communication load, to achieve a speedup in settings where communication is the bottleneck. The placement-delivery-array is a structure initially defined in the context of coded caching to develop schemes with smaller sub-packetization. Motivated by this, a method is presented to construct coded MapReduce schemes, requiring a small number of subfiles, based on a slightly redefined version of placement-delivery-array. For the coded MapReduce framework where each Reduce function is computed by more than one node, an optimal coded shuffling scheme based entirely on XOR operations is presented, whereas the previously known scheme required operations over a larger finite field. The problem of constructing polynomial codes for secure distributed matrix multiplication can be reduced to a combinatorial problem of filling up the entries of a table subject to certain restrictions. Such tables are called degree tables, and degree tables outperforming the best-known degree tables for certain parameter ranges are constructed.