dc.description.abstract | Error correcting codes (ECCs) are essential to transmission and data storage sys-tems to protect the information from errors introduced by noisy communication channels. There are two main classes of ECCs, namely algebraic and iterative ECCs. While iterative ECCs like low-density parity-check (LDPC) codes provide improved performance in the waterfall region albeit exhibiting flooring effect for not so well-designed codes, algebraic ECCs like Bose–Chaudhuri–Hocquenghem (BCH) and Reed Solomon (RS) codes provide guaranteed error correction capability irrespective of the waterfall or error floor regions.
Due to recent advancements in higher-dimensional data storage technologies like shingled and 2-D magnetic recording (TDMR), 3-DNAND flash memories, and holographic memories, native 2-Dsignal processing and coding techniques are re-quired to overcome inter-symbol interference (ISI) and noise leading to 2-Dburst and random errors. With high data densities beyond 2 Tb/in2 in practical TDMR channels, reliable information storage and retrieval require highly efficient ECCs.
The primary motivation of this dissertation is to design efficient hardware architectures for error correcting codes pertaining to 1-Dand 2-Dstorage channels. The focus topics are as follows:
(i) First, we designed a high-throughput 1-DLDPC decoder using layered and non-layered min-sum algorithm based on non-uniform quantization on a field programmable gate array (FPGA) kit. Unlike the standard state-of-the-art uniform quantization used in virtually all decoder circuits, our non-uniform quantization technique achieves a slight performance improvement in the signal-to-noise ratio (SNR) using the same bit budget as the uniform case. Using 1 bit lesser than uniform quantization, it yields area savings for the block RAMs used for storing intermediate check node and variable node messages.
(ii) We proposed efficient encoding and decoding hardware architectures for (n,k), t-error correcting BCH product codes in the frequency domain. Using the properties of conjugate classes over a finite field, we reduced the algorithmic complexity of the encoder, leading to a significant reduction in the hardware complexity.
v
vi
A low-latency (2t + 2) decoder for the above encoder is also designed. For a particular case of n = 15 and t = 2, the architectures were implemented on a FPGA kit, giving high throughputs of 22.5 Gbps and 5.6 Gbps at 100 MHz for the encoder and decoder respectively.
(iii) We proposed fast and efficient hardware architectures for a 2-D BCH code of size n × n, with a quasi-cyclic burst error correction capability of t × t, in the frequency domain for data storage applications. A fully parallel encoder with the ability to produce an output every clock cycle was designed. Using conjugate class properties of finite fields, the algorithmic complexity of the encoder was significantly reduced, leading to a reduction in the number of gates by about 94% compared to the brute force implementation per 2-Dinverse discrete finite field Fourier transform (IDFFFT) point for a 15 × 15, t = 2, 2-DBCH code. We also designed a pipelined, low-latency decoder for the above encoder. The algorithmic complexities of various pipeline stages of the decoder were reduced significantly using finite field properties, reducing the space complexity of the entire decoder. For a particular case of n = 15 and t = 2, the architectures were implemented targeting a Kintex 7 KC-705 FPGA kit, giving high throughputs of 22.5 Gbps and 5.6 Gbps at 100 MHz for the encoder and decoder, respectively.
(iv) We developed an efficient design architecture for finding the roots of a bi-variate polynomial over GF(q) by extending the Chien search procedure to two-dimensions. The complexity of the Chien search was reduced to an order of the number of conjugacy classes over GF(qλ), leading to a significant reduction in the computational complexity. We provided an efficient design architecture for our algorithm towards a circuit realization, useful for decoding of 2-Dalgebraic ECCs.
v) Native 2-DLDPC codes provide 2-Dburst erasure correction capability and have promising applications in TDMR technology. Though carefully constructed rastered 1-DLDPC codes can provide 2-Dburst erasure correction, they are not as efficient as 2-Dnative codes constructed for handling 2-Dspan of burst erasures. Our contributions are two-fold: (a) We propose a new 2-DLDPC code with girth greater than 4 by generating a parity check tensor through stacking permutation tensors of size p×p×p along the i,j,k axes. The permutations are achieved through circular shifts on an identity tensor along different co-ordinate axes in such a way that it provides a burst erasure correction capability of at least p×p. (b) We propose a fast, efficient, and scalable hardware architecture for a parallel 2-DLDPC decoder
based on the proposed code construction for data storage applications. Through efficient indexing of the received messages in a RAM, we propose novel routing mechanisms for messages between the check nodes and variable nodes through a set of two barrel shifters, producing shifts along two axes. Through simulations, we show that the performance of the proposed 2-D LDPC codes match a 1-DQC-LDPC code, with a sharp waterfall drop of 3-4 orders of magnitude over ∼0.3 dB, for random errors over code sizes of ∼32 Kbits or equivalently ∼180×180 2-Darrays. Further, we prove that the proposed native 2-DLDPC codes outperform their 1-Dcounterparts in terms of 2-Dcluster erasure correction ability. For p = 16 and code arrays of size 48 × 48, we implemented the proposed design architecture on a Kintex-7 KC-705 FPGA kit, achieving a significantly high worst case throughput of 12.52 Gbps at a clock frequency of 163 MHz. | en_US |