|dc.description.abstract||This thesis examines hardware algorithms that reduce dynamic power dissipation in video encoder applications. The computational complexity of motion estimation and the data traffic between external memory and the video processing engine are the two main reasons for large power dissipation in video encoders. While motion estimation may consume 50% to 70% of total video encoder power, the power dissipated in external memory such as DDR SDRAM can be on the order of 40% of the total system power. Reducing power dissipation in video encoders is important for improving the battery life of mobile devices such as smartphones and digital camcorders. We propose hardware algorithms that extract only the important features in the video data to reduce the complexity of computation, communication, and storage, thereby reducing average power dissipation. We apply this concept to design hardware algorithms that optimize the matching complexity of motion estimation and the storage and access of reference frames in external memory. In addition, we develop techniques to reduce the searching complexity of motion estimation.
First, we explore a set of adaptive algorithms that reduce the average power dissipated by motion estimation. We propose that, by taking macro-block level features of the video data into account, the average matching complexity of motion estimation (measured as the number of computations) in real-time hardwired video encoders can be reduced significantly compared with traditional hardwired implementations, which are designed to handle the most demanding data sets. Features of the current macro-block, such as pixel variance and Hadamard transform coefficients, are analyzed and used to adapt the matching complexity. The macro-block is partitioned based on these features to obtain sub-block sums, which are used for the matching operations. Thus, simple macro-blocks with few features can be matched with far fewer computations than macro-blocks with complex features, which reduces average power dissipation. Apart from optimizing the matching operation, optimizing the search operation is a powerful way to reduce motion estimation complexity. We propose novel search optimization techniques, including (1) a center-biased search order and (2) skipping unlikely search positions, both applied in the context of real-time hardware implementation. The proposed search optimization techniques take into account, and are compatible with, the reference data access pattern from memory required by the hardware algorithm. We demonstrate that the matching and searching optimization techniques together achieve a nearly 65% reduction in the power dissipated by motion estimation, without any significant degradation in motion estimation quality.
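As a rough software illustration of the two ideas above (feature-adaptive matching on sub-block sums, and a center-biased search order), the following is a minimal sketch, not the hardware algorithm of the thesis. All function names, the variance threshold, the choice between 1 and 4 partitions per side, and the tie-breaking rule in the search order are assumptions made for illustration only.

```python
def choose_parts(block, threshold=100.0):
    """Pick a partition count from pixel variance (threshold is hypothetical)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    # Flat blocks get a single sum; detailed blocks get a 4 x 4 grid of sums.
    return 1 if variance < threshold else 4


def sub_block_sums(block, parts):
    """Split a square block into parts x parts tiles and sum each tile."""
    step = len(block) // parts
    return [sum(block[y][x]
                for y in range(ty * step, (ty + 1) * step)
                for x in range(tx * step, (tx + 1) * step))
            for ty in range(parts) for tx in range(parts)]


def matching_cost(cur_sums, ref_sums):
    """Sum of absolute differences between corresponding sub-block sums."""
    return sum(abs(c - r) for c, r in zip(cur_sums, ref_sums))


def center_biased_order(search_range):
    """Return (dx, dy) offsets in the search window, nearest to (0, 0) first."""
    offsets = [(dx, dy)
               for dy in range(-search_range, search_range + 1)
               for dx in range(-search_range, search_range + 1)]
    # Square rings around the center (Chebyshev distance), ties broken by
    # Manhattan distance; the tie-break is an arbitrary illustrative choice.
    return sorted(offsets, key=lambda o: (max(abs(o[0]), abs(o[1])),
                                          abs(o[0]) + abs(o[1])))
```

In this sketch a flat macro-block is compared using a single sum per candidate position, while a detailed one uses sixteen, which is the sense in which the average number of matching computations drops. Skipping unlikely search positions is not modeled here.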
A key to low power dissipation in video encoders is minimizing the data traffic between external memory devices such as DDR SDRAM and the video processor. External memory power can be as high as 50% of the total power budget in a multimedia system. Apart from the power dissipated in external memory, the amount of data traffic is an important parameter with a significant impact on system cost. Large memory traffic necessitates high-speed external memories, high-speed on-chip interconnect, and more parallel I/Os to increase memory throughput, all of which raise system cost. We explore a lossy, scalar quantization based reference frame compression technique that can significantly reduce the amount of reference data traffic from external memory devices. In this scheme, the quantization is adapted based on the pixel range within each block being compressed. We show that the error introduced by the scalar quantization is bounded and can be represented by a smaller number of bits than the original pixel. The proposed reference frame compression scheme uses this property to minimize the motion compensation related traffic, thereby improving the efficiency of the compression scheme. The scheme maintains a fixed compression ratio, and the size of the quantization error is also kept constant, which enables easy storage and retrieval of reference data. The impact of using a lossy reference on motion estimation quality is negligible. The reduction in DDR traffic reduces DDR power significantly, and the power dissipated by the additional hardware required for reference frame compression is very small compared with the reduction in DDR power. We achieve a 24% reduction in peak DDR bandwidth and a 23% net reduction in average DDR power. For video sequences with larger motion, the bandwidth reduction is even higher (close to 40%) and the power reduction is close to 30%.||en_US