dc.contributor.advisor: Nandy, S K
dc.contributor.author: Nalesh, S
dc.date.accessioned: 2018-11-22T09:19:36Z
dc.date.available: 2018-11-22T09:19:36Z
dc.date.submitted: 2018
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/4164
dc.description.abstract: With supply voltage no longer scaling down at the same rate as transistor feature size, keeping power dissipation at practical levels while maximizing performance is becoming a challenge for future computing systems. Increasing performance per watt for target applications is critical. Heterogeneous computing systems, which consist of General Purpose Processors (GPPs), Graphics Processing Units (GPUs), and application-specific accelerators, can provide improved performance while keeping power dissipation at a realistic level. Application-specific accelerators give the best performance per watt for a given application, but their lack of flexibility makes them unusable when the application is modified even slightly, or for a closely related application. In such scenarios, Coarse Grained Reconfigurable Arrays (CGRAs) are drawing increasing attention because they promise more flexibility than application-specific accelerators and better energy efficiency than GPPs. One key feature of most CGRAs is that they lay out computational datapaths in space, which avoids the hardware complexity associated with general purpose processor pipelines. This makes CGRAs more energy efficient than GPPs. However, existing compilation frameworks for CGRAs aim to maximize performance for a given application kernel while neglecting power dissipation. While the very nature of CGRAs makes these kernels run at lower power than on GPPs, existing techniques do not attempt to achieve the lowest power footprint for these kernels on the CGRA. With power dissipation becoming critical, CGRA compilation techniques should optimize the performance of a given kernel while simultaneously optimizing for power dissipation. Extracting the parallelism inherent in kernels and exposing it efficiently to the CGRA is an effective way to achieve maximum performance at minimum power dissipation.
This thesis presents a CGRA targeted at realizing kernels specified as function compositions. Function composition is defined as applying one function to the results of another to form a new function. A functional style of programming expresses parallelism more effectively than an imperative style and is better suited to kernels targeting CGRAs. The proposed CGRA consists of a set of reconfigurable datapaths called HyperCells, which can be stitched together to form a single datapath of the granularity dictated by the targeted kernel. We call this CGRA a Coarse Grained Composable Reconfigurable Array (CGCRA). We also propose a synthesis methodology for mapping kernels to the CGCRA that meets a given performance target while minimizing power dissipation. A comprehensive throughput and power model for the proposed CGCRA enables accurate estimation of performance and energy during synthesis. An RTL prototype of the proposed CGCRA has been developed and synthesized to a gate-level netlist using Cadence RTL Compiler with a 40 nm LowK (RVT) standard cell library from Faraday Technology. A 5×9 array with 32 HyperCells has an area of 32.27 mm² and can operate at a maximum clock frequency of 275 MHz. This gives a theoretical peak performance of 220 GFLOPS. A few application kernels from the signal processing, machine learning, and HPC domains have been mapped to the CGCRA using the proposed synthesis methodology. Estimated power efficiency for these kernels ranges from 9 to 19 GFLOPS/W, with an average of 13.8 GFLOPS/W. Higher performance is observed for kernels with significant data reuse, with a maximum observed performance of 120 GFLOPS, which is 55% of the theoretical peak. (en_US)
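The abstract's definition of function composition (applying one function to the results of another to form a new function) can be illustrated with a minimal Python sketch. The names `compose`, `scale`, and `offset` are hypothetical and chosen only for illustration; they do not come from the thesis and say nothing about how kernels are actually specified for HyperCells.

```python
from functools import reduce
from typing import Callable


def compose(*funcs: Callable) -> Callable:
    """Compose functions right to left: compose(f, g)(x) == f(g(x)).

    With no arguments, the identity function is returned.
    """
    return reduce(lambda f, g: lambda x: f(g(x)), funcs, lambda x: x)


# Hypothetical kernel stages, used only to show the composition.
def scale(x: float) -> float:
    return 2.0 * x


def offset(x: float) -> float:
    return x + 1.0


# A new function formed by applying offset to the result of scale.
scale_then_offset = compose(offset, scale)
result = scale_then_offset(3.0)  # offset(scale(3.0))
```

Because each stage is a pure function of its input, the composed function describes a fixed dataflow from input to output, which is the property the abstract relies on when it argues that a functional style suits spatial datapaths.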
dc.language.iso: en_US (en_US)
dc.relation.ispartofseries: ;G28729
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. (en_US)
dc.subject: HyperCells (en_US)
dc.subject: Coarse Grained Composable Reconfigurable Array (en_US)
dc.subject: Coarse Grained Reconfigurable Arrays (en_US)
dc.subject.classification: Research Subject Categories::TECHNOLOGY::Electrical engineering, electronics and photonics::Electronics (en_US)
dc.title: Energy Aware Synthesis of Accelerators on a Network of HyperCells (en_US)
dc.type: Thesis (en_US)
dc.degree.name: PhD (en_US)
dc.degree.level: Doctoral (en_US)
dc.degree.grantor: Indian Institute of Science (en_US)
dc.degree.discipline: Engineering (en_US)

