Compiling For Coarse-Grained Reconfigurable Architectures Based On Dataflow Execution Paradigm

Alle, Mythri

dc.contributor.advisor	Nandy, S K
dc.contributor.author	Alle, Mythri
dc.date.accessioned	2015-07-24T07:23:46Z
dc.date.accessioned	2018-07-31T05:09:08Z
dc.date.available	2015-07-24T07:23:46Z
dc.date.available	2018-07-31T05:09:08Z
dc.date.issued	2015-07-24
dc.date.submitted	2012
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/2453
dc.identifier.abstract	https://etd.iisc.ac.in/static/etd/abstracts/3167/G25487-Abs.pdf	en_US
dc.description.abstract	Coarse-Grained Reconfigurable Architectures (CGRAs) can be employed for accelerating computational workloads that demand both flexibility and performance. CGRAs comprise a set of computation elements interconnected using a network, and this interconnection of computation elements is referred to as a reconfigurable fabric. The size of the application that can be accommodated on the reconfigurable fabric is limited by the size of the instruction buffers associated with each compute element. When an application cannot be accommodated entirely, the application is partitioned such that each of these partitions can be executed on the reconfigurable fabric. These partitions are scheduled by an orchestrator. The orchestrator employs a dynamic dataflow execution paradigm, which has inherent support for synchronization and helps in exploiting the parallelism that exists across application partitions. In this thesis, we present a compiler that targets such CGRAs. The compiler is capable of accepting applications specified in the C89 standard. To enable architectural design space exploration, the compiler is designed such that it can be customized for several instances of CGRAs employing the dataflow execution paradigm at the orchestrator. This can be achieved by specifying the appropriate configuration parameters to the compiler. The focus of this thesis is to provide efficient support for various kinds of parallelism while ensuring correctness. The compiler is designed to support fine-grained task-level parallelism that exists across iterations of loops and function calls. Additionally, the compiler can also support pipeline parallelism, where a loop is split into multiple stages that execute in a pipelined manner. The prototype compiler, which targets multiple instances of a CGRA, is demonstrated in this thesis. We used this compiler to target multiple variants of CGRAs employing the dataflow execution paradigm.
We varied the reconfigurable fabric, the orchestration mechanism employed, and the size of the instruction buffers. We also chose applications from two different domains, viz. cryptography and linear algebra. The execution time of the CGRA (the best among all instances) is compared against an Intel quad-core processor. Cryptography applications show a performance improvement ranging from more than one order of magnitude to close to two orders of magnitude. These applications have large amounts of ILP, and our compiler could successfully expose the ILP available in these applications. Further, domain customization also played an important role in achieving good performance. We employed two custom functional units for accelerating cryptography applications, and the compiler could use them efficiently. In linear algebra kernels we observe multiple iterations of the loop executing in parallel, effectively exploiting loop-level parallelism at runtime. In spite of this, we notice close to an order of magnitude performance degradation. This degradation can be attributed to the use of non-pipelined floating point units and the delays involved in accessing memory. Pipeline parallelism was demonstrated using this compiler for FFT and QR factorization. Thus, the compiler is capable of efficiently supporting different kinds of parallelism and can support the complete C89 standard. Further, the compiler can also support different instances of CGRAs employing the dataflow execution paradigm.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	G25487	en_US
dc.subject	Coarse-Grained Reconfigurable Architecture (CGRA)	en_US
dc.subject	Reconfigurable Fabric	en_US
dc.subject	Dataflow Execution	en_US
dc.subject	Compilers (Computer Programs)	en_US
dc.subject	Computer Architecture	en_US
dc.subject	Reconfigurable Architectures	en_US
dc.subject	Coarse-Grained Reconfigurable Architectures (CGRAs)	en_US
dc.subject	Run Time Reconfigurable Platform	en_US
dc.subject	Runtime Reconfigurable Platform	en_US
dc.subject	Runtime Reconfigurable Hardware	en_US
dc.subject	Coarse Grained Computation	en_US
dc.subject.classification	Computer Science	en_US
dc.title	Compiling For Coarse-Grained Reconfigurable Architectures Based On Dataflow Execution Paradigm	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.discipline	Faculty of Engineering	en_US

Files in this item

Name:: G25487.pdf
Size:: 1.222Mb
Format:: PDF

This item appears in the following Collection(s)

Supercomputer Education and Research Centre (SERC) [116]

Related items

A Coarse Grained Reconfigurable Architecture Framework Supporting Macro-Dataflow Execution

An Accelerator for Machine Learning Based Classifiers

Algorithm-Architecture Co-Design for Dense Linear Algebra Computations
