
dc.contributor.advisor: Nandy, S K
dc.contributor.advisor: Raha, S
dc.contributor.advisor: Datta, Amitava
dc.contributor.advisor: Narayan, Ranjani
dc.contributor.author: Ramesh, Chinthala
dc.date.accessioned: 2019-08-30T09:03:30Z
dc.date.available: 2019-08-30T09:03:30Z
dc.date.submitted: 2017
dc.identifier.uri: https://etd.iisc.ac.in/handle/2005/4276
dc.description.abstract: Sparse Basic Linear Algebra Subroutines (Sparse BLAS) is an important library comprising three levels of subroutines: Level 1, Level 2, and Level 3. Level 1 routines compute over a sparse vector and a sparse/dense vector; Level 2 deals with sparse matrix and vector operations; Level 3 deals with sparse matrix and dense matrix operations. On General Purpose Processors (GPPs), these Sparse BLAS routines not only under-utilize hardware resources but also take more compute time than the workload warrants, owing to the poor data locality of sparse vector/matrix storage formats. In the literature, tremendous software effort has been put into improving the performance of these Sparse BLAS routines on GPPs. GPPs are best suited to applications with high data locality, whereas Sparse BLAS routines operate on applications with little data locality; hence GPP performance is poor. Various custom function units (hardware accelerators) proposed in the literature have proved more efficient than the software approaches that tried to accelerate Sparse BLAS subroutines. Although existing hardware accelerators improve Sparse BLAS performance compared to software Sparse BLAS routines, there is still a lot of scope to improve these accelerators. This thesis describes both the existing software and hardware/software co-design (HW/SW co-design) solutions and identifies their limitations. We propose a new sparse data representation called Sawtooth Compressed Row Storage (SCRS) and corresponding SpMV and SpMM algorithms. SCRS-based SpMV and SpMM perform better than existing software solutions, but still cannot reach theoretical peak performance.
The knowledge gained from studying the limitations of these existing solutions, including the proposed SCRS-based SpMV and SpMM, is used to propose new HW/SW co-designs. Software accelerators are limited by the hardware properties of GPPs and GPUs themselves; hence, we propose HW/SW co-designs to accelerate a few basic Sparse BLAS operations (SpVV and SpMV). Our proposed Parallel Sparse BLAS HW/SW co-design achieves near-theoretical peak performance with reasonable hardware resources.
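The abstract contrasts the proposed SCRS format with the standard Compressed Row Storage (CRS) format listed among the subject keywords. The thesis does not reproduce SCRS's details here, so the following is only a minimal, illustrative sketch of the baseline CRS-based SpMV traversal (y = A·x) that such formats aim to improve upon; the function name and example matrix are hypothetical.

```python
def crs_spmv(values, col_idx, row_ptr, x):
    """Multiply a sparse matrix in CRS form by a dense vector x.

    values  -- nonzero entries, row by row
    col_idx -- column index of each nonzero
    row_ptr -- row_ptr[i]:row_ptr[i+1] delimits row i's nonzeros
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Accumulate the dot product of row i's nonzeros with x.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


# Hypothetical 3x3 matrix [[10, 0, 2], [0, 3, 0], [1, 0, 4]] in CRS form:
values = [10.0, 2.0, 3.0, 1.0, 4.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(crs_spmv(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [12.0, 3.0, 5.0]
```

The irregular, indirect access to `x` via `col_idx` is exactly the poor-data-locality pattern the abstract blames for low GPP utilization.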
dc.language.iso: en_US
dc.relation.ispartofseries: G28775
dc.rights: I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
dc.subject: Sparse Matrix Storage Formats
dc.subject: Hardware-Software Codesign Accelerators
dc.subject: Sparse BLAS
dc.subject: Hardware Accelerator
dc.subject: Sawtooth Compressed Row Storage
dc.subject: Sparse Vector Vector Multiplication
dc.subject: Sparse Matrix Matrix Multiplication
dc.subject: Sparse Matrix Vector Multiplication
dc.subject: Compressed Row Storage
dc.subject: Sparse Basic Linear Algebra Subroutines
dc.subject: SpMV Multiplication
dc.subject: SpMM Multiplication
dc.subject.classification: Nano Science and Engineering
dc.title: Hardware-Software Co-Design Accelerators for Sparse BLAS
dc.type: Thesis
dc.degree.name: PhD
dc.degree.level: Doctoral
dc.degree.grantor: Indian Institute of Science
dc.degree.discipline: Engineering

