Compiler Transformations For Improving The Performance Of Software Transactional Memory

Mannarswamy, Sandya S

dc.contributor.advisor	Govindarajan, R
dc.contributor.author	Mannarswamy, Sandya S
dc.date.accessioned	2013-03-27T05:54:37Z
dc.date.accessioned	2018-07-31T04:38:20Z
dc.date.available	2013-03-27T05:54:37Z
dc.date.available	2018-07-31T04:38:20Z
dc.date.issued	2013-03-27
dc.date.submitted	2011
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/1955
dc.identifier.abstract	https://etd.iisc.ac.in/static/etd/abstracts/2533/G24967-Abs.pdf	en_US
dc.description.abstract	Expressing synchronization using traditional lock based primitives has been found to be both error-prone and restrictive. Hence there has been considerable research work to develop scalable and programmer-friendly alternatives to lock-based synchronization. Atomic sections have been proposed as a programming idiom for expressing synchronization at a higher-level of abstraction than locks. One way of supporting atomic sections in software is to rely on an underlying Software Transactional Memory (STM) implementation. While STM offers the promise of being a programming paradigm which is less error-prone and more programmer friendly compared to traditional lock-based synchronization, it also needs to be competitive in performance in order for it to be adopted in mainstream software. Unfortunately STMs do not meet the performance goals and are known to incur excessive performance overheads. Prior work by other researchers and our performance analysis of STM applications show that conflicts and the resulting aborts are a major performance bottleneck for STM applications. Second we find that, supporting fine-grained optimistic concurrency can have significant impact on the cache behavior of applications running on STM and hence can adversely affect STM performance. Our systematic quantitative analysis of the cache behavior of STM applications as well as prior work on qualitative analysis of STM overheads show that cache overheads constitute a major performance bottleneck for STM applications. Hence in this thesis, we focus on addressing these two major STM performance bottlenecks. Current STM implementations are typically application unaware in that they do not analyze the application and use that knowledge to improve the application performance on STM. Closer integration of transactions with the programming languages opens up the possibility of using the compiler to analyze STM applications and using that knowledge to perform application code transformations to improve the application performance on STM automatically and in a manner transparent to the programmer. This motivated us to address the two major STM performance bottlenecks namely poor cache performance and performance penalty due to aborts, by compiler transformations. In order to pinpoint the cache bottlenecks of STM, we perform a detailed experimental evaluation of the cache behavior of STM applications and quantify the impact of the different STM factors on the cache misses experienced by the applications. We propose a set of compiler transformations targeted to address the cache performance bottlenecks identified by our analysis. Next we turn our attention to compiler analysis and transformations targeted at reducing the performance overheads due to transactional aborts, effectively utilizing the compiler’s knowledge of the data access patterns of the application. Since not all applications are designed with optimistic concurrency in mind, real world applications typically contain certain atomic sections which are not amenable to STM’s optimistic concurrency control and hence suffer from excessive transactional abort overheads. We propose two compiler techniques for handling such atomic sections. Another major cause of transactional conflicts leading to unnecessary aborts is the uniform granularity access tracking scheme employed by STM implementations. Using a single uniform access tracking granularity leads to poor lock assignment by STM. We propose techniques which use compiler’s knowledge of an application to improve the application unaware lock assignment made by the STM. Last as transactional abort overheads impact STM performance adversely, we propose a compiler-based approach to reduce the transactional abort overheads by reconciling certain kinds of transactions instead of aborting them and then performing a complete re-execution. We show that our combined set of compiler transformations are effective in improving the performance of a set of STAMP benchmarks by reducing the execution time by 7.48% to 54.82%, aborts by 8.98% to 56.61% and the average D-cache miss latency by up to 33.51%.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	G24967	en_US
dc.subject	Compiler Transformation	en_US
dc.subject	Software Transactional Memory (STM)	en_US
dc.subject.classification	Computer Science	en_US
dc.title	Compiler Transformations For Improving The Performance Of Software Transactional Memory	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.discipline	Faculty of Engineering	en_US

Files in this item

Name:: G24967.pdf
Size:: 1.492Mb
Format:: PDF

View/Open

This item appears in the following Collection(s)

Computer Science and Automation (CSA) [542]

Show simple item record