Integrating Read-Copy-Update Synchronization and Memory Allocation

Prasad, Aravinda

dc.contributor.advisor	Gopinath, K
dc.contributor.advisor	Pandit, Vinayaka D
dc.contributor.author	Prasad, Aravinda
dc.date.accessioned	2021-09-29T09:14:44Z
dc.date.available	2021-09-29T09:14:44Z
dc.date.submitted	2018
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/5359
dc.description.abstract	The evolution of multicore systems with thousands of cores has led to the exploration of non-traditional procrastination-based synchronization techniques such as Read-Copy-Update (RCU). Deferred destruction is the fundamental mechanism underlying such techniques: writers, in order to synchronize with the readers, defer the freeing of objects until all pre-existing readers have completed. This writer-wait time period is referred to as a grace period (GP). The readers, as a consequence, need not explicitly synchronize with the writers, resulting in low-overhead, wait-free read-side synchronization primitives.
We observe that the deferred destruction of objects leads to newer and complex forms of interactions between the synchronization technique and the memory allocator. We study and analyze the impact of such interactions in operating system kernels for enterprise workloads, high-performance computing environments, idle systems, and virtualized environments. We explore different solutions to efficiently handle deferred destructions, where our general solution integrates the synchronization technique with the memory allocator. Our general solution further exploits the interaction between the synchronization technique and the memory allocator to optimize both of them.
In the first part, we analyze the implications of deferred destruction in enterprise environments. We observe that RCU determines when the deferred object is safe to reclaim and when it is actually reclaimed. As a result, the memory reclamation of the deferred objects is completely oblivious to the memory allocator state, leading to poor memory allocator performance. Furthermore, we observe that the deferred objects provide hints about the future that indicate which memory regions are about to be freed. Although useful, these hints are not exploited, as the deferred objects are not "visible" to memory allocators.
We design Prudence, a new dynamic memory allocator that is tightly integrated with RCU to ensure visibility of deferred objects to the memory allocator. Prudence exploits optimizations based on the hints about the future during important state transitions. Our evaluation in the Linux kernel shows that Prudence performs 3.9× to 28× better in micro-benchmarks compared to the SLUB allocator. It also improves the overall performance perceptibly (4%-18%) for a mix of widely used synthetic and application benchmarks.
In the second part, we analyze the implications of deferred destruction in idle and high-performance computing (HPC) environments, where the amount of memory waiting for reclamation in a grace period is negligible due to limited OS kernel activity. The default grace period computation is not only futile but also detrimental, as the CPU cycles consumed to compute a grace period lead to jitter in HPC and frequent CPU wake-ups in idle environments. We design a frugal approach to reduce RCU grace period overhead that reduces the number of grace periods by 68% to 99% and the CPU time consumed by grace periods by 39% to 99% for NAS parallel benchmarks and idle systems.
Finally, we analyze the implications of deferred destruction in a virtualized environment. Preemption of RCU readers can cause multi-second latency spikes and can increase the peak memory footprint inside VMs, which in turn can negate the server consolidation benefits of virtualization. Although preemption of lock holders in VMs has been well studied, the corresponding solutions do not apply to RCU due to its exceedingly lightweight read-side primitives. We present the first evaluation of RCU-reader preemption in a virtualized environment. Our evaluation shows a 50% increase in the peak memory footprint and a 155% increase in fragmentation for a microbenchmark.
We propose Conflux, a confluence of three mutually independent solutions that enhance the RCU synchronization technique, the memory allocator, and the hypervisor to efficiently handle the RCU-reader preemption problem.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	;G29408
dc.rights	I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.	en_US
dc.subject	Read-Copy-Update	en_US
dc.subject	grace period	en_US
dc.subject	memory allocator	en_US
dc.subject	Prudence	en_US
dc.subject.classification	Research Subject Categories::TECHNOLOGY::Information technology::Computer science	en_US
dc.title	Integrating Read-Copy-Update Synchronization and Memory Allocation	en_US
dc.type	Thesis	en_US
dc.degree.name	PhD	en_US
dc.degree.level	Doctoral	en_US
dc.degree.grantor	Indian Institute of Science	en_US
dc.degree.discipline	Engineering	en_US
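
The deferred destruction described in the abstract can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not the thesis's design and not the Linux kernel RCU API: the class `ToyGracePeriod` and its methods `read_lock`, `read_unlock`, `defer_free`, and `pending` are hypothetical names chosen for illustration. A deferred object is released only after every reader that was active at deferral time has finished, which is the grace period for that object.

```python
class ToyGracePeriod:
    """Single-threaded toy model of deferred destruction (hypothetical, not real RCU)."""

    def __init__(self):
        self._next_id = 0      # source of unique reader ids
        self._active = set()   # readers currently inside a read-side section
        self._deferred = []    # (object, readers that pre-existed the deferral)
        self.freed = []        # objects whose grace period has elapsed

    def read_lock(self):
        """Enter a read-side section; readers never wait for writers."""
        rid = self._next_id
        self._next_id += 1
        self._active.add(rid)
        return rid

    def read_unlock(self, rid):
        """Leave a read-side section and reclaim anything that became safe."""
        self._active.discard(rid)
        self._reclaim()

    def defer_free(self, obj):
        """Writer side: defer freeing obj until all pre-existing readers finish."""
        self._deferred.append((obj, set(self._active)))
        self._reclaim()

    def pending(self):
        """Objects still waiting for their grace period to end."""
        return [obj for obj, _ in self._deferred]

    def _reclaim(self):
        still_waiting = []
        for obj, pre_existing in self._deferred:
            if pre_existing & self._active:
                # At least one pre-existing reader is still running.
                still_waiting.append((obj, pre_existing))
            else:
                self.freed.append(obj)
        self._deferred = still_waiting


if __name__ == "__main__":
    rcu = ToyGracePeriod()
    r1 = rcu.read_lock()           # reader that pre-exists the deferral
    rcu.defer_free("old_node")     # cannot be freed yet: r1 may still use it
    r2 = rcu.read_lock()           # later reader; does not extend the grace period
    print("pending:", rcu.pending(), "freed:", rcu.freed)
    rcu.read_unlock(r1)            # grace period for "old_node" ends here
    print("pending:", rcu.pending(), "freed:", rcu.freed)
    rcu.read_unlock(r2)
```

In this model the deferred objects sit in a list that only the synchronization side inspects, which mirrors the abstract's observation that such objects are not visible to the memory allocator until they are actually released.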

Files in this item

Name: G29408.pdf
Size: 4.831Mb
Format: PDF
Description: Thesis full text

This item appears in the following Collection(s)

Computer Science and Automation (CSA) [394]
