A Case for Protecting Huge Pages from the Kernel
Abstract
Modern architectures support multiple size pages to facilitate applications that use large chunks of contiguous memory either for buffer allocation, application specific memory management, in-memory caching or garbage collection. Most general purpose processors support larger page sizes, for e.g. x86 architecture supports 2MB and 1GB pages while PowerPC architecture supports 64KB, 16MB, 16GB pages. Such larger size pages are also known as superpages or huge pages. With the help of huge pages TLB reach can be increased significantly. The Linux kernel can transparently use these huge pages to significantly bring down the cost of TLB translations. With Transparent Huge Pages (THP) support in Linux kernel the end users or the application developers need not make any change to their application.
Memory fragmentation which has been one of the classical problems in computing systems for decades is a key problem for the allocation of huge pages. Ubiquitous huge page support across architectures makes effective fragmentation management even more critical for modern systems. Applications tend to stress system TLB in the absence of huge pages, for virtual to physical address translation, which adversely affects performance/energy characteristics in long running systems. Since most kernel pages tend to be unmovable, fragmentation created due to their misplacement is more problematic and nearly impossible to recover with memory compaction.
In this work, we explore physical memory manager of Linux and the interaction of kernel page placement with fragmentation avoidance and recovery mechanisms. Our analysis reveals that not only a random kernel page layout thwarts the progress of memory compaction; it can actually induce more fragmentation in the system. To address this problem, we propose a new allocator which takes special care for the placement of kernel pages. We propose a new region which represents memory area having kernel as well as user pages. Using this new region we introduce a staged allocator which with change in fragmentation level adapts and optimizes the kernel page placement. Later we introduce Illuminator which with zero overhead outperforms default kernel in terms of huge page allocation success rate and compaction overhead with respect to each huge page. We also show that huge page allocation is not a one dimensional problem but a two fold concern with how the fragmentation recovery mechanism may potentially interfere with the page clustering policy of allocator and worsen the fragmentation.
Our results show that with effective kernel page placements the mixed page block counts reduces upto 70%, which allows our system to allocate 3x-4x huge pages than the default Kernel. Using these additional huge pages we show up to 38% improvement in terms of energy consumed and reduction in execution time up to 39% on standard benchmarks.