|dc.description.abstract||Cache replacement policies can play a pivotal role in the overall performance of a system by preserving data locality and thus limiting the o -chip accesses. In a shared memory system, a cache coherence protocol is necessary to ensure correctness of data computations by maintaining the state of entries in the cache. In this work we attempt to build and investigate the effect of cache replacement policies using the information provided by cache coherence protocol states.
The cache coherence protocol states give us an idea about the state of entry with respect to other cores in the system. State based analysis of SPLASH-2 and PARSEC benchmark suites show that this information hints us towards the locality patterns of cache blocks, which can be used to prioritize the order of replacement of a cache states in a replacement policy. We model ten di erent cache state based replacement policies, three having xed priorities and seven whose priorities vary dynamically over the most recently used state. We compare these policies against the standard replacement policies (LRU, FIFO and Random) in terms of system performance and ease of implementation.
We develop our simulation framework using the Multi2Sim simulator, where we model cache state based replacement policies. We simulate SPLASH-2 and PARSEC benchmark suites over a variety of con gurations, where we vary the number of cores, associatively for each level of cache, private/shared L2 cache. We characterize the programs to find out critical components for performance. For an 8-core system we observe that the best case among these state based replacement policies shows marginal improvements in IPC over the Random and FIFO policies, falling slightly short of LRU.
We design the state based replacement policies using a smaller cache (CSL-cache), which is used to store the state information of the blocks in the main cache. The CSL cache communicates with the controller to provide the replacement entry. The complexity associated with the system is equal to FIFO and is independent of the associatively of the cache.||en_US