Thread Criticality Assisted Replication and Migration for Chip Multiprocessor Caches

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal

2 Scopus Citations

Original language: English
Article number: 7931700
Pages (from-to): 1747-1762
Journal / Publication: IEEE Transactions on Computers
Issue number: 10
Online published: 17 May 2017
Publication status: Published - Oct 2017


Non-Uniform Cache Architecture (NUCA) is a viable solution to the large on-chip wire delay caused by the rapid growth in cache capacity of chip multiprocessors (CMPs). Because the last-level cache (LLC) is partitioned into smaller banks connected by an on-chip network, access latency becomes non-uniform across banks. Prior work has explored many aspects of NUCA design, including block migration, block replication, and block searching. However, all previous NUCA mechanisms are thread-oblivious when multi-threaded applications are deployed on CMP systems. Because of interference on shared resources, threads often progress unevenly, and the lagging threads are more critical to overall performance. In this paper, we propose a novel NUCA design called thread Criticality Assisted Replication and Migration (CARM). CARM uses runtime thread criticality information as hints to adjust block replication and migration in NUCA. Specifically, CARM aims to speed up parallel application execution by prioritizing block replication and migration for critical threads. Full-system experimental results show that CARM reduces the execution time of a set of PARSEC workloads by 13.7% and 6.8% on average compared with traditional D-NUCA and Re-NUCA, respectively. Moreover, CARM consumes much less energy than the other evaluated schemes.
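
The idea of prioritizing replication and migration for critical threads can be illustrated with a toy software model. This is only a sketch and not the CARM hardware design: the abstract does not say how criticality is measured or how requests are arbitrated. The per-thread `progress` score, the per-epoch `budget`, and the names `Request`, `rank_by_criticality`, and `schedule` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A pending block operation (hypothetical model, not from the paper)."""
    thread_id: int
    block: int
    kind: str  # "replicate" or "migrate"


def rank_by_criticality(progress):
    """Order thread ids most-critical first.

    Assumption: a lower progress score means a more lagging, hence more
    critical, thread. Ties are broken by thread id for determinism.
    """
    return sorted(progress, key=lambda tid: (progress[tid], tid))


def schedule(requests, progress, budget):
    """Grant at most `budget` requests, favoring critical threads.

    Python's sort is stable, so requests from the same thread keep
    their arrival order.
    """
    rank = {tid: i for i, tid in enumerate(rank_by_criticality(progress))}
    ranked = sorted(requests, key=lambda r: rank[r.thread_id])
    return ranked[:budget]


# Hypothetical epoch: thread 1 lags the most, thread 0 is furthest ahead.
progress = {0: 0.9, 1: 0.4, 2: 0.7}
requests = [
    Request(0, 0x10, "migrate"),
    Request(1, 0x20, "replicate"),
    Request(2, 0x30, "migrate"),
    Request(1, 0x40, "migrate"),
]
granted = schedule(requests, progress, budget=2)
for r in granted:
    print(r)
```

With a budget of two operations per epoch, both granted requests belong to thread 1, the lagging thread in this made-up input. A thread-oblivious policy would instead grant the first two requests in arrival order.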

Research Area(s)

  • Chip multiprocessor, migration, non-uniform cache, replication, thread criticality