arch-beer

Weekly Reading
 


Aniket, Guru, Chenyu, Richard and Dong will share their internship experiences

Sam is presenting...


Alok Garg, M. Wasiur Rashid, Michael Huang
"Slackened Memory Dependence Enforcement: Combining Opportunistic Forwarding with Decoupled Verification"
ISCA 2006
PDF copy (accessible within GT network only)


An efficient mechanism to track and enforce memory dependences is crucial to an out-of-order microprocessor. The conventional approach of using cross-checked load queue and store queue, while very effective in earlier processor incarnations, suffers from scalability problems in modern high-frequency designs that rely on buffering many in-flight instructions to exploit instruction-level parallelism. In this paper, we make a case for a very different approach to dynamic memory disambiguation. We move away from the conventional exact disambiguation strategy and adopt an opportunistic method: we allow loads and stores to access an L0 cache as they are issued out of program order, hoping that with such a laissez-faire approach, most loads actually obtain the right value. To guarantee correctness, they execute a second time in program order to access the non-speculative L1 cache. A discrepancy between the two executions triggers a replay. Such a design completely eliminates the necessity of real-time violation detection and thus avoids the conventional approach's complexity and the associated scalability issue. We show that even a simplistic design can provide similar performance level achieved with a conventional queue-based approach with optimistically-sized queues. When simple, optional optimizations are applied, the performance level is close to that achieved with ideally-sized queues.
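
As a discussion aid, the two-pass idea in the abstract can be sketched as a toy Python model. This is not the paper's design: the function name `run`, the tuple encoding of instructions, the use of plain dictionaries for the L0 and L1 caches, and the default value 0 for untouched addresses are all assumptions made for illustration. A real replay would squash and re-execute instructions; this sketch only records which loads would trigger one.

```python
def run(program, issue_order, memory=None):
    """Toy model of opportunistic execution plus in-order verification.

    program:     list of ("store", addr, value) or ("load", addr),
                 given in program order (hypothetical encoding).
    issue_order: permutation of instruction indices, i.e. the order in
                 which the out-of-order core happens to issue them.
    Returns (committed_load_values, indices_of_loads_needing_replay).
    """
    l1 = dict(memory or {})  # non-speculative state, updated in program order
    l0 = dict(l1)            # speculative state, updated in issue order
    speculative = {}

    # Pass 1: out-of-order issue. Loads and stores touch L0 with no checks.
    for i in issue_order:
        op = program[i]
        if op[0] == "store":
            l0[op[1]] = op[2]
        else:
            speculative[i] = l0.get(op[1], 0)

    # Pass 2: in-order re-execution against L1. A mismatch marks a replay.
    committed, replays = {}, []
    for i, op in enumerate(program):
        if op[0] == "store":
            l1[op[1]] = op[2]
        else:
            correct = l1.get(op[1], 0)
            if speculative[i] != correct:
                replays.append(i)
            committed[i] = correct
    return committed, replays


program = [("store", 0x10, 1), ("load", 0x10), ("store", 0x10, 2), ("load", 0x10)]
in_order = run(program, [0, 1, 2, 3])
reordered = run(program, [0, 2, 1, 3])  # younger store issues before the first load
```

With in-order issue, neither load needs a replay. In the reordered case the first load picks up the value of the younger store from L0, and the in-order pass against L1 catches the discrepancy without any load-queue or store-queue search.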