[INT] James E. Smith and Andrew R. Pleszkun, "Implementing
Precise Interrupts in Pipelined Processors," IEEE Transactions on
Computers, vol. 37, no. 5, May 1988
The Microarchitecture of the Pentium 4 Processor, external link: http://www.intel.com/technology/itj/q12001/pdf/art_2.pdf
Instruction Scheduling
Tomasulo's algorithm, original paper
Fisher, J. A. 1981. Trace scheduling: A technique for global microcode compaction. IEEE Trans. Comput. C-30, 7, 478-490.
J. R. Allen, Ken Kennedy, Carrie Porterfield, Joe Warren,
"Conversion of control dependence to data dependence," Proceedings of
the 10th ACM SIGACT-SIGPLAN Symposium on Principles of Programming
Languages, 1983
Hyesoon Kim, Onur Mutlu, Jared Stark, and Yale N. Patt,
"Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution"
Proceedings of the 38th International Symposium on Microarchitecture (MICRO), pages 43-54, Barcelona, Spain, November 2005
Cache and Memory
[CAC1] David Kroft, "Lockup-free Instruction Fetch/Prefetch Cache Organization,"
ISCA-8, 1981.
[SC] "Shared Memory Consistency Models: A Tutorial", Adve and
Gharachorloo, [external link:
http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf]
S. Palacharla, R. E. Kessler, "Evaluating stream buffers as a
secondary cache replacement", April 1994 ACM SIGARCH Computer
Architecture News, Volume 22 Issue 2
[PREF] Steven P. Vanderwiel, David J. Lilja, "Data prefetch mechanisms," ACM Computing Surveys, Vol. 32, Issue 2 (June 2000)
[RRER2] Jaekyu Lee, Hyesoon Kim, and Richard Vuduc, "When Prefetching Works, When It Doesn't, and Why",
ACM Transactions on Architecture and Code Optimization (TACO) Vol. 9, No. 1, pp.2:1-2:29, March 2012
Doug Joseph, Dirk Grunwald, "Prefetching using Markov predictors," June 1997 ISCA '97
T.-F. Chen and J.-L. Baer. A performance study of software and hardware data
prefetching schemes. In ISCA-21, pages 223-232, 1994.
"A stateless, content-directed data prefetching mechanism," Robert Cooksey, Stephan Jourdan, Dirk Grunwald, ASPLOS 2002.
Santhosh Srinath, Onur Mutlu, Hyesoon Kim, and Yale N. Patt,
"Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers"
Proceedings of the 13th International Symposium on High-Performance Computer Architecture (HPCA), pages 63-74, Phoenix, AZ,
Sunpyo Hong, Hyesoon Kim, "An Analytical Model for a GPU
Architecture with Memory-level and Thread-level Parallelism Awareness",
Proceedings of the 36th International Symposium on Computer
Architecture (ISCA), Austin, TX, June 2009.
Intel Larrabee
[LRB] Larrabee: a many-core x86 architecture for visual computing,
Larry Seiler, Doug Carmean, Eric Sprangle, Tom Forsyth, Michael Abrash,
Pradeep Dubey, Stephen Junkins, Adam Lake, Jeremy Sugerman, Robert
Cavin, Roger Espasa, Ed Grochowski, Toni Juan, Pat Hanrahan, August
2008, SIGGRAPH '08: SIGGRAPH 2008 papers
Memory schedulers
Onur Mutlu and Thomas Moscibroda, "Parallelism-Aware Batch
Scheduling: Enhancing both Performance and Fairness of Shared DRAM
Systems," Proceedings of the 35th International Symposium on Computer
Architecture (ISCA), pages 63-74, Beijing, China, June 2008.
Power
[pow1] C. Isci and M. Martonosi, "Runtime power monitoring in high-end processors: Methodology and
empirical data," in MICRO 36: Proceedings of the 36th annual IEEE/ACM International Symposium on
Microarchitecture, (Washington, DC, USA), p. 93, IEEE Computer Society, 2003.
[pow2] K. Skadron, M. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan,
"Temperature-aware microarchitecture," in Proceedings of the 30th Annual
International Symposium on Computer Architecture (ISCA), June 2003.
[pow3] Y. Zhang, D. Parikh, K. Sankaranarayanan, K. Skadron, and M. Stan,
"HotLeakage: A temperature-aware model of subthreshold and gate leakage for
architects," tech. rep., University of Virginia, 2003.
Heterogeneous architectures
R. Kumar, K.I. Farkas, N.P. Jouppi, P. Ranganathan, and D.M.
Tullsen, "Single-ISA Heterogeneous Multi-core Architectures: The
Potential for Processor Power Reduction," in Proceedings of the 36th
Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-36),
Dec. 2003
Interconnect
Boris Grot, Joel Hestness, Steve Keckler, and Onur Mutlu,
"Express Cube Topologies for On-Chip Interconnects"
Proceedings of the 15th International Symposium on High-Performance Computer Architecture (HPCA), pages 163-174, Raleigh, NC,