Broadly speaking, the first half of the course will look at systems that already exist, focusing primarily on offerings from Intel and IBM.  The second half will look at the ideas being proposed in the most recent research literature.

In addition to the readings below, you will often need to look things up on your own.  For example, you may encounter a term or a concept in a paper that you do not understand; in that case you will need to research that topic yourself.  I recommend getting a copy of Hennessy and Patterson's "Computer Architecture: A Quantitative Approach" to fill in such gaps.

Modern processors are too complex to grasp in a single session, so we will often discuss a processor over multiple sessions; this means that multiple presenters will need to work together on a topic.  It is up to the presenters to split the presentation among themselves.  For example, the first presenter may spend a class period giving a relatively high-level overview of the processor, and the second presenter may pick one or two particularly interesting aspects of the processor and go over them in gory detail.

Date Readings Presenter
1/17 Introduction to seminar; introduction to modern hardware components (memory system components, branch predictors) Diwan
1/19 Vertical profiling, Hauswirth, Sweeney, Diwan and Hind, OOPSLA 2004 (ppt) Diwan
1/24 Hardware performance monitors and PAPI (pages 6, 12-17, 37-42, and 46-59 from the PAPI document) Tipp Moseley
1/26 HPM continued: DCPI Tipp Moseley
1/31 Pentium 4 article from Hardware Secrets, Pentium 4 article from Microprocessor Report (August 2000), Pentium 4 article from Intel technical journal, and Doug Carmean et al.'s paper and talk Todd
2/2 Pentium 4 continued (focus on the trace cache and hyperthreading) Brian
2/7 Pentium M article from Intel technical journal Joseph
2/9 Pentium M versus Pentium 4 from Tom's Hardware Joseph
2/14 Understanding performance

Each student should write two programs: one that yields the highest IPC (instructions per cycle) they can manage and one that yields the lowest.  Do this on the Pentium 4 using PAPI.  Be prepared to present your programs in class and discuss them.

 
2/16 Optimizing for the Pentium (Chapter 6 of this) Luke
2/21 Optimizing for the Pentium (Chapter 7 of this) Laura
2/23 Optimizing for the Pentium Discussion

Each student should conduct and report on experiments to determine the impact of some of the recommendations in "Optimizing for the Pentium".

 
2/28 AMD K8 (article, article from AnandTech, and slides) Tipp
3/2 IBM POWER4 Amer
3/7 No class  
3/9 POWER5 (article1 and article2 from IBM Journal) Joseph
3/14 Hyperthreading in POWER5 (article from IBM journal) Tipp
3/16 Hassan's thesis defence  
3/21 Hardware trends (Patterson article in CACM and discussion) Laura
3/23 Memory trends (Steven Woo's preprint) Todd
4/4 Virtualization in POWER5 Luke
4/6 Steven Swanson's talk 3:30 p.m. in ECCR 265  
4/11 Niagara Todd
4/13 Transactional memory (Saha et al.) Laura
4/18 Transactional memory (Herlihy and Moss) Luke
4/20 Transactional memory in Azul  
4/25 No class: Amer is out of town  
4/27 No class: Amer is out of town  
5/2    
5/4
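
For the 2/14 exercise, IPC is the number of instructions retired divided by the number of cycles elapsed over the same interval. With PAPI, those two values would typically come from the preset events PAPI_TOT_INS and PAPI_TOT_CYC. The sketch below shows only the arithmetic, not the measurement. The function name `ipc`, the program labels, and the counter values are all hypothetical and chosen for illustration; they are not measurements from any machine.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Return instructions per cycle from two raw counter readings."""
    if cycles <= 0:
        raise ValueError("cycle count must be positive")
    return instructions / cycles


# Hypothetical (instructions, cycles) pairs, made up for illustration only.
# Real values would be read from hardware counters such as
# PAPI_TOT_INS and PAPI_TOT_CYC around the code being measured.
readings = {
    "program_a": (3_000_000, 1_500_000),
    "program_b": (1_000_000, 40_000_000),
}

for name, (ins, cyc) in readings.items():
    print(f"{name}: IPC = {ipc(ins, cyc):.3f}")
```

Both counters should be read around exactly the same region of code, since including setup or I/O in one reading but not the other skews the ratio.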