Architecture and technology tradeoffs in the design of high performance microprocessor-based systems

David Henry Albonesi, University of Massachusetts Amherst


Increasing levels of VLSI integration present new opportunities, and new challenges, for designers of high performance microprocessor-based systems. With more transistors at their disposal, architects are faced with complex decisions regarding processor features, cache hierarchies, and supporting several uniprocessor and multiprocessor target systems. In addition, as the speed gap between microprocessors and board-level technology continues to widen, a robust system-level design becomes a critical element for attaining acceptable performance.

This dissertation describes STATS, a comprehensive, semi-automated, trade-off analysis toolset. STATS overcomes the limitations of previous approaches by including the processor, cache hierarchy, system interconnect, and main memory designs, technology and architectural considerations, and both uniprocessor and multiprocessor analysis, within a single framework. STATS employs a judicious combination of compilation, execution-driven simulation, analytical modeling, and SPICE analysis tools to achieve a reasonable balance of accuracy and analysis time.

STATS is used in three architectural investigations. The first, an in-depth analysis of cache hierarchy alternatives for the Alpha 21064A processor design, includes a comparison of employing one, two, or three levels of hierarchy. A detailed analysis demonstrates the importance of precisely characterizing all aspects of cache hierarchy design, including traffic rates, miss ratios, cycle time, latency, and bandwidth, to avoid incorrect design decisions.

The second explores tradeoffs in the design of a next-generation 8-way superscalar microprocessor-based workstation. Some conclusions are that trading off a smaller L1 Dcache size for more arithmetic units provides the best overall performance, and only marginal performance gains are obtained by using the package pins for an L3 cache rather than a direct main memory connection. Novel mechanisms for multi-porting L1 Dcaches and pipelining large, on-chip L2 caches are shown to achieve up to an 81% performance improvement over conventional methods.

The third investigation concerns the cluster design of CC-NUMA multiprocessors using the 8-way superscalar microprocessor. The results demonstrate that integrating the main memory controller onto the microprocessor die considerably reduces bus utilization and improves multiprocessor performance by as much as 35%. Interleaving alternatives for the distributed main memory are explored, as well as options for managing bus utilization in future cluster designs.
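As a rough illustration of why miss ratios and latencies must be characterized together when comparing one-, two-, or three-level cache hierarchies, the sketch below computes the textbook average memory access time (AMAT). It is not the model used by STATS, which the abstract does not specify; the function name `amat` and every number in it are hypothetical and chosen only for illustration.

```python
def amat(levels, memory_latency):
    """Average memory access time, in cycles, for a cache hierarchy.

    levels: list of (hit_latency_cycles, local_miss_ratio), ordered L1 first.
    memory_latency: cycles to reach main memory after the last cache misses.
    """
    penalty = memory_latency
    # Fold from the last cache level back toward L1: each level costs its
    # own hit latency plus its miss ratio times the cost of going deeper.
    for hit_latency, miss_ratio in reversed(levels):
        penalty = hit_latency + miss_ratio * penalty
    return penalty


# Hypothetical parameters, not taken from the dissertation.
one_level = amat([(1, 0.10)], memory_latency=100)
two_level = amat([(1, 0.10), (10, 0.25)], memory_latency=100)
print(one_level, two_level)
```

With these assumed values the second level lowers the average access time, but a slower or less effective L2 could erase that gain. A simple formula like this also ignores traffic rates, bandwidth, and cycle-time effects, which the abstract identifies as necessary for avoiding incorrect design decisions.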

Subject Area

Electrical engineering|Computer science

Recommended Citation

Albonesi, David Henry, "Architecture and technology tradeoffs in the design of high performance microprocessor-based systems" (1996). Doctoral Dissertations Available from Proquest. AAI9709569.