A hardware/software co-design architecture for thermal, power, and reliability management in chip multiprocessors

Omer Khan, University of Massachusetts Amherst


Today’s designs are being shaped by the challenges of nano-CMOS technologies: increased power density, rising junction temperatures, and a rising rate of errors and device failures that constrain the average rate of power dissipation, and design technologies that limit peak power delivery. This thesis focuses on how to leverage the hardware and software abstraction layers of today’s systems. Several low-level hardware details, such as power, hotspots, and faults, are tightly correlated with interactions within the system, including application and hardware behavior. Conventional approaches to tackling such problems come with additional cost and design complexity, and they are limited by the strict abstraction layers of today’s systems. It is well known that application programs exhibit repetitive and recognizable behavior during the course of their execution. Taking advantage of this time-varying behavior at runtime can enable fine-grained optimizations. This thesis proposes a low-overhead and scalable hardware-based program phase classification scheme, termed Instruction Type Vectors (ITV), which captures the execution frequencies of committed instruction types over profiling intervals and subsequently classifies and detects phases within threads. ITV reveals the computational demands of threads by exposing the instruction type distribution of phases to the system. This thesis proposes several applications of ITV. Based on the past history of the rate of change of temperature, an in-time response to thermal emergencies within cores is proposed. ITV improves the accuracy of thread-level temperature prediction, thus allowing the multi-core to operate at its optimal performance while keeping the cores thermally saturated. To enable power management, a selected set of key hardware structures within cores is proposed to dynamically adapt to the computational demands of the application threads. ITV is proposed to speculatively trade off power and performance at the granularity of phases and, based on this information, selectively power gate structures. Finally, ITV is used to map threads to cores when faults disable or degrade the capability of cores. The system observes the phase behavior and initiates thread relocation to match the computational demands of threads to the capabilities of cores. This allows the system to exploit inter-core redundancy for fault tolerance in a multi-core.
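
The phase classification idea described in the abstract can be illustrated with a small software sketch. This is not the dissertation's hardware design. The instruction-type categories, the Manhattan distance metric, the 0.2 threshold, and the names `build_itv` and `PhaseClassifier` are all assumptions chosen for illustration. The sketch counts committed instruction types over one profiling interval, normalizes the counts into a vector, and either matches the vector to a previously seen phase or records it as a new phase.

```python
from collections import Counter

# Hypothetical instruction-type categories; the abstract does not list them.
INSTRUCTION_TYPES = ("int_alu", "fp", "load", "store", "branch")


def build_itv(committed_types):
    """Return normalized instruction-type frequencies for one profiling interval.

    Types outside INSTRUCTION_TYPES are ignored in this sketch.
    """
    counts = Counter(committed_types)
    total = sum(counts[t] for t in INSTRUCTION_TYPES)
    if total == 0:
        return tuple(0.0 for _ in INSTRUCTION_TYPES)
    return tuple(counts[t] / total for t in INSTRUCTION_TYPES)


def manhattan(a, b):
    """Sum of absolute element-wise differences (an assumed similarity metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


class PhaseClassifier:
    """Assigns a phase ID to each interval vector (illustrative only)."""

    def __init__(self, threshold=0.2):
        self.threshold = threshold  # assumed value, not from the source
        self.signatures = []        # one stored vector per known phase

    def classify(self, itv):
        # Find the closest stored phase signature, if any.
        best_id, best_dist = None, None
        for phase_id, signature in enumerate(self.signatures):
            dist = manhattan(itv, signature)
            if best_dist is None or dist < best_dist:
                best_id, best_dist = phase_id, dist
        if best_dist is not None and best_dist <= self.threshold:
            return best_id
        # No stored phase is close enough: record a new phase.
        self.signatures.append(itv)
        return len(self.signatures) - 1


# Usage: four synthetic intervals; the first, second, and fourth are similar.
classifier = PhaseClassifier(threshold=0.2)
intervals = [
    ["int_alu"] * 6 + ["load"] * 3 + ["branch"] * 1,
    ["int_alu"] * 11 + ["load"] * 7 + ["branch"] * 2,
    ["fp"] * 7 + ["load"] * 2 + ["store"] * 1,
    ["int_alu"] * 6 + ["load"] * 3 + ["branch"] * 1,
]
phase_ids = [classifier.classify(build_itv(iv)) for iv in intervals]
print(phase_ids)
```

In this sketch a recurring phase ID signals that a thread has returned to a previously observed instruction mix, which is the kind of information the abstract describes using for temperature prediction, power gating, and thread-to-core mapping.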

Subject Area

Electrical engineering

Recommended Citation

Khan, Omer, "A hardware/software co-design architecture for thermal, power, and reliability management in chip multiprocessors" (2010). Doctoral Dissertations Available from Proquest. AAI3397717.