Date of Award


Document type


Access Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

First Advisor

Prashant Shenoy

Second Advisor

Mark D. Corner

Third Advisor

James Kurose

Subject Categories

Computer Sciences


The increasing demand for storage and computation has driven the growth of large data centers, the massive server farms that run many of today's Internet and business applications. A data center can comprise many thousands of servers and can use as much energy as a small city. The massive amounts of computation power contained in these systems result in many interesting distributed systems and resource management problems. In this thesis we investigate challenges related to data centers, with a particular emphasis on how new virtualization technologies can be used to simplify deployment, improve resource efficiency, and reduce the cost of reliability, all in application-agnostic ways. We first study problems that relate to the initial capacity planning required when deploying applications into a virtualized data center. We demonstrate how models of virtualization overheads can be used to accurately predict the resource needs of virtualized applications, allowing them to be smoothly transitioned into a data center. We next study how memory similarity can be used to guide placement when adding virtual machines to a data center, and demonstrate how memory sharing can be exploited to reduce the memory footprints of virtual machines. This allows for better server consolidation, reducing hardware and energy costs within the data center. We then discuss how virtualization can be used to improve the performance and efficiency of data centers through the use of "live" migration and dynamic resource allocation. We present automated, dynamic provisioning schemes that can effectively respond to the rapid fluctuations of Internet workloads without hurting application performance. We then extend these migration tools to support seamlessly moving applications across low-bandwidth Internet links.
Finally, we discuss the reliability challenges faced by data centers and present a new replication technique that allows cloud computing platforms to offer high-performance disaster recovery services with no data loss, despite high network latencies.