The emergence of large-scale Internet services, coupled with the evolution of computing technologies such as distributed systems, parallel computing, utility computing, grid computing, and virtualization, has fueled a movement toward a new resource provisioning paradigm called cloud computing. The main appeal of cloud computing lies in its ability to provide a shared pool of seemingly infinitely scalable computing resources for cloud services, which can be quickly provisioned and released on demand with minimal effort.
The rapidly growing interest in cloud computing from both the public and industry, together with the rapid expansion in scale and complexity of cloud computing resources and the services hosted on them, has made monitoring, controlling, and provisioning cloud computing resources at runtime a very challenging and complex task. This thesis investigates algorithms, models, and techniques for autonomously monitoring, controlling, and provisioning the various resources required to meet services' performance requirements and to account for their resource usage.
Quota management mechanisms are essential for controlling distributed shared resources so that services do not exceed their allocated or paid-for budgets. Quotas must be monitored and controlled appropriately across the whole cloud to avoid over- or under-provisioning of resources. To this end, this thesis presents new distributed algorithms that efficiently manage quotas for services running across distributed nodes.
Determining the optimal amount of resources to meet services' performance requirements is a key task in cloud computing. However, this task is extremely challenging because of multi-faceted issues such as the dynamic nature of cloud environments, the need to support heterogeneous services with different performance requirements, the unpredictable nature of services' workloads, the non-triviality of mapping performance measurements to resource amounts, and resource shortages.
This thesis proposes models and techniques that can predict, at runtime, the optimal amount of resources needed to meet service performance requirements irrespective of variations in workloads. Moreover, it proposes different service differentiation schemes for managing temporary resource shortages caused by, for example, flash crowds or hardware failures.
In addition, the resources used by services must be accounted for in order to bill customers properly. Thus, monitoring data for running services should be collected and aggregated to maintain a single global state of the system that can be used to generate a single bill for each customer. However, collecting and aggregating such data across geographically distributed locations is challenging because the management task itself may consume significant computing and network resources unless done with care. This thesis proposes a consistency and synchronization mechanism that can reduce this overhead.