Improving Data Center Utilisation by Reducing Fragmentation
Abstract
Virtualization enables better server consolidation and utilisation than stand-alone
servers running a single workload, and this has enabled widespread cloud adoption among
organizations. Data center utilisation matters because a data center costs millions of
dollars to set up (capital expenditure) and to operate and maintain (operating expenditure).
Many data centers still suffer from poor host utilisation, which means idle resources, lost
revenue and an increased carbon footprint. This opens an opportunity to explore
options for using data center resources optimally.
This work defines resource fragmentation in the context of a data center's resources, shows
how it can be used as a metric for data center utilisation, and discusses the key factors
affecting it. The main factors include Virtual Machine (VM) Sizing,
Host Configuration and Virtual Machine Placement. Various VM Sizing approaches
(predefined, fine-grained, flexible and custom VM sizing) are described, along with how
resource fragmentation varies in each case. These VM Sizing approaches are evaluated using
VM utilisation traces of a private data center. The number of hosts required to host
the workloads is reduced by 32% when moving from predefined VM Sizes to custom VM
Sizes. This work also shows the role of the correlation between VM Sizes and host
configuration in determining resource fragmentation by evaluating different host
configurations using the VM utilisation traces.
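To illustrate why VM sizing affects fragmentation, the following minimal sketch compares the resources allocated when each VM's demand is rounded up to the smallest predefined size that covers it with custom sizing, where the allocation equals the demand and nothing is over-allocated. The size catalogue, the demand values and the function names are hypothetical; they are not taken from the evaluated traces or from the sizing schemes studied in this work.

```python
from typing import List, Optional, Tuple

Size = Tuple[int, int]  # (vCPUs, memory in GB)

# Hypothetical predefined catalogue, ordered from smallest to largest.
PREDEFINED_SIZES: List[Size] = [(1, 2), (2, 4), (4, 8), (8, 16)]


def smallest_fitting_size(demand: Size, catalogue: List[Size]) -> Optional[Size]:
    """Return the first catalogue size that covers the demand in every dimension."""
    for size in catalogue:
        if size[0] >= demand[0] and size[1] >= demand[1]:
            return size
    return None  # no predefined size is large enough


def over_allocation(demands: List[Size], catalogue: List[Size]) -> Size:
    """Total (allocated - demanded) per dimension under predefined sizing."""
    extra_cpu = 0
    extra_mem = 0
    for demand in demands:
        size = smallest_fitting_size(demand, catalogue)
        if size is None:
            raise ValueError(f"no predefined size fits demand {demand}")
        extra_cpu += size[0] - demand[0]
        extra_mem += size[1] - demand[1]
    return extra_cpu, extra_mem


# Hypothetical VM demands: (vCPUs, memory in GB).
demands = [(1, 3), (3, 5), (2, 9)]
print("over-allocation with predefined sizes:", over_allocation(demands, PREDEFINED_SIZES))
```

In this toy example the memory-heavy VM `(2, 9)` is forced into the largest size, so most of the over-allocated vCPUs come from a single VM whose shape does not match the catalogue.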
VM Placement algorithms also play a crucial role in determining data center resource
fragmentation. The VM Placement problem is to obtain an optimal packing of VMs
on hosts, i.e. one that minimises the number of hosts required. Because the problem is
NP-Hard, it is practically infeasible to obtain an optimal placement within the time
constraints for making scheduling decisions. VM Placement can be seen as a Multidimensional
Vector Packing Problem (MDVPP). VPSolver, using an arc-flow formulation
with graph compression, gives an optimal solution for Bin-Packing and related problems.
This thesis proposes a grouping-based heuristic for solving large instances of MDVPP,
based on the Divide-and-Conquer paradigm and using VPSolver. An extensive evaluation
on 3510 instances compares the proposed heuristic with existing popular heuristics
in this space. For most large instances, the proposed
heuristic gives better solutions than the existing ones, sometimes at the cost of higher
computation time. With the proposed heuristic, the number of bins required is
reduced by up to 8.15% for larger instances compared to existing heuristics.
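To make the vector packing view of VM Placement concrete, the sketch below implements First-Fit Decreasing (FFD), a commonly used baseline heuristic for this class of problems. It is shown only as an illustration: it is neither the grouping-based heuristic proposed here nor VPSolver, and the abstract does not state which existing heuristics were used in the comparison. The host capacity, the VM demands and all function names are assumptions made for the example.

```python
from typing import List, Sequence

Vector = Sequence[int]  # one entry per resource dimension, e.g. (vCPUs, memory in GB)


def fits(item: Vector, residual: Vector) -> bool:
    """True if the item fits into the residual capacity in every dimension."""
    return all(x <= r for x, r in zip(item, residual))


def first_fit_decreasing(items: List[Vector], capacity: Vector) -> List[List[int]]:
    """Pack items into identical bins; returns each bin as a list of item indices."""

    def normalised_size(i: int) -> float:
        # Sum of per-dimension demand as a fraction of the host capacity.
        return sum(x / c for x, c in zip(items[i], capacity))

    # Consider the largest items first.
    order = sorted(range(len(items)), key=normalised_size, reverse=True)

    bins: List[List[int]] = []
    residuals: List[List[int]] = []
    for i in order:
        item = items[i]
        if not fits(item, capacity):
            raise ValueError(f"item {i} does not fit on an empty host")
        for b, residual in enumerate(residuals):
            if fits(item, residual):  # place in the first bin with room
                bins[b].append(i)
                residuals[b] = [r - x for r, x in zip(residual, item)]
                break
        else:  # no open bin has room, so open a new one
            bins.append([i])
            residuals.append([c - x for c, x in zip(capacity, item)])
    return bins


# Hypothetical host capacity and VM demands: (vCPUs, memory in GB).
host = (16, 64)
vms = [(8, 16), (8, 48), (4, 32), (4, 16), (12, 16), (4, 32)]
placement = first_fit_decreasing(vms, host)
print(len(placement), "hosts used:", placement)
```

FFD runs quickly but offers no optimality guarantee, which is the trade-off between solution quality and computation time that the evaluation above refers to.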