Show simple item record

dc.contributor.advisorLakshmi, J
dc.contributor.authorGhosh, Archita
dc.date.accessioned2024-01-19T11:09:30Z
dc.date.available2024-01-19T11:09:30Z
dc.date.submitted2023
dc.identifier.urihttps://etd.iisc.ac.in/handle/2005/6385
dc.description.abstractCloud storage services introduced the idea of a global-scale storage system that is available on demand and accessible from anywhere. Despite the benefits, resiliency remains one of the key issues hindering the wide adoption of storage services. The data is hosted in cloud data centers containing hundreds of thousands of commodity-grade hardware components running layers of complex software. Failures due to system crashes, natural disasters, cyber-attacks, etc., are frequent in such environments. To keep the service unaffected by such events, resiliency is essential for cloud systems. For storage services, resiliency is even more critical because losing access to data or, more importantly, complete data loss can have a catastrophic impact on the client. Existing work on storage resiliency focuses on maintaining sufficient redundancy of user data in the system to provide a reliable service. However, providing a global-scale storage solution requires various functional and management layers to ensure that the service is accessible and all stored items are durable. The first part of our work proves that resiliency at the stored data level does not guarantee service-level reliability. A generic cloud storage system model is designed to show analytically that the reliability achieved at the service level differs drastically from the reliability ensured by stored data redundancy. This motivates us to bring the entire system into purview to understand cloud storage resiliency. Due to the complexity and variation of large-scale storage architectures, assessing end-to-end storage resiliency is a challenging task. To address this, the second part of the work proposes a generic resiliency evaluation method for cloud storage services. The method identifies the essential functional layers of a storage service and the components constituting those layers. It then performs an in-depth analysis of service behavior under all possible failures of each component. 
The method is used to assess the resiliency of two diverse, real-world cloud storage services, OpenStack Swift and CephFS. The analysis identifies various resiliency weak points in the service architectures and shows the effectiveness of the different resiliency methods used at various layers. The third part of the work extends the resiliency evaluation method to understand how resiliency correlates with the service usage pattern. A storage service can serve different use cases, which vary in request interarrival time, read/write ratio, accessed data and metadata, etc. Hence, the components involved in access sequences may differ, and so can the impact of their failures. Using the improved resiliency evaluation method and access patterns identified from real traces, we show that resiliency can be applied selectively and adjusted dynamically based on workloads without affecting service reliability. Finally, the work defines an end-to-end resiliency analysis framework for cloud storage services that enables quantification, comparison, and optimization of cloud storage resiliency. The framework allows effective modeling of cloud storage resilience by combining the resiliency of each component that participates in maintaining service reliability for specific workloads. The framework successfully models the resiliency of OpenStack Swift and CephFS as Stochastic Petri Nets (SPNs). The models are used to quantify and compare the resiliency of these two service architectures and to demonstrate how to optimize resiliency while achieving the expected service reliability.en_US
dc.language.isoen_USen_US
dc.relation.ispartofseriesET00397
dc.rightsI grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now or hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.en_US
dc.subjectCloud Computingen_US
dc.subjectEnd-to-end Resiliencyen_US
dc.subjectComplex Systemsen_US
dc.subjectDistributed Systemsen_US
dc.subjectService Reliabilityen_US
dc.subjectCloud Storage Servicesen_US
dc.subjectStorage Service Architectureen_US
dc.subjectData Redundancyen_US
dc.subjectAvailabilityen_US
dc.subjectDurabilityen_US
dc.subjectTrace-driven Analysisen_US
dc.subjectResiliency Modelingen_US
dc.subjectReliability Block Diagram (RBD)en_US
dc.subjectStochastic Petri Net (SPN)en_US
dc.subject.classificationResearch Subject Categories::TECHNOLOGYen_US
dc.titleEnd-to-end Resiliency Analysis Framework for Cloud Storage Servicesen_US
dc.typeThesisen_US
dc.degree.namePhDen_US
dc.degree.levelDoctoralen_US
dc.degree.grantorIndian Institute of Scienceen_US
dc.degree.disciplineEngineeringen_US
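
The abstract's claim that data-level redundancy does not guarantee service-level reliability can be illustrated with a minimal reliability block diagram (RBD) calculation. This is only a sketch under stated assumptions, not the model used in the thesis: the layer names and all reliability values below are hypothetical, components are assumed to fail independently, replicas are treated as parallel blocks, and the service layers as blocks in series with the stored data.

```python
from math import prod


def parallel(reliabilities):
    """Redundant blocks: the group works if at least one block works."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def series(reliabilities):
    """Chained blocks: the chain works only if every block works."""
    return prod(reliabilities)


# Hypothetical values for illustration only (not taken from the thesis).
replica_reliability = 0.99
data_level = parallel([replica_reliability] * 3)  # three independent replicas

# Hypothetical non-redundant layers that a request must pass through.
layer_reliability = {
    "access": 0.999,
    "metadata": 0.995,
    "management": 0.99,
}
service_level = series(list(layer_reliability.values()) + [data_level])

print(f"data-level reliability:    {data_level:.6f}")
print(f"service-level reliability: {service_level:.6f}")
```

Under these assumed numbers the replicated data is far more reliable than the service as a whole, because every layer in the series path caps the end-to-end figure.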

