Discovery Of Application Workloads From Network File Traces

Yadwadkar, Neeraja

dc.contributor.advisor	Bhattacharyya, Chiranjib
dc.contributor.author	Yadwadkar, Neeraja
dc.date.accessioned	2011-05-19T06:54:52Z
dc.date.accessioned	2018-07-31T04:40:17Z
dc.date.available	2011-05-19T06:54:52Z
dc.date.available	2018-07-31T04:40:17Z
dc.date.issued	2011-05-19
dc.date.submitted	2009
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/1213
dc.identifier.abstract	https://etd.iisc.ac.in/static/etd/abstracts/1574/G23698-Abs.pdf	en_US
dc.description.abstract	An understanding of the Input/Output (I/O) data access patterns of applications is useful in several situations. First, insight into what applications are doing with their data at a semantic level helps in designing efficient storage systems. Second, it helps to create benchmarks that closely mimic realistic application behavior. Third, it enables autonomic systems, as the information obtained can be used to adapt the system in a closed loop. All these use cases require the ability to extract the application-level semantics of I/O operations. Methods such as modifying application code to associate I/O operations with semantic tags are intrusive. It is well known that network file system traces are an important source of information that can be obtained non-intrusively and analyzed either online or offline. These traces are a sequence of primitive file system operations and their parameters. Simple counting, statistical analysis, or deterministic search techniques are inadequate for discovering application-level semantics in the general case, because of the inherent variation and noise in realistic traces. In this paper, we describe a trace analysis methodology based on Profile Hidden Markov Models. We show that the methodology has powerful discriminatory capabilities that enable it to recognize applications based on the patterns in the traces, and to mark out regions in a long trace that encapsulate sets of primitive operations representing higher-level application actions. It is robust enough to work around discrepancies between training and target traces, such as differences in length and interleaving with other operations. We demonstrate the feasibility of recognizing patterns based on a small sampling of the trace, enabling faster trace analysis. Preliminary experiments show that the method is capable of learning accurate profile models on live traces in an online setting.
We present a detailed evaluation of this methodology in a UNIX environment using NFS traces of selected commonly used applications, such as compilations, as well as industrial-strength benchmarks such as TPC-C and Postmark, and discuss its capabilities and limitations in the context of the use cases mentioned above.	en_US
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	G23698	en_US
dc.subject	File Tracing (Computer Networks)	en_US
dc.subject	Computer Communication	en_US
dc.subject	Profile Hidden Markov Models	en_US
dc.subject	Sequence Alignment	en_US
dc.subject	Network File System (NFS)	en_US
dc.subject	Network File Traces	en_US
dc.subject	Hidden Markov Models (HMMs)	en_US
dc.subject.classification	Computer Science	en_US
dc.title	Discovery Of Application Workloads From Network File Traces	en_US
dc.type	Thesis	en_US
dc.degree.name	MSc Engg	en_US
dc.degree.level	Masters	en_US
dc.degree.discipline	Faculty of Engineering	en_US

Files in this item

Name: G23698.pdf
Size: 1.259Mb
Format: PDF

This item appears in the following Collection(s)

Computer Science and Automation (CSA)
