
    Effective optimization techniques for a parallel file system

    View/Open
    T05786.pdf (24.58Mb)
    Author
    Raghvendran, M
    Abstract
    Significant work has been done in evolving parallel I/O architectures, I/O interfaces, and other programming techniques. However, only a few mechanisms currently exist that bridge the gap between the I/O architectures and the programming abstractions. A parallel file system is the prime mechanism for delivering high-performance parallel I/O on multiprocessor machines for a wide class of scientific and engineering applications. With the evolution of commodity clusters (called High Performance Computation or HPC clusters) as a cost-effective platform for parallel computing, an optimized and portable parallel file system is needed to satisfy applications' I/O needs. The existing parallel I/O mechanisms on such clusters, based on NFS, deliver poor I/O performance, both because the architecture does not allow file data to be de-clustered and because the protocol is heavyweight. Several other current I/O architectures, based on shared or cluster file systems, also perform poorly in the cluster-based parallel computing environment because the application I/O characteristics and the I/O architectural features have mismatched semantics.

    The parallel file system represents an appropriate split in the semantics of the parallel application I/O path, where parallel I/O mechanisms and other optimization techniques can be implemented at the I/O platform level and exported through feature-rich, platform-independent interfaces. In spite of significant research in parallel I/O techniques, portable parallel file systems do not incorporate these findings and are not commonly used. Many of the optimization techniques for parallel I/O in the literature, such as prefetching, have neither been given general-purpose implementations nor been validated for a wide class of application workloads or access patterns. Many issues, such as timeliness, need investigation for prefetching to be effective. The incorporation of parallel I/O optimization techniques in the commodity cluster setup has not been satisfactory.

    We establish the parallel file system as the right abstraction for parallel I/O on a commodity cluster from the performance and management perspectives. We also evaluate various optimization techniques for parallel file systems on a commodity cluster, with the objective of providing a fast scratch space on a real cluster-based supercomputer such as the C-DAC PARAM Padma (ranked 171st in the June 2003 edition of the TOP500 list). We extend a data prefetching technique for the parallel file system architecture and demonstrate its effectiveness with a policy-based feedback loop. We also investigate other optimization techniques to enhance the performance of the parallel file system. This thesis contributes to the analysis and design of these optimization techniques: an online predictive prefetching mechanism with adaptive policy control, an adaptive flow control mechanism for supporting collective calls from the architectural perspective, and techniques for managing large data structures and efficient file processing in the file system design. A parallel file system incorporating these optimizations has been implemented on C-DAC's PARAM Padma, a one-teraflop, 54-node cluster-based parallel processing system. These optimizations show significant improvement for the targeted application I/O workloads on this cluster.
    URI
    https://etd.iisc.ac.in/handle/2005/7209
    Collections
    • Computer Science and Automation (CSA) [516]

    etd@IISc is a joint service of SERC & J R D Tata Memorial (JRDTML) Library || Powered by DSpace software || DuraSpace
    Contact Us | Send Feedback | Thesis Templates
    Theme by 
    Atmire NV
     

     

