Fault tolerant locking for shared disk filesystems
Abstract
Shared disk file systems are crucial for providing direct, shared and high performance
file system access on top of raw block based Storage Area Networks. One of the key
components of a shared disk file system is a distributed lock manager which synchronizes
direct concurrent access to data and metadata from multiple autonomous hosts. The lock
manager should be fault tolerant for the entire system to be highly available. A design
for a fault tolerant multicast based lock manager for the open source GFS shared disk
file system is presented here. The GFS file system runs on Linux and currently lacks a
fault tolerant distributed lock manager. The protocol presented here uses an efficient
multicast based scheme to minimize the number of network hops involved in lock acquisition.
Minimizing the number of network hops becomes very significant when dirty
data associated with the lock is transferred through the interconnection network rather
than written to disk. The ordered delivery of messages and failure notifications needed for
such a protocol was provided by a group communication toolkit. An existing group communication
system (GCS) was ported to the kernel and used. The use of this communication
system simplified the design and implementation of the protocol considerably. The GCS
is expected to be useful in the design of other distributed protocols as well. The system
was implemented and performance tests were carried out on a Fibre Channel based
SAN setup. The performance was found to be as good as that of the current non fault
tolerant system.

