Minimizing latency in data acquisition, distributed processing, storage and retrieval

Jinan, Rooji

View/Open

Thesis full text (2.613Mb)

Author

Jinan, Rooji

Metadata

Show full item record

Abstract

Achieving low latency is of utmost importance in applications demanding real-time sensing and control as in cyber-physical systems. In this thesis, we explore three different facets of ensuring low latency in such systems. First part of this manuscript considers the design of an encoder-decoder system to facilitate real-time tracking of a physical process modelled as a first order auto-regressive process. Samples from a high-dimensional first-order auto-regressive process generated by an independently and identically distributed random innovation sequence are observed by a sender which can communicate only finitely many bits per unit time to a receiver. The receiver seeks to form an estimate of the process value at every time instant in real-time. We consider a time-slotted communication model in a slow-sampling regime where multiple communication slots occur between two sampling instants. We propose a successive update scheme which uses communication between sampling instants to refine estimates of the latest sample and study the following question: Is it better to collect communication of multiple slots to send better refined estimates, making the receiver wait more for every refinement, or to be fast but loose and send new information in every communication opportunity? We show that the fast but loose successive update scheme with ideal spherical codes is universally optimal asymptotically for a large dimension. However, most practical quantization codes for fixed dimensions do not meet the ideal performance required for this optimality, and they typically will have a bias in the form of a fixed additive error. Interestingly, our analysis shows that the fast but loose scheme is not an optimal choice in the presence of such errors, and a judiciously chosen frequency of updates outperforms it. Next part considers designing load balancing policies that ensures low latency without needing extra overhead in terms of server-side feedback or coordination among the servers. Dispatching policies such as the join shortest queue (JSQ), join smallest work (JSW) and their power of two variants are used in load balancing systems where the instantaneous queue length or workload information at all queues or a subset of them can be queried. In situations where the dispatcher has an associated memory, one can minimize this query overhead by maintaining a list of idle servers to which jobs can be dispatched. Recent alternative approaches that do not require querying such information include the cancel on start and cancel on complete based replication policies. The downside of such policies however is that the servers must communicate the start or completion of each service to the dispatcher and must allow cancellation of redundant copies. In practice, the requirements of query messaging, memory, and replica cancellation pose challenges in their implementation and their advantages are not clear. We consider load balancing policies that do not query load information, do not have a memory, and do not cancel replicas. Surprisingly, we were able to identify operating regimes where such policies have better performance when compared to policies that utilize server feedback information. Our policies allow the dispatcher to append a timer to each job or its replica. A job or a replica is discarded if its timer expires before it starts getting served. We analyze several variants of this policy which are novel, simple to implement, and also have remarkably good performance, despite no feedback from servers to the dispatcher. Finally, we consider the setting of a distributed storage system where a single file is subdivided into smaller fragments of same size which are then replicated with a common replication factor across servers of identical cache size. An incoming file download request is sent to all the servers, and the download is completed whenever the request gathers all the fragments. At each server, we are interested in determining the set of fragments to be stored, and the sequence in which fragments should be accessed, such that the mean file download time for a request is minimized. We model the fragment download time as an exponential random variable independent and identically distributed for all fragments across all servers, and show that the mean file download time can be lower bounded in terms of the expected number of useful servers summed over all distinct fragment downloads. We present deterministic storage schemes that attempt to maximize the number of useful servers. We show that finding the optimal sequence of accessing the fragments is a Markov decision problem, whose complexity grows exponentially with the number of fragments. We propose heuristic algorithms that determine the sequence of access to the fragments which are empirically shown to perform well.

URI

https://etd.iisc.ac.in/handle/2005/6029

Collections

Robert Bosch Centre for Cyber Physical Systems (RBCCPS) [12]