Exploring the Fundamental Limits of Information-Theoretically Secure Key Generation and DNA-Based Data Storage
Abstract
In this dissertation, we carry out an exploration of the fundamental limits of information-theoretic security in two different settings: multiterminal key agreement and DNA-based data storage. Most of the dissertation focuses on the problem of secret key agreement within the multiterminal source model introduced by Csiszár and Narayan (2004). In the last part of the dissertation, we formulate and characterize a notion of secure storage capacity in the context of DNA-based data storage. 
In the multiterminal source model, there is an underlying source that generates independent and identically distributed (i.i.d.) realizations of a random vector. The components of the random vector are finitely-supported random variables distributed according to a joint probability distribution that is specified in the model. Additionally, there is a finite set of users and a wiretapper, each of whom observes the realizations from some subset of the components of the source.  The aim of the users is to interactively communicate over a noiseless public channel so that each user finally outputs a common random variable called a secret key, which is required to be secure from the wiretapper. The main object of interest is the wiretap secret key capacity, which is the maximum possible rate of a secret key that can be generated by the users. A substantial part of this dissertation attempts to gain insight into the single-letter characterization of wiretap secret key capacity, which is a problem that has remained largely unsolved.
In the initial part of this dissertation, we explore the connection between secret key agreement and secure omniscience. The problem of secure omniscience is concerned with communication protocols for omniscience that minimize the rate of information leakage to the wiretapper. Our interest is in identifying broad classes of source models for which the wiretap secret key capacity can be achieved through an omniscience protocol that leaks the least possible amount of information to the wiretapper. In such cases, we say that there is a duality between secure omniscience and secret key agreement. 
In this dissertation, we show that this duality holds in the case of certain finite linear source (FLS) models, such as two-terminal FLS models and pairwise independent network models on trees with a linear wiretapper. On the other hand, we also give an example of a (non-FLS) source model for which the duality does not hold if we limit ourselves to communication-for-omniscience protocols with at most two (interactive) communications. 
Next, we give a characterization of the wiretap secret key capacity under two special situations: one is when the users are not allowed to communicate, and the other is when the communication rate goes to zero asymptotically. We show that both these characterizations have the same single-letter expression, which can be expressed in terms of the maximal common function.
Our focus then shifts to the following question: When can the users generate a positive rate secret key? We give necessary and sufficient conditions for the secret key capacity to be positive, which extend known results on two-terminal sources to the multiterminal setting. For the special case of hypergraphical sources with a linear wiretapper, we derive a simpler equivalent condition for the positivity of secret key capacity in terms of the maximal common function.
 
The final part of this dissertation is concerned with the study of DNA-based secure data storage. In this problem, a user (Alice) would like to store her data within a pool of (synthetic) DNA molecules so as to be reliably retrieved by an authorized party (Bob) while ensuring that an unauthorized party (Eve) gets almost no information from her (Eve's) observations. We propose a strategy for making DNA-based data storage information-theoretically secure through the use of wiretap channel coding. This motivates us to extend the shuffling-sampling channel model of Shomorony and Heckel (2021) to include a wiretapper. The main contribution is a characterization of the secure storage capacity of our DNA wiretap channel model, which is the maximum rate at which Alice can securely store her data within a pool of DNA molecules.

