|dc.description.abstract||Mycobacterium tuberculosis, the etiological agent of tuberculosis, has adapted to the host environment and evolved to survive harsh conditions in the host. The pathogen has evolved strategies not only to evade the host immune system but also to thrive within host cells. Upon infection, the pathogen is either cleared by the host immune response, or it survives and causes active tuberculosis (TB). In a number of cases, however, the pathogen is neither killed nor does it actively proliferate; it remains dormant in the host until the environment becomes favorable. This dormant state of the pathogen is responsible for latent TB infection (LTBI). WHO reports indicate that as much as a third of the world’s population has been exposed to the pathogen, of which a significant proportion could be latently infected (WHO report, 2015). These individuals do not show symptoms of active TB and hence are difficult to detect. Latently infected individuals serve as a reservoir for the pathogen, which can lead to epidemics when conditions change. Hence, it is necessary to understand the host-pathogen interactions during LTBI, as this might provide clues for developing new strategies to detect and curb a latent infection.
Host-pathogen interactions are multifaceted: each species attempts to recognize and respond to the other through specific molecules that make distinct interactions with those of the other species. The outcome of the infection is thus decided by a complex set of host-pathogen interactions. The complexity arises from the large number of molecular components involved, the multiplicity of interactions among these components, and the several feedback, feed-forward and other regulatory or influential loops within the system. This complexity makes modeling and simulation an essential part of systems-level studies. Systems biology provides an integrated framework to analyze and understand the function of biological systems.
This work addresses some of these issues with an unbiased systems-level analysis so as to identify and understand the important global changes both in the host and in the pathogen during LTBI. The broad objectives of the work were to identify the key processes that vary in the host during latent infection, the set of metabolic reactions in the host that can be modulated to control the reactivation of infection, and the global adaptation in Mycobacterium tuberculosis (Mtb), and then to utilize this knowledge to identify strategies for tackling latent infection. A review of the literature on the current understanding of latency from the pathogen and the host perspective is presented in Chapter 1. From this, it is clear that most available studies have focused on the role of individual molecules and individual biological processes such as granuloma formation, toll-like receptor signaling, T cell responses and cytokine signaling in either initiating or maintaining a latent infection, but there is no report to date on whether and how these processes are connected with each other. While transcriptome-based studies have identified lists of differentially expressed genes in LTBI as compared to healthy controls, for many of these genes no further understanding is currently available regarding the processes they may be involved in and the interactions they make, which may be important for understanding LTBI.
The first part of the work is a systematic meta-analysis of genome-scale protein interaction networks rendered condition-specific with transcriptome data of patients with LTBI, which has provided a global unbiased picture of the transcriptional and metabolic variations in the host and in the pathogen during latent infection. To start with, publicly available gene expression data related to LTBI, active TB and healthy controls were considered. In all, 183 datasets summing up to 105 LTBI, 41 active TB and 37 healthy control samples were analyzed (Chapter 2). Standard analysis of the transcriptome profiles of these datasets indicated that there was zero overlap among them and that not a single gene was seen in common among all datasets for the same condition. An extensive human protein-protein interaction (PPI) network was constructed using information available from multiple resources that comprehensively covered structural or physical interactions and genetic interactions or functional influences. Nodes in this network represented individual proteins and edges represented interactions between pairs of nodes. The identity of each node and the nature of the interaction of each edge, along with the type of evidence used as the basis for drawing the edge, were collated for the network. The gene expression data were integrated into the human PPI network for each condition, yielding a condition-specific network with weighted nodes and directed edges, from which specific comparative networks were derived. The highest ranked perturbations in LTBI were identified through a network mining protocol previously established in the laboratory. This involved computing all-versus-all shortest paths on the comparative network, scoring the paths based on connectedness and various centrality measures of the nodes and the edges, and finally ranking the paths based on the cumulative path scores. 
Intriguingly, the top-ranked set of perturbations was found to form a connected sub-network by itself, referred to as a top perturbed sub-network (top-net), indicating that the perturbations were functionally linked or perhaps even orchestrated in some sense. Th17 signaling appears to be dominant. About 40 genes were identified in the set unique to the LTBI condition as compared to the active TB condition, and these genes showed enrichment for processes such as apoptosis, cell cycle and natural killer cell mediated cytotoxicity. Construction and analysis of a miRNA network indicated that 32 of these genes have strong associations with miRNAs, suggesting a role for the latter in controlling LTBI. Three other genes from the top-net, BCL2, HSP90AA1 and NR3C1, are already established drug targets for different diseases, with known drugs associated with them. These three proteins can be explored further as drug targets in LTBI, whose manipulation using existing drugs may inhibit the underlying biological process and thereby disturb the state of latency.
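The path-ranking step described above can be illustrated with a minimal sketch. The abstract does not give the laboratory's actual scoring function, so everything specific here is an assumption made for illustration: the node names P1 to P4 and their weights are hypothetical, the edge cost is taken as the reciprocal of the product of the two node weights, and the path score is the cumulative cost divided by the number of edges (lower means more perturbed).

```python
import heapq


def dijkstra(adj, source):
    """Least-cost distances and predecessors from `source` (Dijkstra)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def ranked_paths(edges, node_weight):
    """All-versus-all shortest paths, ranked by cost per edge (ascending)."""
    # ASSUMPTION: edge cost = 1 / (w_u * w_v); highly weighted nodes are cheap.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, {})[v] = 1.0 / (node_weight[u] * node_weight[v])
        adj.setdefault(v, {})
    results = []
    for s in adj:
        dist, prev = dijkstra(adj, s)
        for t, d in dist.items():
            if t == s:
                continue
            # Walk predecessors back from the target to rebuild the path.
            path = [t]
            while path[-1] != s:
                path.append(prev[path[-1]])
            path.reverse()
            results.append((d / (len(path) - 1), path))
    results.sort(key=lambda item: item[0])
    return results


# Hypothetical toy network; weights stand in for expression-derived values.
weights = {"P1": 2.0, "P2": 4.0, "P3": 1.0, "P4": 0.5}
edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
ranked = ranked_paths(edges, weights)
for score, path in ranked[:3]:
    print(round(score, 4), " -> ".join(path))
```

In this toy network the cheapest route from P1 to P3 passes through the highly weighted node P2 rather than using the direct edge, which is the behavior a weight-sensitive path score is meant to capture.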
As a second objective, global variations in the host transcriptome were identified during ascorbic acid induced dormancy (Chapter 3). Ascorbic acid, or Vitamin C, is a nutrient required in the diet. This organic compound is a known antioxidant that scavenges free radicals. In a recent study, Taneja et al. demonstrated that Vitamin C could induce dormancy in Mtb. Along similar lines, experiments were done in THP-1 cells infected with Mtb to determine the host responses during ascorbic acid (AA) induced dormancy. The raw gene expression data, provided by our collaborator Prof. Jaya Tyagi, included 0 hour, 4 day and 6 day time points with infection and vitamin C, versus infection alone or vitamin C alone as controls. The transcriptome data was normalized and integrated into the human PPI network as described for the meta-analysis. It was experimentally determined that ascorbic acid induces dormancy in 4 days post infection. The top-ranked paths of perturbation were analyzed and compared for three different conditions: (i) uninfected condition, (ii) AA treated and infected condition, and (iii) AA, isoniazid and infected condition. The dormant pathogen is known to be drug-tolerant, and thus, as a marker for the state of dormancy, the lack of effect of isoniazid was also monitored in the infected host cells. The analysis revealed some broad similarities with LTBI from patient samples, but AA induced dormancy in cell lines stood out as a separate group, indicating significant differences involving Interferon Induced Transmembrane Proteins (IFITMs), vacuolar ATPase and GDF15, which belongs to the TGF-beta signaling pathway. The highest ranked perturbed paths contained genes involved in innate immune responses, of which ISG15, IFITMs, HLAs and ATPases emerge as the most altered in the dormant condition. 
CCR7 emerges as a key discriminator, which is subdued in the latent samples but highly induced in infection conditions. Pathway-based analysis of different conditions showed that oxidative stress, glutathione metabolism, proteasome degradation as well as type II interferon signaling are significantly up-regulated in AA induced dormancy.
The dormant bacteria reside in the host cells and are known to modulate the host metabolism for their own benefit. So, the third objective was to understand the metabolic variations in the host during LTBI (Chapter 4). A genome-scale metabolic (GSM) model of the alveolar macrophage was used in this study. The metabolic model contains information on the reactions, the metabolites and the genes encoding the enzymes that catalyze each reaction. Flux balance analysis (FBA), a constraint-based metabolic modeling method, was used for analyzing the alterations in metabolism under different infection conditions. In order to mimic the physiological condition, gene expression data was used for constraining the bounds of the reactions in the model. Two different expression studies were used for analysis: GSE25534 (from Chapter 2) and ascorbic acid induced dormancy (Chapter 3). The analysis was carried out for latent TB versus healthy control and latent TB versus active TB to identify the most altered metabolic processes in LTBI. Differences in fluxes between the two conditions were calculated, and a new classification scheme was devised to categorize the reactions on the basis of these flux differences. Higher fluxes in the LTBI condition were identified for reactions involved in the transport of small metabolites as well as amino acids. Solute carrier proteins responsible for the transport of the metabolites were identified and their biological significance is discussed. Reduced glutathione (GSH), arachidonic acid, prostaglandins and pantothenate were identified as important metabolites in the LTBI condition and their physiological roles are described. Sub-system analysis for different conditions shows differential regulation of arachidonic acid metabolism, fatty acid metabolism, folate metabolism, pyruvate metabolism, glutathione metabolism, ROS detoxification, triacylglycerol synthesis and transport, as well as tryptophan metabolism. 
From the study, transporter proteins and reactions altered during LTBI were identified, which again provide clues for understanding the molecular basis of establishing a latent infection.
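For reference, the flux balance analysis used above can be written in its standard textbook form. The symbols are assumptions introduced here rather than notation from the thesis: S is the stoichiometric matrix, v the flux vector, c the objective coefficients, and l_j, u_j the lower and upper bounds of reaction j. How expression data tightens the bounds in the thesis is not stated in the abstract, so the bounds are shown only as condition-specific quantities.

```latex
% Standard FBA linear program (steady-state mass balance with flux bounds)
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& l_j^{(\mathrm{cond})} \le v_j \le u_j^{(\mathrm{cond})} \qquad \text{for every reaction } j .
\end{aligned}

% Flux difference of reaction j between two conditions, e.g. LTBI vs. control
\Delta v_j = v_j^{(\mathrm{LTBI})} - v_j^{(\mathrm{control})}
```

A reaction with a positive flux difference carries more flux in the LTBI-constrained model than in the control-constrained model, which is the kind of quantity a flux-difference classification would operate on.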
Mycobacterium tuberculosis is known to undergo dormancy during stress conditions. The main objective of Chapter 5 was to identify the global variations in dormant Mtb. To carry out the analysis, the Mtb PPI network was constructed using information from available resources. Gene expression data of two different dormancy models, the Wayne growth model and the multiple-stress model, were used for the study. To identify the key players involved in the reversal of dormancy, the transcriptome data of the reaeration condition was also used. In this study, the Max-flow algorithm was implemented to identify the feasible paths or flows in the different conditions. Flows with higher scores indicate that more information traverses the path, and hence that the path is important for the study. From the analysis of the Wayne growth model (hypoxia model), important transcriptional regulators such as SigB, SigE and SigH, and regulators in two-component systems such as MprA, MtrA, PhoP, RegX3 and TrcR, were identified in the stress condition. The multiple-stress model studied the growth of bacteria under low oxygen concentration, high carbon dioxide levels, low pH and nutrient starvation. The gene expression data was integrated into the Mtb PPI network, and the Max-flow algorithm showed that MprA, part of the MprA-MprB two-component system, is involved in the regulation of the persistent condition. WhiB1 also features in the paths of the dormant condition and its role in persistence can be explored. In the reaeration model, WhiB1 and WhiB4 are present in the top flows, indicating that the redox state is perturbed in the pathogen and that the interactions of these proteins are important for understanding the reversal of the dormant condition. From the study, the Rv2034, Rv2035, HigA, Rv1989, Rv1990 and Rv0837 proteins, belonging to toxin-antitoxin systems, were also identified in the dormant bacteria, indicating their role in adaptation during stress. 
The role of Rv2034 has been studied in persistence, but the function of other proteins can be analyzed to provide new testable hypotheses about the role of these proteins in dormancy. Thus, the flows or paths perturbed during dormancy were identified in this study.
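To make the max-flow idea above concrete, here is a minimal sketch of the Edmonds-Karp variant of the Ford-Fulkerson method using only the Python standard library. The abstract does not say which max-flow implementation or capacity definition was used, so this is an illustration under stated assumptions: the node names and integer capacities are hypothetical, standing in for expression-derived capacities on a directed interaction network.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Maximum flow from source to sink (Edmonds-Karp: BFS augmenting paths)."""
    # Residual graph: copy forward capacities, add zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck flow and update the residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical toy network; capacities are illustrative only.
toy_capacity = {
    "S": {"A": 3, "B": 2},
    "A": {"B": 1, "T": 2},
    "B": {"T": 3},
}
flow_value = max_flow(toy_capacity, "S", "T")
print(flow_value)
```

In a condition-specific network, comparing such flow values between conditions (for example, hypoxia versus aerobic growth) is one way to rank which regulator-to-target routes carry more signal, which is the spirit of the flow scores described above.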
To get a better understanding of the metabolic network active in mycobacteria under different conditions, experiments were performed in Mycobacterium smegmatis MC2 155. Mycobacterium smegmatis, a non-pathogenic species of the genus Mycobacterium, is used as a surrogate to carry out molecular biology studies of Mtb, and Mycobacterium smegmatis MC2 155 (Msm) is the commonly used laboratory strain. In order to obtain a clear understanding of how comparable the metabolic networks of the virulent M. tuberculosis H37Rv and the model system Msm are, the latter was first studied systematically. In Chapter 6, the functional annotation of the Msm genome was first carried out and the genes were categorized into different TubercuList classes based on homology with the Mtb genome. A high-throughput growth characterization was carried out to characterize the strain systematically in terms of the carbon, nitrogen or other sources that promoted growth, and thus served as nutrients, and those that did not, together yielding a genome-phenome correlation in Msm. Gene expression was measured and used for explaining the observed phenotypic behavior of the organism. Combining the genome sequence with the transcriptome and phenome analyses, a set of about 257 different metabolic pathways was identified as feasible in wild-type Msm. About 284 different carbon sources, nitrogen sources and nutrient supplements were tested in this experiment and 167 of them supported growth of Msm. This indicates that the compounds enter the cells and are metabolized efficiently, thus yielding similar phenotypes. The expressed genes and the metabolites supporting growth were mapped to the metabolic network of Msm, thus helping in the identification of feasible metabolic routes in Msm. A comparative study between Msm and Mtb revealed that these organisms are similar in the nutrient sources that they utilize for growth. 
The study provides experimental evidence for the feasible metabolic routes in Msm, which can be used for understanding the metabolic capabilities of the two organisms under different conditions, providing a basis for understanding adaptations during dormancy.
In the last part of the work presented in this thesis, the metabolic shift in the pathogen was studied using a genome-scale metabolic model of Mtb (Chapter 7). The model contains information on the reactions, metabolites and genes involved in the reactions. Flux balance analysis (FBA) was carried out by integrating normalized gene expression data (the Wayne model and multiple-stress model transcriptomes considered in Chapter 5) to identify the set of reactions that have a higher flux in the dormant condition as compared to the control replicating condition. Glutamate metabolism along with propionyl-CoA metabolism emerge as major up-regulated processes in dormant Mtb. Next, with the objective of identifying essential genes in dormant Mtb, a systematic in silico single gene knock-out analysis was carried out, in which each gene and its associated reactions were knocked out of the model, one at a time, and the ability of the model to reach its objective function was assessed. About 168 genes common to the Wayne model and the multiple-stress model were identified as important in Mtb after the knock-out analysis. Essentiality is in essence a systems property and needs to be probed from multiple angles. Towards this, essential genes were identified in Mtb using a multi-level, multi-scale systems biology approach. About 283 genes were identified as essential on the basis of a combined analysis of transcriptome data, FBA, network analysis and phyletic retention studies in Mtb. The 168 genes identified as important in dormant Mtb were compared with the 283 essential genes, and about 91 of them were found to be essential. Finally, among the set of essential genes, those that satisfy other criteria for a drug target were analyzed using the list of high-confidence drug targets of Mtb available in the laboratory, along with their associated drugs or drug-like molecules. Thirty-eight of the 168 important genes in Mtb were found to have one or more drugs associated with them in the DrugBank database. 
Colchicine-Rv1655, Raloxifene-Rv1653, Bexarotene-Rv3804 and Rosiglitazone-Rv3804 are top-scoring drug-target pairs that can be explored for killing dormant bacilli. The study has thus been useful in identifying important proteins, reactions and drug targets in dormant Mtb.
In summary, the thesis presents a comprehensive systems-level understanding of
various aspects of host responses and pathogen adaptation during latent TB
infection. Key host and pathogen factors involved in LTBI are identified that serve as
useful pointers for deriving strategies for tackling a latent infection.||en_US