Nucleic Acid-binding Adenylyl Cyclases in Mycobacteria: Studies on Evolutionary & Biochemical Aspects
Mycobacterium tuberculosis is one of the most successful human pathogens, estimated to have infected close to one-third of the global human population. In order to survive within its host, M. tuberculosis utilises multiple signalling strategies, one of them being synthesis and secretion of the universal second messenger cAMP. This process is enabled by the presence of sixteen predicted adenylyl cyclases in the genome of M. tuberculosis H37Rv, ten of which have been characterised in vitro. The synthesised cAMP is recognised by ten putative cAMP-binding proteins in which the cyclic AMP-binding domain is associated with a variety of enzymatic domains. The cAMP signal can be extinguished by degradation by phosphodiesterases, by secretion into the extracellular milieu, or by sequestration of the nucleotide through upregulation of a high-affinity cAMP-binding protein. Of the sixteen adenylyl cyclases (ACs) encoded by M. tuberculosis H37Rv, a subset of multidomain adenylyl cyclases remains poorly characterised, primarily due to challenges associated with studying these in vitro. The adenylyl cyclase domain in these proteins is associated with an NB-ARC domain (nucleotide-binding domain common to APAF-1, plant R proteins and CED-4), a TPR domain (tetratricopeptide repeat) and a LuxR-type HTH motif (helix-turn-helix). This architecture places these multidomain mycobacterial ACs within a larger group of STAND (signal transduction ATPases with numerous domains) proteins, and hence they will be referred to as STAND ACs. The STAND proteins are a recently recognised class of multidomain ATPases which integrate a variety of signals prior to activation. Activation is accompanied by formation of large oligomeric signalling hubs which facilitate downstream signalling events. 
While most STAND proteins have a single effector domain followed by an NB-ARC domain and a scaffolding domain, the STAND ACs distinguish themselves by retaining two effector domains, the AC domain and the HTH domain, at the N- and C-termini respectively. The cyclase, NB-ARC, TPR and HTH domains have widely divergent taxonomic distributions, making the presence of these four domains in a single polypeptide rare. In fact, proteins with cyclase-NB-ARC-TPR-HTH (C-A-T-H) domain organisation were found to be encoded almost exclusively by slow-growing mycobacterial species, a clade that harbours most mycobacterial pathogens, such as M. tuberculosis and M. leprae. Notably, one of the STAND ACs, Rv0386, is the only mycobacterial AC shown to date to be required for virulence of M. tuberculosis in mice. Using phylogenetic analysis, the evolutionary underpinnings of this domain architecture were examined. The STAND ACs most likely evolved via a domain gain event from a cyclase-ATPase-TPR progenitor encoded by a strain ancestral to M. marinum. Subsequently, the genes duplicated and diverged, sometimes leading to frameshift mutations splitting the cyclase domain from the C-terminal domains. Consequently, M. tuberculosis encodes three ‘full-length’ STAND ACs, namely Rv0386, Rv1358 and Rv2488c, and one split STAND AC. The split STAND AC is made up of Rv0891c, containing the AC domain, and Rv0890c, containing the NB-ARC, TPR and HTH domains. The genes rv0891c and rv0890c were found to be expressed as an operonic transcript, though they were translationally uncoupled. Pertinently, M. canettii, an early-branching species of the M. tuberculosis complex, contains an orthologue of Rv0891c and Rv0890c where all four domains are present in a single polypeptide. Sequence analysis of the four STAND ACs in M. tuberculosis allowed predictions of significant divergence in function. 
These proteins showed high sequence conservation in their HTH domains, with substantial sequence divergence in their TPR, NB-ARC and AC domains. Biochemical analysis of the AC domains revealed that Rv0891c and Rv2488c possessed poor or no AC activity, respectively. On the other hand, the cyclase domain of Rv0386 could catalyse cAMP synthesis. Moreover, for both Rv0891c and Rv0386, presence of the C-terminal domains potentiated adenylyl cyclase activity, suggestive of allosteric regulation within the STAND AC module. Studies on Rv0891c also revealed that the protein could inhibit the adenylyl cyclase activity of Rv0386 in trans. This result thus provided a novel mechanism by which proteins harbouring poorly active/inactive adenylyl cyclase domains could contribute to cAMP levels, by acting as inhibitors of other adenylyl cyclases. The STAND ACs were found to be inactive ATPases. Additionally, incubation with nucleotides did not stimulate oligomerisation of these proteins, unlike what has been shown for several other STAND proteins. However, mutations in the NB-ARC domain perturbed the basal oligomeric state of these proteins, indicating that the NB-ARC domain can influence self-association. A subset of NB-ARC domain mutants also showed increased adenylyl cyclase activity, reiterating the inter-domain cross-talk in the STAND ACs. Since the AC activity of these proteins was meagre, the properties of the HTH domain were examined as an alternative effector domain. Genomic SELEX was performed using the TPR-HTH domains of Rv0890c, and revealed a set of sequences that bound to this protein, though they lacked common sequence features. Further analysis revealed that Rv0890c bound to DNA in a sequence-independent manner, through the HTH domain. This binding was cooperative, with multiple protein units engaging in DNA-binding. Due to the cooperative nature of binding and the lack of sequence preference, Rv0890c appeared to coat the DNA molecule. 
This was further demonstrated by the ability of Rv0890c to protect DNA from DNaseI-mediated degradation, and by the requirement for long DNA sequences to form stable DNA-protein complexes. Studies also revealed that Rv0890c interacted with RNA and ssDNA. In fact, the protein as purified from heterologously expressing E. coli cells was bound to RNA. RNA-binding by a LuxR-type HTH has not been reported previously, providing a new function for this class of HTHs. Interestingly, nucleic acid-binding by a fusion Rv0891c-Rv0890c protein, similar to the one encoded in M. canettii, was shown to stimulate adenylyl cyclase activity. This was likely due to a relief of inhibitory interactions between the TPR-HTH and the AC domains on DNA-binding. Given the high sequence similarity between the HTH domains of the STAND ACs, they were expected to bind to DNA in an identical manner. Indeed, the HTH domains of Rv0386 and Rv1358 engaged with DNA with the same affinity as Rv0890c. Sequence comparisons in the HTH domain enabled identification of conserved basic residues, of which one, R850, was essential for nucleic acid-binding. Surprisingly, however, Rv0386 and Rv1358 did not exhibit RNA-binding, pointing towards functional divergence of Rv0890c from its paralogues. Since the HTH domains of the STAND ACs were highly conserved, it was possible that the ability to bind to RNA was instead dictated by the adjacent TPR modules. To examine this possibility, TPR domains were swapped between Rv0890c and Rv0386. Interestingly, both chimeric proteins showed a reduced ability to bind to DNA, while showing a complete absence of RNA-binding. These results suggested that the TPR domains were critical in modulating nucleic acid-binding. Moreover, the effect of the TPR domain was context-dependent, since the presence of non-cognate TPR domains hampered nucleic acid-binding. 
However, the ability to bind to RNA was not solely governed by the TPR domain, since the Rv0890cTPR-Rv0386HTH chimeric protein did not show RNA-binding in spite of containing a permissive TPR domain. To further dissect the molecular requirements for RNA-binding, the conservation of basic residues between the HTH domains of Rv0890c versus Rv1358 and Rv0386 was examined. Interestingly, the HTH domain of Rv0890c contained two additional positively charged residues compared with Rv1358 and Rv0386. Mutation of these residues abolished RNA-binding by Rv0890c. Thus, the evolution of two basic residues permitted Rv0890c to diverge in its nucleic acid-binding properties, a possible example of neofunctionalisation following gene duplication. In summary, this thesis attempts to understand the evolution and functions of the STAND ACs, a group of pathogenically relevant and uniquely mycobacterial multidomain proteins. Phylogenetic analysis revealed an expansion of this gene family in slow-growing mycobacteria. Biochemical characterisation showed that following gene duplication, the resulting proteins diverged both in their ability to synthesise cAMP and in their association with nucleic acids. Studies on these proteins also revealed novel mechanisms of regulation of mycobacterial cAMP levels. Additionally, these proteins exhibited indiscriminate binding to DNA and other nucleic acids, indicating that they may be responsible for global functions in the cell which extend beyond cAMP synthesis.