Exploring Protein-Nucleic Acid Interactions Using Graph And Network Approaches

Sathyapriya, R

Date

2009-09-22

Abstract

The flow of genetic information from genes to proteins is mediated by proteins that interact with nucleic acids at several stages to successfully transmit the information from the nucleus to the cell cytoplasm. Unlike protein-protein interactions, the principles behind protein-nucleic acid interactions are still not very well understood (Pabo and Nekludova, 2000), and efforts are still underway to arrive at the basic principles behind the specific recognition of nucleic acids by proteins (Prabakaran et al., 2006). This is mainly due to the innate complexity involved in the recognition of nucleotides by proteins: even within a given family of DNA binding proteins, different modes of binding and recognition strategies are employed to suit their function (Luscombe et al., 2000). Such difficulties have also prevented a thorough classification of DNA/RNA binding proteins based on the mode of interaction as well as the specificity of recognition of the nucleotides. The availability of a large number of structures of protein-nucleic acid complexes (albeit fewer than the number of protein structures present in the PDB) in the past few decades has provided the knowledge base for understanding the details behind their molecular mechanisms (Berman et al., 1992). Previous studies have characterized these interactions by analyzing specific non-covalent interactions such as hydrogen bonds, van der Waals interactions, and hydrophobic interactions between a given amino acid and the nucleic acid (DNA, RNA) in a pair-wise manner, or through the analysis of interface areas of the protein-nucleic acid complexes (Nadassy et al., 1998; Jones et al., 1999). Though these studies have deciphered the common pairing preferences of a particular amino acid with a given nucleotide of DNA or RNA, they leave little room for understanding these specificities in the context of spatial interactions at a global level in the protein-nucleic acid complexes.
Representing the amino acids and the nucleotides as components of graphs, and exploring the nature of the interactions at a level higher than individual pair-wise interactions, could provide greater detail about the nature of these interactions and their specificity. This thesis reports the study of protein-nucleic acid interactions using graph and network based approaches. The parameters for characterizing protein-nucleic acid graphs have been evaluated for the first time, and these parameters have been successfully employed to capture biologically important non-covalent interactions as clusters of interacting amino acids and nucleotides from different protein-DNA and protein-RNA complexes. Graph and network based approaches are well established in the field of protein structure analysis for analyzing protein structure, stability and function (Kannan and Vishveshwara, 1999; Brinda and Vishveshwara, 2005). However, the use of graph and network principles for analyzing structures of protein-nucleic acid complexes had not been accomplished so far and is reported for the first time in this thesis. The matter embodied in the thesis is presented in ten chapters. Chapter 1 lays the foundation for the study, surveying relevant literature from the field. Chapter 2 describes in detail the methods used in constructing graphs and networks from protein-nucleic acid complexes. Initially, only protein structure graphs and networks were constructed from proteins known to interact with specific DNA or RNA, and inferences with regard to nucleic acid binding and recognition were obtained indirectly. Subsequently, parameters were evaluated for representing both the interacting amino acids and the nucleotides as components of graphs, and a direct evaluation of protein-DNA and protein-RNA interactions as graphs was carried out.
Chapters 3 and 4 discuss the graph and network approaches applied to proteins from a dataset of DNA binding proteins complexed with DNA. In Chapter 3, the protein structure graphs were constructed on the basis of the non-covalent interactions existing between the side chains of amino acids. Clusters of interacting side chains from the graphs were obtained using the graph spectral method. The clusters from the protein-DNA interface were analyzed in detail for their interaction geometry and biological importance (Sathyapriya and Vishveshwara, 2004). Chapter 4 uses the same dataset of DNA binding proteins but presents a network-based approach. The analysis of the protein structure networks of these DNA binding proteins yielded interesting observations relating the presence of highly connected nodes (or hubs) of the network to functionally important amino acids in the structure. A comparison between the hubs identified from the protein-protein and the protein-DNA interfaces, in terms of their amino acid composition and their connectivity, is also presented (Sathyapriya and Vishveshwara, 2006). Chapters 5 and 6 deal with the graph and network applications to a specific system of protein-RNA complexes (aminoacyl-tRNA synthetases) to gain insights into their interface biology based on amino acid connectivity. Chapter 5 deals with a dataset of aminoacyl-tRNA synthetase (aaRS) complexes obtained with various ligands like ATP, tRNA and L-amino acids. A graph based identification of side chain clusters from these ligand-bound aaRS structures has highlighted important features of ligand binding at the catalytic sites of the two structurally different classes of aaRS (Class I and Class II). Side chain clusters from other regions of aaRS, such as the anticodon binding region and the ligand-activation sites, are also discussed.
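As a rough illustration of how a spectral approach can separate clusters of interacting side chains, the sketch below bisects a toy interaction graph using the Laplacian eigenvector of the second-smallest eigenvalue (the Fiedler vector), approximated by shifted power iteration. This is a generic textbook technique and not necessarily the exact procedure used in the thesis; the residue labels, the contacts, and the function names are all hypothetical, and the graph is assumed to be connected.

```python
# Toy side-chain interaction graph: two tightly connected triplets of
# residues joined by a single contact. All labels and contacts are invented.
RESIDUES = ["ARG10", "GLU14", "LYS18", "TYR40", "PHE44", "TRP48"]
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]


def adjacency_matrix(n, edges):
    """Build a symmetric, unweighted adjacency matrix from an edge list."""
    adj = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    return adj


def fiedler_vector(adj, iterations=300):
    """Approximate the Laplacian eigenvector of the second-smallest eigenvalue.

    Power iteration is run on (shift*I - L), with L = D - A, while the
    constant vector (eigenvalue 0 of L) is projected out at every step.
    Assumes a connected graph with more than one node.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    shift = 2 * max(degree)  # upper bound on the largest Laplacian eigenvalue
    # Deterministic start vector, centred so it is orthogonal to the constant vector.
    x = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(iterations):
        y = [(shift - degree[i]) * x[i] + sum(adj[i][j] * x[j] for j in range(n))
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def spectral_bisect(adj):
    """Split the nodes into two clusters by the sign of the Fiedler vector."""
    vec = fiedler_vector(adj)
    left = {i for i, v in enumerate(vec) if v < 0}
    right = set(range(len(adj))) - left
    return left, right


ADJ = adjacency_matrix(len(RESIDUES), EDGES)
LEFT, RIGHT = spectral_bisect(ADJ)
for cluster in (LEFT, RIGHT):
    print(sorted(RESIDUES[i] for i in cluster))
```

In this toy graph the two triplets end up in separate clusters because only one contact links them. A real analysis would derive the edges from side-chain contacts computed on a structure file rather than from a hand-written list.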
In Chapter 6, a network approach is applied to a specific aaRS system, E. coli glutaminyl-tRNA synthetase (GlnRS) complexed with its ligands, to understand the effects of binding different ligands. The structure networks of E. coli GlnRS in the ligand-free and different ligand-bound states are constructed. The ligand-free and the ligand-bound complexes are compared by analyzing their network properties and the presence of hubs to understand the effect of ligand binding. These properties have elegantly captured the effects of ligand binding on the GlnRS structure and have also provided an alternate method for comparing three-dimensional structures of proteins in different ligand-bound states (Sathyapriya and Vishveshwara, 2007). In contrast to protein structure graphs (PSG), both the interacting amino acids and the nucleotides (DNA/RNA) form the components of the protein-nucleic acid graphs (PNG) constructed from protein-nucleic acid complexes. These graphs are constructed based on the non-covalent interactions existing between the side chains of the amino acids and the nucleotides. After representing the interacting nucleotides and amino acids as graphs, clusters of the interacting components are identified. These clusters represent the strongly interacting amino acids and nucleotides of the protein-nucleic acid complexes. They can be generated at different strengths of interaction between the amino acid side chain and the nucleotide (measured in terms of atomic connectivity) and can be used for detecting clusters of non-specific as well as specific interactions of amino acids and nucleotides. Though the methodology of graph construction and cluster identification is given in Chapter 2, the details of the parameters evaluated for constructing PNG are given in Chapter 7. Unlike the previous chapters, the succeeding chapters deal exclusively with results obtained from the analyses of PNG.
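The idea of generating clusters at different interaction strengths can be sketched as follows: keep only the amino acid-nucleotide edges whose strength meets a cutoff, then read off the connected components. This is a minimal sketch under stated assumptions; the residue and nucleotide labels, the strength values and their units, and the function name `clusters_at` are invented for illustration and are not taken from the thesis.

```python
from collections import defaultdict

# Hypothetical interaction strengths (arbitrary units) between amino acid
# side chains and nucleotides; labels and values are invented.
INTERACTIONS = [
    ("ARG45", "DG7", 8.0),
    ("LYS12", "DG7", 5.5),
    ("ARG45", "DC8", 3.0),
    ("SER30", "DA15", 6.5),
    ("THR31", "DA15", 2.0),
]


def clusters_at(interactions, min_strength):
    """Return connected components of the graph keeping edges >= min_strength."""
    neighbours = defaultdict(set)
    for a, b, strength in interactions:
        if strength >= min_strength:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen, clusters = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


for cutoff in (2.0, 5.0, 7.0):
    print(cutoff, [sorted(c) for c in clusters_at(INTERACTIONS, cutoff)])
```

Raising the cutoff prunes weaker contacts, so large clusters break up into smaller, more strongly connected ones. That is one way to separate loosely associated components from tightly interacting ones.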
Two examples of obtaining clusters from a PNG are given, one each for a protein-DNA and a protein-RNA complex. In the first example, a nucleosome core particle is subjected to the graph based analysis, and different clusters of amino acids with different regions of the DNA chain, such as the phosphate, the deoxyribose sugar and the base, are identified. In the second example, an aminoacyl-tRNA synthetase complexed with its cognate tRNA is used to illustrate the method with a protein-RNA complex. Further, the method of constructing and analyzing protein-nucleic acid graphs has been applied to the macromolecular machinery of the pre-translocation complex of the T. thermophilus 70S ribosome. Chapter 8 deals exclusively with the results obtained from the analysis of this magnificent macromolecular ensemble. The availability of a method that can handle interactions between both the amino acids and the nucleotides of protein-nucleic acid complexes has given us the basis for evaluating these interactions at a level higher than that of analyzing pair-wise interactions. A study on the evaluation of short hydrogen bonds (SHB) in proteins, which does not fall within the main objective of the thesis, is discussed in Chapter 9. The short hydrogen bonds, defined by geometrical distance and angle parameters, are identified from a non-redundant dataset of proteins. Insights into their occurrence, amino acid composition and secondary structural preferences are discussed. The SHB are present in distinct regions of protein three-dimensional structures, such that they mediate specific geometrical constraints that are necessary for the stability of the structure (Sathyapriya and Vishveshwara, 2005). The significant conclusions of the various studies carried out are summarized in the last chapter (Chapter 10). In conclusion, this thesis reports the analyses performed on protein-nucleic acid complexes using graph and network based methods.
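To make the geometric definition of a short hydrogen bond concrete, the sketch below tests a donor-hydrogen-acceptor triplet against a distance cutoff and an angle cutoff. The abstract does not state the cutoffs used in the thesis, so the values here (donor-acceptor distance of at most 2.7 Å and a donor-hydrogen-acceptor angle of at least 150°) are assumptions chosen only for illustration, as are the function names and coordinates.

```python
import math


def angle_deg(a, b, c):
    """Angle in degrees at vertex b formed by the points a-b-c."""
    ba = [x - y for x, y in zip(a, b)]
    bc = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    cos = dot / (math.hypot(*ba) * math.hypot(*bc))
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def is_short_hbond(donor, hydrogen, acceptor, max_da=2.7, min_angle=150.0):
    """Check a D-H...A triplet against assumed distance and angle cutoffs.

    max_da    -- assumed maximum donor-acceptor distance (angstroms)
    min_angle -- assumed minimum D-H...A angle (degrees)
    """
    close_enough = math.dist(donor, acceptor) <= max_da
    linear_enough = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and linear_enough


# Invented coordinates (angstroms) for a nearly ideal, linear arrangement.
DONOR = (0.0, 0.0, 0.0)
HYDROGEN = (1.0, 0.0, 0.0)
ACCEPTOR = (2.6, 0.0, 0.0)
print(is_short_hbond(DONOR, HYDROGEN, ACCEPTOR))
```

Both criteria must hold: a close donor-acceptor pair with a strongly bent geometry is rejected, as is a linear arrangement whose heavy atoms are too far apart.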
The parameters necessary for representing both the amino acids and the nucleotides as components of a graph are evaluated for the first time and can be used subsequently for other analyses. More importantly, the use of graph-based methods has made it possible to consider the interactions between the amino acids and the nucleotides at a global level, with respect to the topology of the protein-nucleic acid complexes. Such studies performed on a wide variety of protein-nucleic acid complexes could provide more insights into the details of protein-nucleic acid recognition mechanisms. The results of these studies can be used for the rational design of experimental mutations that ascertain the structure-function relationships in proteins and protein-nucleic acid complexes.

URI

https://etd.iisc.ac.in/handle/2005/624

Collections

Molecular Biophysics Unit (MBU) [333]