
dc.contributor.advisor | Srinivasan, N
dc.contributor.author | Kumar, Gayatri
dc.date.accessioned | 2021-03-12T06:46:51Z
dc.date.available | 2021-03-12T06:46:51Z
dc.date.submitted | 2019
dc.identifier.uri | https://etd.iisc.ac.in/handle/2005/4968
dc.description.abstract | The advent of high-fidelity protein sequencing techniques has led to a considerable wealth of sequence data. However, the number of proteins with available information on 3-D structure and functional features is considerably lower. In spite of improvements in structural and functional genomics initiatives, most experimental procedures in use are time-consuming. This has led to a formidable gap between the sequence and structure space, which continues to increase. The structural coverage of the proteome of most organisms is incomplete, which limits the information available on function and the implied biological roles. Computational approaches could provide preliminary ideas on the structure and function of proteins. Protein structures are far more conserved than sequences as a consequence of the evolutionary pressure to maintain the structure and thereby its function. Therefore, recognition of evolutionary relationships among proteins could serve as an important step towards inferences on shared structural and functional features between related proteins. Detailed comparative analysis of evolutionarily related proteins could provide clues to protein structure and consequently its function. However, a notorious problem is detecting relationships between proteins characterized by low sequence similarity (less than about 20%), because unrelated proteins also share poor sequence similarity. The detection of relatedness between sequentially distant proteins serves as a nodal point in structure and function recognition. Hence, most sequence search algorithms rely on deriving these non-trivial relationships between distant homologues to further functional annotation. It has been observed that the limitation in identifying distant relatives is due to the sparseness of the protein sequence space: if sequences intermediately related to the two proteins (or two protein families) are unavailable, then recognizing such relationships purely from sequence data becomes challenging. The paucity of natural intermediate sequences to direct profile or sequence search methods undermines even rigorous and powerful search algorithms.
In a protocol developed earlier in the group, protein-like sequences, referred to as offspring, were computationally designed using the sequence profiles of domain family pairs, referred to as parents, which are known to be distantly related. It has been shown that these sequences served as stepping stones for search methods to link distant relatives. Plugging these intermediately related sequences into the database of natural protein sequences addressed the challenges posed by the void and sparse regions of the protein sequence space. Use of designed sequences showed a marked improvement in structural fold coverage and augmented the ability of search protocols. Therefore, use of designed sequences in homology detection could enable recognition of the structure and function of proteins not known so far.
The questions raised in this thesis start with exploring the foldability of the designed sequences into the parent structural fold. Having seen that these designed proteins are likely to adopt the structural fold of the parent families, they were employed in recognizing the structure of protein families for which no structural information is yet available. Further, an improvement to the approach was put forth to make homology-driven searches faster and more sensitive by representing the sequences, both natural and designed, as hidden Markov models. The use of intermediately related artificial sequences in probing functional relationships between protein families was explored. The associations made through designed sequences were examined for biological relevance by exploring the conservation of putative functional residues. To strengthen the ability of the designed intermediates in homology detection, the protein sequence space around protein families was artificially expanded. | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartofseries | ;G29791
dc.rights | I grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation | en_US
dc.subject | Protein structures | en_US
dc.subject | Markov models | en_US
dc.subject.classification | Research Subject Categories::NATURAL SCIENCES::Biology::Other biology | en_US
dc.title | Use of strategically designed protein-like sequences in structure and function recognition | en_US
dc.type | Thesis | en_US
dc.degree.name | PhD | en_US
dc.degree.level | Doctoral | en_US
dc.degree.grantor | Indian Institute of Science | en_US
dc.degree.discipline | Faculty of Science | en_US

