Development and application of sequence-based approaches for recognition and functional characterization of protein kinases
Abstract
Protein kinases, the third most populous protein family, are a major class of enzymes that regulate a wide range of cellular processes by phosphorylating multiple cellular proteins. The catalytic domain contains many conserved sequence motifs that are essential for the regulation of these proteins, and the catalytic core is shared across all typical protein kinases.
In this thesis, we present the identification and comprehensive analysis of Ser/Thr or Tyr (STY) protein kinases encoded in many pathogenic organisms. The role of kinases in pathogenicity has been elucidated in many previous studies. With the emergence of diverse pathogenic species in the last few decades, the protein kinases of pathogens and of their respective hosts might have evolved to adapt to hostile environments and to tolerate stress. Hence, it is also essential to compare the kinases of pathogens with those encoded in their host species.
Protein kinases can be identified using sequence-based approaches. However, some kinase subfamilies have diverged considerably, and sensitive homology detection approaches are required for their identification. Therefore, we developed a new protocol named "Master Blaster" for the detection of distantly related proteins. The performance of Master Blaster was evaluated by comparison with widely used profile-based and hidden Markov model-based methods, and an improvement in fold coverage using Master Blaster is reported. We also used artificially designed sequences with Master Blaster to detect distantly related proteins. The designed sequences were found to be useful in connecting protein families that are highly diverged at the sequence level.
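The general idea of connecting diverged families through intermediate sequences can be illustrated with a small sketch. This is not the Master Blaster protocol itself, whose details are not given here; it is a toy example that assumes homology search results are already available as a table of hits. The function name `connected_via_intermediates`, the sequence identifiers, and the hit table are all hypothetical.

```python
from collections import deque

def connected_via_intermediates(hits, query, target):
    """Return a chain of sequence ids linking query to target, or None.

    hits maps a sequence id to the set of ids it detects as homologs
    (hypothetical, precomputed search results).
    """
    seen = {query}
    queue = deque([[query]])
    # Breadth-first search, so the shortest chain of intermediates is found first.
    while queue:
        path = queue.popleft()
        for nxt in sorted(hits.get(path[-1], ())):
            if nxt in seen:
                continue
            if nxt == target:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

# Hypothetical toy data: the two family members share no direct hit,
# but both are detected by an artificially designed sequence.
toy_hits = {
    "famA_seq": {"design1"},
    "design1": {"famA_seq", "famB_seq"},
    "famB_seq": {"design1"},
}
chain = connected_via_intermediates(toy_hits, "famA_seq", "famB_seq")
print(chain)
```

In this toy case the designed sequence acts as the bridge between the two families; with real search output, the hit table would be built from whatever significance threshold the search method uses.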
We applied the developed protocol to identify protein kinases encoded in Candida albicans and performed an extensive analysis of these kinases using sequence-based methods. We carried out a comparative study of kinases from C. albicans, pathogenic and non-pathogenic non-albicans Candida species, baker's yeast and human. This study identified organism-specific kinases and signature motifs in the kinases of pathogenic Candida species. The domain architectures of kinases were also found to be organism-specific. Some of the protein kinases from industrially relevant Candida species such as Pseudozyma antarctica were found to be similar to those encoded in human.
Kinases within each subfamily can recognize and phosphorylate multiple substrates, and the substrates that are recruited can be specific to a particular kinase subfamily, depending on its functional role. Phosphorylation of specific substrates is possible because of the conserved substrate-binding residues within each kinase subfamily. Most kinase classification schemes depend on conservation within the catalytic domain region and do not take into account conservation within the substrate-binding regions. This could lead to misclassification of some kinases. Hence, we proposed a new approach to classify kinases based on the conservation of their substrate-binding residues. Using the proposed scheme, we identified signature regions within some of the kinase subfamilies, and these regions can be used for the re-classification of protein kinases.
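As a rough illustration of what scoring conservation at substrate-binding positions could look like, the sketch below computes an entropy-based conservation score for selected columns of a multiple sequence alignment. It is a minimal example under stated assumptions, not the classification scheme proposed in the thesis: the aligned fragments, the column indices treated as substrate-binding positions, and the function names are all hypothetical.

```python
import math
from collections import Counter

def column_conservation(column):
    """Return 1 - normalized Shannon entropy of one alignment column.

    Gaps ('-') are ignored; 1.0 means fully conserved, and the entropy is
    normalized by log2(20) for the 20 standard amino acids.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)

def binding_site_signature(aligned_seqs, positions):
    """Conservation score at each selected alignment column (0-based)."""
    return {p: column_conservation([s[p] for s in aligned_seqs]) for p in positions}

# Hypothetical aligned fragments from one subfamily, and hypothetical
# columns assumed to correspond to substrate-binding residues.
subfamily = ["GTPEYLAPE", "GTPEYMAPE", "GTPEFLAPE"]
scores = binding_site_signature(subfamily, positions=[0, 4, 5])
print(scores)
```

Columns that score high within one subfamily but differ between subfamilies would be candidate signature positions; how such positions are actually selected and thresholded is left open here.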
In this thesis, we also identified kinases encoded in viral genomes using rigorous sequence-based methods and reported the sequence similarities and differences between kinases from various viral genomes. Kinases were detected in viruses with double-stranded DNA genomes and in some subfamilies of retroviruses. Kinases could not be detected in any of the host infecting viruses. The putative protein kinases from giant viruses were found to be similar to the retroviral oncogenic kinases. The substrate-binding regions within some of the viral kinases were identified using the new classification scheme.
This work can be extended by applying the developed protocols to study protein kinases encoded in other organisms.