NMR Solution Structures of Human γC-Crystallin & the Intrinsically Disordered Viral Genome Linked Protein in the Free & Bound Form
This thesis describes the tertiary structures and dynamics of two protein systems. The first is human γC-crystallin, which is present in the nucleus of the human eye lens. The second is the plant viral protein VPg, an intrinsically disordered protein, in its free and protease-bound forms. The structural studies described here were carried out using high-resolution solution NMR spectroscopy.
Project I: Determination of solution structure and dynamics of Human γC-crystallin (HGC) using NMR spectroscopy
The crystallins are the most abundant proteins in the eye lens of vertebrates. These proteins are packed with short-range spatial order to provide the transparency and the refractive index gradient required for vision. The crystallins belong to two gene families, the α-crystallins and the β/γ-crystallins. Classification by molecular size and structure gives three classes: α-, β- and γ-crystallins. Each class in turn has two or more subtypes. The stoichiometry of the subtypes of α-, β- and γ-crystallins varies with the age of the organism, but the order of abundance remains β > α > γ irrespective of age. The most abundant crystallins in the nucleus (central region) of the eye lens are the γ-crystallins. In the human lens, only three members of the γ-crystallin family are expressed at significant levels: γS- (HGS), γC- (HGC) and γD-crystallin (HGD). HGS is expressed postnatally and is therefore present mainly in the cortical region of the lens, unlike HGC and HGD, which are present in the nucleus. Aging and some cataract-associated genetic mutations are known to alter the structure of these proteins. Other point mutations cause minimal structural perturbation but drastically lower the solubility.
Mutations in human γC-crystallin lead to congenital cataracts such as Coppock-like cataract. Structural information is available for HGD and HGS, but no structure has been available for HGC. Recently, a model structure of HGC based on a mouse ortholog was reported. On the basis of this model, it was argued that HGC is an insoluble protein, and this was attributed to the lower magnitude of the dipole moment and to fluctuations in the N-terminal domain of the model structure. However, HGC has been shown to be a very soluble protein. The solution structure of human γC-crystallin has been determined from multidimensional triple-resonance NMR spectra, using distance restraints from unambiguously assigned 1H-1H NOE peaks and dihedral angle restraints from HNHA and HNHB spectra. Backbone 15N relaxation measurements give average T1 and T2 values of 0.729 ± 0.02 s and 0.060 ± 0.04 s, respectively, which correspond to an average rotational correlation time of 10.87 ns. This value shows that human γC-crystallin (173 residues, molecular weight 21 kDa) is a monomer in solution. The ensemble of the 20 lowest-energy structures shows a root mean square deviation (RMSD) of 0.60 ± 0.12 Å for the backbone atoms and 1.03 ± 0.09 Å for all heavy atoms. Comparison of the NMR structure determined here with the X-ray crystal structure of mouse γC-crystallin, over the backbone atoms C′, Cα and NH, shows that the two structures are very similar, with an RMSD of 1.3 Å. This is not surprising given the 84.5% amino acid sequence identity between the two proteins. More importantly, the NMR structure reveals subtle differences between the human and mouse orthologs in the orientation of specific residues and in the domain interface. The orientation of the dipole moment calculated for this NMR structure differs from that reported earlier for the model structure, but it is similar to that of other known soluble proteins.
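The relation between the average relaxation times and the rotational correlation time quoted above can be illustrated with a minimal sketch. It uses the widely quoted slow-tumbling approximation τc ≈ √(6·T1/T2 − 7) / (4π·νN), which is not necessarily the analysis used in the thesis. The 15N Larmor frequency of 60.8 MHz (a 600 MHz 1H spectrometer) is an assumption, since the field strength is not stated in this abstract, and the function name is hypothetical.

```python
import math


def tau_c_from_t1_t2(t1_s, t2_s, nu_n_hz):
    """Estimate the rotational correlation time (in seconds).

    Slow-tumbling approximation for backbone 15N relaxation:
        tau_c ~ sqrt(6 * T1 / T2 - 7) / (4 * pi * nu_N)
    where nu_N is the 15N resonance frequency in Hz.
    """
    ratio = t1_s / t2_s
    return math.sqrt(6.0 * ratio - 7.0) / (4.0 * math.pi * nu_n_hz)


# Average relaxation times quoted in the text (seconds).
T1 = 0.729
T2 = 0.060

# ASSUMPTION: 15N frequency for a 600 MHz (1H) spectrometer.
# The actual field strength is not given in this abstract.
NU_N = 60.8e6

tau_c_ns = tau_c_from_t1_t2(T1, T2, NU_N) * 1e9
print(f"Estimated tau_c: {tau_c_ns:.1f} ns")
```

Under the assumed field strength the estimate comes out between 10 and 11 ns, the same range as the 10.87 ns reported. The exact figure depends on the spectrometer frequency and on how the per-residue values were averaged.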
The solution structure of human γC-crystallin determined here also makes it possible to estimate the effect of cataract-associated mutations on the structure and properties of the protein. Several such mutations are already known, and the work presented here may shed light on the molecular basis of these cataracts.
Project II: Solution structural studies of intrinsically disordered protein VPg in free and bound forms from Sesbania mosaic virus
Sesbania mosaic virus (SeMV) is a plant virus that infects the Sesbania grandiflora tree. SeMV belongs to the genus Sobemovirus, which is not assigned to any family. The viral genome is ~4 kb long and has four open reading frames (ORFs). ORF1 and ORF3 encode the movement and coat proteins, respectively. ORF2 is split into two ORFs, ORF2a and ORF2b, by a -1 shift in the reading frame, and encodes two polypeptide chains. These polypeptide chains generate several functional proteins upon polyprotein processing, a mechanism employed by animal and plant viruses to produce several functional proteins from a single polypeptide chain. The two polyproteins expressed are catalytically cleaved by a serine protease, releasing four proteins: VPg (viral protein genome-linked), RdRP (RNA-dependent RNA polymerase), P10 and P8. As its name suggests, VPg is covalently linked to the 5′ end of the viral RNA. VPgs are generally known to be intrinsically disordered proteins and have many interacting partners. Intrinsically disordered proteins (IDPs) are not explained by the 3D structure-function dogma. However, they are important for biological functions such as molecular recognition, signal transduction and regulation. SeMV protease is known to become inactive in the absence of the VPg domain at its C-terminus. VPgs of animal viruses are better studied than VPgs of plant viruses. The size of VPg varies across the Sobemovirus genus.
Because VPg is necessary for protease activity, it is important to know its structure. The studies conducted here focus on the structural analysis of VPg in its free form and in its protease-bound form (VPg complex), as well as on some aspects of full-length ProVPg. For structural studies, two constructs of VPg as fusion proteins with a Cytb5 tag, one of them lacking 23 residues at its C-terminus, were designed using the pET21a(+) plasmid vector. Sub-cloning was also carried out to add a thrombin recognition site, so that the hexa-His tag could be removed from new constructs of full-length ProVPg and protease (PRO). These proteins were expressed at high levels, isotopically labeled and purified for NMR study. The sample used for structural studies of the ProVPg 23 complex was prepared using isotopically labeled (2H, 13C and 15N) VPg 23 protein with selectively protonated Ile, Leu and Val. VPg in its free form is an intrinsically disordered protein, and this has been confirmed by its dynamic nature observed using solution NMR spectroscopy. It is shown here that VPg adopts a three-dimensional structure on binding to its partner protease. The tertiary structure has been determined using distance restraints from 1HN-1HN NOEs and methyl-1HN NOEs, and dihedral angle restraints predicted from chemical shift values. The tertiary structure of the ProVPg 23 complex consists of an α-helix and one β-sheet composed of three antiparallel β-strands. The ensemble of the 20 lowest-energy structures shows a root mean square deviation (RMSD) of 0.42 ± 0.09 Å for the backbone atoms and 1.09 ± 0.11 Å for all heavy atoms over residues 15 to 50, which are primarily involved in structure formation. Over all residues, including both termini, the RMSD is 2.34 ± 0.72 Å for the backbone and 2.55 ± 0.60 Å for all heavy atoms. It is also shown here that the tertiary fold of VPg is the same in full-length ProVPg and in the complex with the protease domain (PRO).
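The ensemble RMSD values quoted above, and the difference between the structured core and the full chain, can be illustrated with a minimal sketch. It assumes the models have already been superposed (the fitting step is omitted) and computes each model's RMSD to the mean coordinates over a chosen residue range. All function names and the toy coordinates are hypothetical and are not taken from the thesis.

```python
import math
from statistics import mean, stdev


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def mean_structure(models):
    """Average the coordinates atom by atom across all models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(m[i][k] for m in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def ensemble_rmsd(models, residues, first, last):
    """Mean and SD of each model's RMSD to the mean structure,
    restricted to atoms whose residue number lies in first..last.
    ASSUMPTION: the models are already superposed."""
    keep = [i for i, res in enumerate(residues) if first <= res <= last]
    subset = [[model[i] for i in keep] for model in models]
    reference = mean_structure(subset)
    values = [rmsd(model, reference) for model in subset]
    return mean(values), stdev(values)


# Toy ensemble (hypothetical): one atom per residue, three models.
residues = [1, 15, 50, 60]
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.0, 0.0), (3.0, -1.0, 0.0)],
    [(0.0, -1.0, 0.0), (1.0, -0.1, 0.0), (2.0, 0.0, 0.0), (3.0, 1.0, 0.0)],
]

core_mean, core_sd = ensemble_rmsd(models, residues, 15, 50)
full_mean, full_sd = ensemble_rmsd(models, residues, 1, 60)
print(f"core: {core_mean:.2f} +/- {core_sd:.2f}")
print(f"full: {full_mean:.2f} +/- {full_sd:.2f}")
```

In the toy data the terminal atoms vary far more than the core atoms, so the full-chain RMSD is larger than the core RMSD. The same pattern appears in the reported values, which rise from 0.42 Å over residues 15 to 50 to 2.34 Å over all residues.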
The NMR structure reported here provides a structural basis for the origin of the resonances in the up-field region of the one-dimensional proton spectrum of full-length ProVPg. The binding surface has been determined using HADDOCK, based on the structure of the ProVPg 23 complex determined here and the X-ray structure of PRO. The resulting structural model of full-length ProVPg 23 shows an aromatic interaction between Trp271 of PRO and Trp46 of VPg, which is consistent with earlier biochemical studies.