dc.description.abstract | The unique native structure is a basic requirement for the normal functioning of most proteins. Many diseases stem from mutations that destabilize protein structure, thereby resulting in impairment or loss of function (Sunyaev et al. 2000). Therefore, it is important, from both fundamental and applied points of view, to elucidate the sequence determinants of protein structure and function. With the advent of recombinant DNA techniques for modifying protein sequences, studies on the effect of amino acid replacements on protein structure and function have acquired momentum. It is well established from previous mutagenesis studies that buried residues in a protein are important determinants of protein structure or stability, while surface residues are involved in protein function (Rennell et al. 1991; Terwilliger et al. 1994; Axe et al. 1998). In spite of this, there is no universally accepted definition of, or probe for, buried as opposed to exposed residues. A part of this thesis aims to examine the feasibility of using scanning mutagenesis to distinguish between buried and exposed positions in the absence of a three-dimensional structure, and also to arrive at an experimental definition of the appropriate accessibility cut-off to distinguish between buried and exposed residues. Proline, being an unusual amino acid, is often exploited to determine sites in a protein that are important for protein stability (Sauer et al. 1992). This thesis also explores the use of proline scanning mutagenesis to make inferences about protein structure and stability.
Temperature sensitive mutant proteins, which result from single amino acid substitutions, are particularly useful in elucidating the determinants of protein folding and stability (Grutter et al. 1987; Sturtevant et al. 1989). Temperature sensitive (ts) mutants are an important class of conditional mutants that are widely used to study gene function in vivo and in cell culture (Novick and Schekman 1979; Novick and Botstein 1985). They display a marked drop in the level or activity of the gene product when the gene is expressed above a certain temperature (the restrictive temperature). Below this temperature (the permissive temperature), the level or activity of the mutant is very similar to that of the wild type. In spite of their widespread use, little is known about the molecular mechanisms responsible for generating a ts phenotype. A part of this thesis discusses a set of sequence/structure-based strategies for the successful design and isolation of ts mutants of a globular protein, inferred from saturation mutagenesis of CcdB.
The experimental system, CcdB (Controller of Cell Division or Death B protein), is a 101 residue, homodimeric protein encoded by the F plasmid. The protein is an inhibitor of DNA gyrase and is a potent cytotoxin in E. coli (Bernard et al. 1993). Crystallographic structures of CcdB in the free and gyrase bound forms (Loris et al. 1999; Dao-Thi et al. 2005) are also available. Expression of functional CcdB protein results in cell death, thus providing a rapid and easy assay for the protein (Chakshusmathi et al. 2004).
This dissertation focuses on understanding the determinants of globular protein stability and temperature sensitivity using saturation mutagenesis of E. coli CcdB. Towards this objective, we attempted to replace each of the 101 residues of CcdB with each of the 19 other amino acids using high throughput mutagenesis tools. A total of 1430 single site mutants (~75% of all possible single site mutants in the CcdB saturation mutagenesis library) could be isolated. These mutants were characterized in terms of their activity at different expression levels. The correlation of the observed mutant phenotypes with residue burial, nature of substitution and expression level was examined.
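The coverage figure quoted above follows from simple counting. A minimal sketch of the arithmetic, using only the numbers given in the text (the variable names are illustrative):

```python
# Library-coverage arithmetic, using only numbers quoted in the abstract.
N_RESIDUES = 101        # length of CcdB in residues
N_SUBSTITUTIONS = 19    # other amino acids possible at each position
N_ISOLATED = 1430       # single site mutants that could be isolated

# Every position can carry each of the 19 alternative residues.
possible_mutants = N_RESIDUES * N_SUBSTITUTIONS
coverage = N_ISOLATED / possible_mutants

print(f"{possible_mutants} possible single site mutants")
print(f"coverage = {coverage:.1%}")
```

This gives 1919 possible single site mutants, of which the 1430 isolated mutants are roughly three quarters.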
The introductory chapter (Chapter 1) describes the use of mutagenesis as a tool to understand the relationship between protein sequence, structure and function. It presents an overview of previous large scale mutagenesis studies from the literature. It also sets out the motivation behind this work and the problems that we have attempted to address in these studies.
Chapter 2 discusses mutagenesis based definitions and probes for residue burial in proteins, as derived from alanine and charged scanning mutagenesis of CcdB. Every residue of the 101 amino acid E. coli toxin CcdB was substituted with Ala, Asp, Glu, Lys and Arg using site directed mutagenesis. The activity of each mutant in vivo was characterized as a function of CcdB transcriptional level. The mutation data suggest that an accessibility value of 5% is an appropriate cutoff for the definition of buried residues. At all buried positions, introduction of Asp results in an inactive phenotype at all CcdB transcriptional levels. The average amount of destabilization upon substitution at buried positions decreases in the order Asp>Glu>Lys>Arg>Ala. Asp substitutions at buried sites in two other proteins, MBP and thioredoxin, were also shown to be severely destabilizing. Ala and Asp scanning mutagenesis, in combination with dose dependent expression phenotypes, was shown to yield important information on protein structure and activity. These results also suggest that such scanning mutagenesis data can be used to rank order sequence alignments and their corresponding homology models, as well as to distinguish between correct and incorrect structural alignments.
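As a rough illustration of how such a cutoff would be applied, the sketch below labels positions as buried or exposed from their percent accessibility. Only the 5% threshold comes from the text. The function name, the choice to treat the boundary value itself as buried, and the example accessibility values are assumptions made for illustration, not data from the thesis.

```python
BURIED_CUTOFF_PERCENT = 5.0  # cutoff suggested by the mutation data


def classify_burial(accessibility_percent, cutoff=BURIED_CUTOFF_PERCENT):
    """Label a position 'buried' or 'exposed' from its percent accessibility.

    Treating a value exactly at the cutoff as buried is an assumption.
    """
    if accessibility_percent < 0:
        raise ValueError("accessibility cannot be negative")
    return "buried" if accessibility_percent <= cutoff else "exposed"


# Hypothetical accessibilities (percent) keyed by residue position;
# these are made-up values, not measurements for CcdB.
example_accessibility = {10: 0.0, 25: 4.2, 40: 5.0, 61: 37.5}
labels = {pos: classify_burial(acc) for pos, acc in example_accessibility.items()}
print(labels)
```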
When incorporated into a polypeptide chain, proline (Pro) differs from all other naturally occurring amino acids in two important respects. The backbone dihedral angle φ of Pro is constrained to values close to −65°, and Pro lacks an amide hydrogen. Chapter 3 describes a procedure to accurately predict the effects of proline introduction on protein stability. Of the 97 non-Pro amino acid residues in the model protein, CcdB, 77 were individually mutated to proline and the in vivo activity of each mutant was characterized. A decision tree to classify each mutation as perturbing or non-perturbing was created by correlating stereochemical properties of the mutants with activity data. The stereochemical properties, including main chain dihedral angles and main chain amide hydrogen bonds, were determined from 3D models of the mutant proteins built using MODELLER. The performance of the decision tree was assessed on 74 nsSNPs and 37 other proline substitutions from the literature. The overall accuracy of this algorithm was found to be 89% for CcdB, 71% for the nsSNPs and 83% for the other proline substitution data. Contrary to previous assertions, proline scanning mutagenesis cannot be reliably used to make secondary structural assignments in proteins. These studies will be useful in annotating uncharacterized nsSNPs of disease-associated proteins and for protein engineering and design.
Mutants of CcdB were also characterized in terms of their activity at two different temperatures (30 °C and 37 °C) to screen for temperature sensitive (ts) mutants. The isolation and structural analysis of ts mutants of CcdB is dealt with in Chapter 4. Of the total of 1430 single site mutants, 12% showed a ts phenotype and were mapped onto the crystal structure of the protein. Almost all the ts mutants could be interpreted in terms of the wild type, native structure. ts mutants were found at all buried sites and all active sites (except one). ts mutants were also obtained at sites in close proximity to active site residues, where polar side-chains were involved in H-bonding interactions with active site residues. Several proline substitutions also displayed a ts phenotype. The effect of expression level on the ts phenotype was also studied. Of the mutants that showed an inactive phenotype at the lowest expression level and an active phenotype at the highest expression level, 78% displayed a ts phenotype at an intermediate expression level. The molecular determinant responsible for the ts phenotype of buried site ts mutants is suggested to be thermodynamic destabilization of the protein, which results in a reduced steady state in vivo level of soluble, functional protein relative to wild type. The active site ts mutants probably lower the specific activity of the protein and hence the total activity relative to wild type. However, these effects might be less severe at the lower temperature. Specific structure/function based mutagenesis strategies are suggested for designing ts mutants of a protein. These studies will simplify the design of ts mutants for any globular protein and will have applications in diverse biological systems to study gene function in vivo.
Chapter 5 presents the structural and sequence correlations of a CcdB saturation mutagenesis library obtained by replacing each of the 101 amino acid residues with the 19 other amino acids. Polar substitutions, i.e. Asn, Gln, Ser, Thr and His, were poorly tolerated at buried sites at lower expression levels. Aromatic substitutions and Gly were also not well tolerated at buried positions at lower expression levels. Trp was poorly tolerated at residues with accessibility <15%. However, most of the surface exposed residues with accessibility >40% (except functional ones) could tolerate all kinds of substitutions.
Chapter 6 deals with the thermodynamic characterization of the monomeric and dimeric forms of CcdB. The stability and aggregation state of CcdB have been characterized as a function of pH and temperature. Size exclusion chromatography revealed that the protein is a dimer at pH 7.0, but a monomer at pH 4.0. CD analysis and fluorescence spectroscopy showed that the monomer is well folded and has a tertiary structure similar to that of the dimer. Hence intersubunit interactions are not required for folding of individual subunits. The oligomeric status of CcdB at pH 7.0, at physiologically relevant low concentrations of protein, was characterized by labeling the protein separately with the donor and acceptor dyes of two different fluorescent pairs (Acrylodan-Pyrene and IAF-IAEDANS), mixing the labeled proteins together and carrying out fluorescence resonance energy transfer (FRET) measurements. CcdB exists in a dimeric state even at nanomolar concentrations, indicating that the dimeric form is likely to be the physiologically active form of CcdB. The stability of the dimeric form at pH 7.0 and the monomeric form at pH 4.0 was characterized by isothermal denaturant unfolding and calorimetry. The free energies of unfolding were found to be 9.2 kcal/mol (1 cal=4.184 J) and 21 kcal/mol at 298 K for the monomer and dimer, respectively. The denaturant concentration at which one-half of the protein molecules are unfolded (Cm) is dependent on protein concentration for the dimer, whereas the Cm of the monomer is independent of protein concentration, as expected. Although thermal unfolding of the protein in aqueous solution is irreversible at neutral pH, it was found that thermal unfolding is reversible in the presence of GdnCl (guanidinium chloride). Differential scanning calorimetry in the presence of low concentrations of GdnCl, in combination with isothermal denaturation melts as a function of temperature, was used to derive the stability curve for the protein. The value of ΔCp (representing the change in excess heat capacity upon protein denaturation) is 2.8 ± 0.2 kcal mol⁻¹ K⁻¹ for unfolding of dimeric CcdB, and has only a weak dependence on denaturant concentration. These studies advance the understanding of the folding of oligomeric proteins.
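The concentration dependence of Cm for a dimer, and its absence for a monomer, can be made explicit with a short derivation. The sketch below assumes two-state unfolding (folded dimer to two unfolded monomers, with no populated intermediates) and a linear dependence of the unfolding free energy on denaturant concentration. The abstract does not state which model was fitted, so both are assumptions. Here P_t is the total protein concentration in monomer units (relative to a 1 M standard state), f_U the fraction unfolded, [D] the denaturant concentration and m the slope of the linear dependence.

```latex
% Two-state dimer unfolding, N_2 <=> 2U, with [U] = f_U P_t and [N_2] = (1 - f_U) P_t / 2
\begin{align}
K_U &= \frac{[U]^2}{[N_2]} = \frac{2 P_t f_U^{2}}{1 - f_U} \\
\Delta G_U([D]) &= -RT \ln K_U = \Delta G_U^{\mathrm{H_2O}} - m\,[D] \\
\text{at } f_U = \tfrac{1}{2}: \quad K_U &= P_t
  \quad\Rightarrow\quad
  C_m^{\mathrm{dimer}} = \frac{\Delta G_U^{\mathrm{H_2O}} + RT \ln P_t}{m} \\
% Two-state monomer unfolding, N <=> U, for comparison
\text{monomer: } K_U &= \frac{f_U}{1 - f_U} = 1 \text{ at } f_U = \tfrac{1}{2}
  \quad\Rightarrow\quad
  C_m^{\mathrm{monomer}} = \frac{\Delta G_U^{\mathrm{H_2O}}}{m}
\end{align}
```

Under these assumptions the dimer midpoint shifts with ln P_t, while the monomer midpoint contains no concentration term.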
The concluding section summarizes all the chapters in a nutshell and addresses the future directions provided by these investigations. | en_US |