Structural studies on Plasmodium falciparum triose phosphate isomerase and analysis of atomic displacement parameters in high resolution protein structures
Abstract
This thesis presents the investigations carried out by the candidate during the last 4 years on two rather independent but important aspects of macromolecular crystallography.
The first part of the thesis presents crystallographic investigations on triose phosphate isomerase from Plasmodium falciparum (PfTIM) complexed to several substrate analogues. These studies assume importance in light of increasing resistance of the protozoan parasite against commonly used drugs for combating malarial infection. The results illustrate several novel features of catalysis by PfTIM and suggest strategies that could be useful in the design of parasite-specific drugs.
The second part of the thesis describes statistical analyses on the atomic displacement parameters (ADPs or B-values) of high-resolution protein structures, carried out to obtain insights on different aspects of protein dynamics. This work, which has been published, was carried out two years ago. In the meantime, a large number of near atomic and atomic resolution structures have been deposited in the protein data bank, and the size of the data bank has increased substantially. This warrants re-examination and extension of the results presented here, although preliminary examination with a few atomic resolution structures confirms that the general conclusions presented in the thesis are valid. The thesis presents some new methods for B-value analysis and illustrates the type of results that could be obtained. Application of these methods for validation of reported B-values has also been explored.
The following paragraphs provide summaries of the work reported in the two parts of the thesis.
The first part of the thesis describes structural studies carried out on PfTIM in five Chapters. TIM catalyzes the isomerization of D-glyceraldehyde-3-phosphate (D-GAP) and dihydroxyacetone phosphate (DHAP). It is ubiquitous and a key enzyme in the glycolytic pathway. TIM has been the subject of extensive scientific work because of the simplicity of the reaction it mediates. Most studies have been performed with the aim of understanding the mechanism of its function. The deficiency of TIM is known to result in chronic anemia and neuromuscular impairment in humans. Protozoan parasites like Plasmodium depend heavily on glycolysis for their energy requirements, and hence, the enzymes of glycolytic pathways are attractive targets for the design of drugs for combating infections caused by these parasites. The three-dimensional structures of the enzyme from humans and trypanosomes have been studied with a view to designing structure-based drugs against sleeping sickness. With the emergence of strains of the malarial parasite resistant to chloroquine, there is renewed interest in the biochemistry of P. falciparum. Several protein engineering studies have also been performed on TIM since its fold represents one of the most favored protein folds. Chapter 1 provides a review of literature on TIM.
Structures of PfTIM complexed to two substrate analogues, 3-phosphoglycerate (3PG) and glycerol-3-phosphate (G3P), determined at 2.4Å resolution and comparison with the Trypanosomal counterparts, are presented in the 2nd Chapter. These structures were studied with a view to obtaining information on the interactions that might be important for the binding of the physiological substrates D-GAP and DHAP. The most dramatic feature of these structures is the observation of the "open" conformation of the loop, in contrast to the closed conformation observed in the corresponding complexes of enzymes from other sources. Loop closure is believed to be essential for modulating the environment of the active site such that the pKa of catalytic base Glu 165 and that of the ene-diol intermediate are tuned for efficient proton transfer. The closed conformation of the loop also ensures that the production of cytotoxic methylglyoxal is suppressed by holding the substrate in a conformation that is least suitable for phosphate elimination. The observed open conformation of the flexible loop of PfTIM, therefore, raises several questions regarding TIM catalysis. The detailed analysis of the active site and loop conformation of these complexes suggests that the unusual open conformation is a result of a steric clash that occurs between Phe96 and one of the loop residues in the closed conformation. Phe96 is a natural substitution found in PfTIM. This residue is Ser in TIM from most other sources. In both the structures, the extent of ligand binding to the two crystallographically independent subunits is different, indicating a possible asymmetry in ligand binding that has not been reported in solution.
Chapter 3 describes structures of PfTIM complexed to the transition state analogue phosphoglycolate (PG) in two different space groups: an orthorhombic P2₁2₁2₁ form containing two dimers in the asymmetric unit, and a monoclinic C2 form in which the crystallographic and molecular twofold axes coincide, leading to a monomer in the asymmetric unit. These two structures provide five different and independent views of the geometry of the active site resulting from binding of the smaller ligand PG. The flexible loop is in the "open" conformation in all four subunits of the P2₁2₁2₁ form, although the binding of PG appears to be complete in only two of the four subunits. The most interesting finding is the "closed" conformation of the catalytic loop in the C2 form. Phe96, which prevents the closed conformation in the P2₁2₁2₁ form and in the 3PG and G3P complexes, allows this dramatic change in conformation of the loop in the C2 form by occupying alternative conformations, both of which are different from its position in complexes with open loops. These structures represent the catalytic loop of a TIM trapped in both open and closed forms in the presence of the same ligand. Comparative analyses with the unbound PfTIM and yeast TIM-PG complexes are also presented.
The structure of PfTIM-2PG (2-phosphoglycerate), determined at near atomic resolution (1.1Å), is described in Chapter 4. This represents the highest resolution at which a TIM fold has been determined to date. Here also, the catalytic loop adopts the "open" conformation. The resolution of the structure permits a better understanding of the contrasting features of catalysis by PfTIM and enzymes from other sources. The most significant observation in this structure is the plausible chemical modification of the ligand in one of the subunits of the PfTIM-2PG complex. The fragmented electron density in this subunit was modeled as two molecular fragments corresponding to 2-oxo glycerate (2OG) and PO3H. These fragments made chemical sense in terms of bonding to water molecules and the polypeptide. It is therefore possible that PfTIM possesses catalytic functions distinct from its established isomerization activity. In the other subunit, the electron density at the expected ligand site was continuous and appeared flatter at the C2 carbon when compared to the density corresponding to the C2 position of 2PG bound to trypanosomal TIM. This unusual conformation of 2PG and the hydrogen bonding pattern involving the flexible loop lead to interesting new insights on TIM catalysis.
The final Chapter of part 1 presents circular dichroism (CD) studies on thermal unfolding of PfTIM-ligand complexes and a synthesis of the structures of complexes described in earlier chapters. CD studies show that the substrate analogues 3PG, G3P, and 2PG do not affect Tm. However, the transition state analogues such as PG and phosphoglycohydroxamate (PGH) stabilize PfTIM against thermal denaturation. Implications of these results on the catalytic mechanism of PfTIM and their importance for structure-based drug design are discussed.
The second part of the thesis, presented in seven chapters, deals with statistical analyses carried out on B-values for understanding aspects of protein dynamics. The B-values obtained by high-resolution X-ray diffraction studies contain information on the dynamics of the molecules. Chapter 1 reviews earlier studies on B-values, different models of B-values, and procedures used for the refinement of protein structures. 95 different protein structures determined at a resolution better than 2.0Å and an R-factor of less than 20% with less than 25% similarity in their sequences formed the data for analyses.
It is shown in Chapter 2 that the frequency distributions of B-values expressed in units of standard deviation about their mean value (B'-factors) at the Cα atoms are characteristic of protein structures, irrespective of their size and function, although the actual B-values show large variations from one structure to another. The distribution was modeled as the summation of two Gaussian functions. The relation between the six parameters describing this function and different protein properties was analyzed. The propensity of amino acid residues to occur in flexible regions of the polypeptide chain was determined based on the ratio of the area under the two Gaussian functions.
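The normalization and the two-Gaussian model described above can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the function names, the use of the population standard deviation, the amplitude/mean/width parameterization of the six parameters, and the example B-values are all assumptions made for illustration.

```python
import math
import statistics


def normalized_b_factors(b_values):
    """Express B-values in units of standard deviation about their mean.

    Assumption: the population standard deviation is used; the source
    does not say which estimator was applied.
    """
    mean_b = statistics.fmean(b_values)
    sd_b = statistics.pstdev(b_values)
    if sd_b == 0:
        raise ValueError("all B-values are identical; B' is undefined")
    return [(b - mean_b) / sd_b for b in b_values]


def two_gaussian(x, a1, mu1, sigma1, a2, mu2, sigma2):
    """Sum of two Gaussian functions with six parameters.

    Assumption: each Gaussian is described by an amplitude, a mean,
    and a width; this is one possible choice of six parameters.
    """
    g1 = a1 * math.exp(-((x - mu1) ** 2) / (2 * sigma1 ** 2))
    g2 = a2 * math.exp(-((x - mu2) ** 2) / (2 * sigma2 ** 2))
    return g1 + g2


def gaussian_area(a, sigma):
    """Area under a single Gaussian a * exp(-(x - mu)^2 / (2 sigma^2))."""
    return a * abs(sigma) * math.sqrt(2 * math.pi)


# Made-up Cα B-values (in Å^2), for illustration only.
example_b = [12.0, 15.5, 18.2, 22.0, 35.7, 14.1, 16.8, 41.3]
b_prime = normalized_b_factors(example_b)
```

By construction, the B'-factors of any one structure have zero mean and unit standard deviation, which is what makes their frequency distributions comparable across structures whose raw B-values differ widely.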
Chapter 3 compares the B-values of the mesophilic and thermophilic protein structures in different ways to gain insights on the plausible reasons for the thermal stability of proteins from thermophiles. These studies revealed that the frequency distribution and increment in B-values from the center to the surface of the proteins are similar in the two sets. However, Ser and Thr show lesser flexibility in thermophiles compared to mesophiles. The most significant finding from these analyses is that the composition of Glu and Lys is higher and that of Ser and Thr is lower in high B-value regions of thermophilic proteins. Examples from literature supporting these findings are cited.
Comparative analyses of the ADPs in homologous proteins are presented in Chapter 4. These studies show that the flexible regions in the three-dimensional fold of proteins remain largely conserved during the course of evolution. In related proteins, the variation in the flexibility of a given segment is only weakly correlated to the variation of the amino acid sequence at the corresponding position. These results illustrate that the relationship between sequence and dynamics has degeneracy similar to that of sequence and three-dimensional structure.
In Chapter 5, flexibility of the protein molecule as expressed by B-values is correlated to different conformational parameters. It is shown that the ADPs of side chain atoms are lower for energetically favorable rotamers. The average ADP of a peptide unit depends weakly on Ramachandran angles at the corresponding Cα atom. Another finding is the differential variation of the flexibility induced by deviation from the planarity of the peptide bond. Those conformations with ω larger than the ideal trans geometry (180°–190°) are more flexible when compared to those with ω < 180° (170°–180°).
Chapter 6 discusses the correlation between the mean B-values of main chain and side chain atoms of protein structures in terms of correlation coefficients (CCs). Although the CCs were high for a large fraction of protein structures analyzed, they varied over a wide range (between 0.47 and 0.99). This variation does not appear to be related to the size of proteins or to the packing density in the crystals. The distribution of CCs shows dependence on the package used for refinement (X-PLOR, PROLSQ, or TNT) and is correlated to the size of the differences in the B'-factors of bonded atoms such as Cα and Cβ atoms. Further differences discernible in the CC distribution for proteins refined using the same package are probably related to the refinement protocols or weighting schemes followed by investigators. In general, the results presented in this chapter emphasize the need to evolve unique restraints and reliable methods for refining the B-values of the protein structures.
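As an illustration of the kind of correlation coefficient discussed above, the following sketch computes a Pearson CC between two equal-length series of mean B-values. It assumes that the CC is the ordinary Pearson coefficient and that the means are taken per residue; neither detail is stated in this abstract, and the function name and example numbers are hypothetical.

```python
import math


def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Made-up per-residue mean B-values (in Å^2), for illustration only.
main_chain_b = [10.2, 12.5, 18.9, 25.1, 14.3]
side_chain_b = [13.0, 15.8, 24.4, 33.0, 16.9]
cc = pearson_cc(main_chain_b, side_chain_b)
```

A CC computed this way lies between -1 and 1; the range of 0.47 to 0.99 quoted above refers to the structures analyzed in the thesis, not to this toy input.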
In the final chapter of part 2, potential uses of the frequency distribution of B'-factors as a validation tool are examined. The validation tool proposed here is based on the frequency distribution of B'-factors, which appears to be an invariant character independent of protein size, structure, and function. The values of parameters characterizing the distribution of carefully refined structures are also largely invariant. The validation tool is illustrated with two examples, ferredoxin and chloromuconate cycloisomerase, for which initially erroneous and subsequently corrected coordinates have been deposited in the PDB.
A part of the results has been reported in the following publications:
1. Analysis of temperature factor distribution in high-resolution protein structures (1997), S. Parthasarathy and M. R. N. Murthy, Protein Science, 6, 2561-2567. Erratum for the same in 1998, Protein Science, 7, 525.
2. Crystals of a thymidylate synthase mutant offer insights into crystal packing and plasticity of protein-protein contacts (1998), B. Gopal, V. Prasanna, S. Parthasarathy, D. V. Santi, P. Balaram, and M. R. N. Murthy, Current Science, 75, 299-304.
3. Disulfide engineering at the dimer interface of Lactobacillus casei thymidylate synthase: Crystal structure of the T155C/E188C/C244T mutant (1999), S. S. Velanker, R. S. K. et al

