| dc.description.abstract | Accessibility has long been used to measure the burial of protein residues. However, this procedure fails to precisely indicate how deep a residue lies from the surface, because two residues with the same accessibility can be at different distances from the surface. Conversely, two residues with similar depths can also differ in accessibility. We have developed a unique and independent method to define residue burial using an index called depth.
This thesis studies various methods of protein-residue depth calculation and applications of depth in the analysis of protein structure. Chapter I deals with calculations and applications of protein-residue depths. The depth of a protein atom is defined as its distance from the nearest surface water. The depth procedure requires the protein molecule to be solvated in a 111 Å cubic box of water equilibrated by Monte Carlo simulation. The protein is then rotated about its centre of mass by a random angle, followed by a translation along the X-axis by a random number ±d (d < 2.8 Å, the lattice-point separation of water). Any water molecule that clashes with a protein atom is removed to eliminate internal waters. The depth of each atom is then calculated as the distance to the nearest surface water.
The rotation, translation, removal of clashing water, and depth calculation are repeated multiple times to mimic molecular dynamics in solution. The iterative process continues until convergence in atom depths is achieved. The final depth of an atom is the average over all iterations, and the average depth of its constituent atoms is assigned to each residue. Residues with depth > 6.5 Å are clustered into distinct hydrophobic cores if no atom of one cluster lies within 5 Å of another.
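The iterative procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the 2.6 Å clash cutoff, and the fixed iteration count (used in place of a convergence test) are hypothetical, the unweighted centroid stands in for the centre of mass, and the water box is assumed to be supplied as a list of coordinates rather than equilibrated by Monte Carlo simulation.

```python
import math
import random

def random_rotation(rng):
    """Random 3x3 rotation matrix built from a uniformly sampled unit quaternion."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    x = math.sqrt(1 - u1) * math.sin(2 * math.pi * u2)
    y = math.sqrt(1 - u1) * math.cos(2 * math.pi * u2)
    z = math.sqrt(u1) * math.sin(2 * math.pi * u3)
    w = math.sqrt(u1) * math.cos(2 * math.pi * u3)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

def atom_depths(atoms, waters, n_iter=20, clash=2.6, max_shift=2.8, seed=0):
    """Average distance from each atom to the nearest non-clashing water.

    atoms, waters: lists of (x, y, z) coordinates in angstroms.
    clash and n_iter are illustrative assumptions, not values from the thesis.
    """
    rng = random.Random(seed)
    n = len(atoms)
    # Unweighted centroid, used here as a stand-in for the centre of mass.
    centre = [sum(a[k] for a in atoms) / n for k in range(3)]
    totals = [0.0] * n
    for _ in range(n_iter):
        rot = random_rotation(rng)
        shift = rng.uniform(-max_shift, max_shift)
        moved = []
        for a in atoms:
            v = [a[k] - centre[k] for k in range(3)]
            p = [sum(rot[i][k] * v[k] for k in range(3)) + centre[i] for i in range(3)]
            p[0] += shift  # random translation along the X-axis
            moved.append(p)
        # Discard every water that lies closer than `clash` to any protein atom.
        surface = [w for w in waters if all(math.dist(w, p) >= clash for p in moved)]
        for i, p in enumerate(moved):
            totals[i] += min(math.dist(w, p) for w in surface)
    return [t / n_iter for t in totals]
```

Residue depths would then follow by averaging the returned values over the atoms of each residue. The all-pairs distance search is quadratic and only suitable for small test systems; a spatial grid or k-d tree would be the natural replacement for real structures.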
For any protein atom, the average nearest surface-water distance can also be calculated using structures extracted from molecular-dynamics (MD) simulations of a solvated protein. By definition, this average distance is depth. Our procedure is, however, computationally less demanding. Connolly’s dot-surface algorithm produces surface dots on proteins by rolling a probe sphere of a specified radius over the surface. The distance of the nearest dot from an atom can also serve as its depth, though this method is less accurate than ours.
It is possible to locate distinct, spatially separable regions in proteins that correspond to structural domains. Several domain-finding algorithms have been developed in the past decade, many relying solely on Cα information and ignoring side-chain details. Other methods assemble domains from regular secondary-structure fragments and fail to recognise loops or irregular structures. We use depth calculations to locate hydrophobic cores in proteins. If a domain is a region of protein capable of independent folding, it should contain a hydrophobic core large enough to stabilise its structure. Thus, clustering atoms into domains is crucial to understanding protein-folding events.
Experimental techniques have probed early stages of protein folding, some tracking kinetic pathways and others identifying thermodynamically stable intermediates under a variety of conditions. It is important to correlate such experimental observations with hydrophobic clusters to better understand folding pathways.
In 1908, Voronoi proposed a mathematical method to divide space among an ensemble of points, assigning a unique volume to each. This involves constructing perpendicular bisectors of lines joining each point to all others; the smallest polyhedron thus generated defines the point’s volume. This approach can be applied to proteins by treating each atom as a point. Although rigorous, it lacks physicochemical precision. Richards addressed this by dividing each interatomic vector in the ratio of van der Waals radii, but his method fails to assign volumes to peripheral atoms. We developed a simple method for calculating volumes of all atoms, including surface atoms, by using a solvated protein. Rotation and translation in a water box mimic solution dynamics, and the sum of atomic volumes yields the residue volume. These studies are discussed in Chapter I. Volumes from our procedure closely match those from MD-solvated structures.
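The Voronoi partition described above can be approximated numerically by assigning sample points on a fine grid to their nearest site. The sketch below is an illustration only, not the thesis procedure: it uses plain perpendicular bisectors (every site weighted equally, without Richards' radius-based correction), and the function name, grid step, and bounding box are hypothetical. Passing water positions alongside protein atoms as sites is one way to close the otherwise unbounded cells of surface atoms, which is the role the abstract attributes to the solvated protein.

```python
import itertools
import math

def voronoi_volumes(sites, lower, upper, step=0.25):
    """Approximate Voronoi cell volumes by nearest-site assignment on a grid.

    sites: list of (x, y, z) points (protein atoms plus, optionally, waters).
    lower, upper: corners of the sampling box; step: grid spacing (assumed).
    Returns one volume per site, restricted to the sampling box.
    """
    axes = []
    for lo, hi in zip(lower, upper):
        n = int(round((hi - lo) / step))
        # Sample at voxel centres so each sample represents step**3 of space.
        axes.append([lo + (k + 0.5) * step for k in range(n)])
    counts = [0] * len(sites)
    for point in itertools.product(*axes):
        owner = min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))
        counts[owner] += 1
    voxel = step ** 3
    return [c * voxel for c in counts]
```

For sites arranged on a cubic lattice with 2 Å spacing, an interior site should receive a cubic cell of 8 Å³, which gives a simple sanity check. A residue volume is then the sum of the volumes of its atoms.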
Chapter II: Aspects of Globular Protein Stability
The stability of globular proteins is temperature-dependent and characterised by three temperatures:
Cold-denaturation temperature (Tdc)
Temperature of maximal stability (Tms)
Heat-denaturation temperature (Tdh)
The heat-capacity change (ΔCp) during unfolding can be used to predict Tdc and Tms, and ΔCp can be estimated from protein sequence alone. This has been established for mesophilic proteins (from organisms thriving at 20–40 °C). We are extending this analysis to proteins from thermophilic (60–80 °C) and hyperthermophilic (~100 °C) organisms. These unusually stable proteins provide important opportunities for studying the factors underlying thermal stability, with implications for biotechnology.
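One standard way to relate the heat-capacity change to the three characteristic temperatures is the Gibbs-Helmholtz stability curve with a temperature-independent heat-capacity change. The sketch below assumes that model and an additional input, the unfolding enthalpy at Tdh (`dh`); the abstract does not state which relation the thesis uses, so the function and parameter names here are hypothetical. Under this model the free energy of unfolding is zero at Tdh, peaks at Tms where the unfolding entropy vanishes, and crosses zero again at Tdc.

```python
import math

def delta_g(t, t_dh, dh, dcp):
    """Unfolding free energy at temperature t (K), constant heat-capacity change.

    t_dh: heat-denaturation temperature (K); dh: unfolding enthalpy at t_dh;
    dcp: heat-capacity change of unfolding (same energy units as dh, per K).
    """
    return dh * (1.0 - t / t_dh) - dcp * ((t_dh - t) + t * math.log(t / t_dh))

def t_max_stability(t_dh, dh, dcp):
    """Temperature at which delta_g peaks, i.e. where the unfolding entropy is zero."""
    return t_dh * math.exp(-dh / (t_dh * dcp))

def t_cold_denaturation(t_dh, dh, dcp, t_low=100.0, tol=1e-6):
    """Root of delta_g below the temperature of maximal stability, by bisection."""
    lo, hi = t_low, t_max_stability(t_dh, dh, dcp)
    if delta_g(lo, t_dh, dh, dcp) > 0.0:
        raise ValueError("no cold-denaturation root above t_low")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if delta_g(mid, t_dh, dh, dcp) > 0.0:
            hi = mid  # still stable at mid, so the root lies below it
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

A call such as `t_cold_denaturation(333.15, 400.0, 8.0)` uses made-up illustrative values (kJ/mol and kJ/mol/K), not data from the thesis.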
Large-scale genome-sequencing projects are rapidly expanding protein-sequence databases. We examined sequence-based factors responsible for thermostability in organisms such as E. coli, Haemophilus influenzae, Saccharomyces cerevisiae, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, and Archaeoglobus fulgidus. Our analysis focused on potential globular proteins identified using the Kyte-Doolittle hydropathy index. We found that thermophilic proteins tend to have:
smaller modal sequence lengths
a higher percentage of residues in extended sheet conformation (as predicted by the Predator algorithm)
more salt bridges per helix
No significant differences in overall amino-acid composition were observed. Future work includes comparing domains of proteins from the two groups and analysing available thermodynamic parameters of single-domain thermophilic proteins to predict Tms. | |