Analysis of Molecular Dynamics Trajectories of Proteins Performed using Different Forcefields and Identification of Mobile Segments
Katagi, Gurunath M
The choice of forcefield is a crucial issue in any molecular dynamics (MD) study, and there is no clear indication as to which of the many available forcefields is best for protein analysis. Recent literature surveys indicate that MD work may be hindered by two limitations: conformational sampling and the forcefield used (inaccuracies in the potential energy function may bias the simulation toward incorrect conformations). Advances in computing infrastructure and in the theoretical and computational aspects of MD have made it possible to sample on sufficiently long time scales, which places greater demands on the accuracy of the forcefield. Because MD results are known to differ between forcefields, we asked how mobile segments common to a protein could be assessed by analyzing trajectories generated with three forcefields in a similar environment. This is important because disparate fluctuations appear more often in flexible regions than in stiff regions, and flexible regions are more relevant to the functional activities of the protein molecule. We have therefore assessed the similarity in dynamics obtained with three well-known forcefields, ENCAD, CHARMM27 and AMBERFF99SB, for 61 monomeric proteins and identified the properties of dynamic residues, which may be important for function. Comparing popular forcefields with different parameterization philosophies may give hints for resolving some of the uncertainties in current forcefields and for characterizing mobile regions based on the dynamics of proteins with diverse folds. It may also reveal signatures at the level of dynamics that relate to function, which can be used in protein engineering studies. MD simulations of 30 ns were carried out on the 61 monomeric proteins using the CHARMM and AMBER forcefields, and the trajectories with the ENCAD forcefield were obtained from the Dynameomics database.
The trajectories were first analyzed to check whether the structural and dynamic properties obtained from the three forcefields are similar, using a few parameters in each case. The gross dynamic properties calculated (root mean square deviation (RMSD), TM-score derived RMSD, radius of gyration and accessible surface area) indicated similarity in many proteins. Flexibility index analysis on 17 proteins that showed a notable difference in flexibility indicated that tertiary interactions (fraction of non-native stable hydrogen bonds and salt bridges) might be responsible for the difference. The normalized subspace overlap and shape overlap scores, computed from the covariance matrices derived from the trajectories, fall in the range 0.3-0.5 for the majority of the proteins, indicating that the first principal components from these proteins in different forcefield combinations may not match well. These results indicate that, although the dynamic properties are in general similar in many proteins, the flexibility index and normalized subspace overlap score show that the subspaces on the first principal component may not match completely in many proteins. The number of proteins showing a better correlation is higher for the CHARMM-AMBER combination than for the other two combinations. The structural features from the trajectories were computed in terms of the fraction of secondary structure, hydrogen bonds, salt bridges and native contacts. Although secondary structures and native contacts are well preserved during the simulations, tertiary interactions (hydrogen bonds) are lost in many proteins and may be responsible for the differences in some of the properties among forcefields. Comparison of the simulation results with experimental structures in terms of root mean square fluctuations, accessible surface area and radius of gyration indicates that the simulation results are on par with those derived from experimental structures.
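The comparison of principal-component subspaces described above can be illustrated with a minimal sketch. The abstract does not give the exact formula used for the normalized subspace overlap, so the code below assumes one common definition, the root mean square inner product (RMSIP) between the top eigenvectors of two coordinate covariance matrices. The function names, the number of modes and the synthetic random trajectories are all hypothetical and stand in for aligned trajectories from two forcefields.

```python
import numpy as np

def principal_components(coords, n_modes):
    """Return the top n_modes covariance eigenvectors as columns.

    coords: array of shape (n_frames, n_atoms, 3), assumed to be already
    superposed on a common reference structure.
    """
    x = coords.reshape(coords.shape[0], -1)   # flatten to (n_frames, 3N)
    x = x - x.mean(axis=0)                    # remove the mean structure
    cov = x.T @ x / (x.shape[0] - 1)          # (3N, 3N) covariance matrix
    evals, evecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_modes] # indices of the largest ones
    return evecs[:, order]

def rmsip(modes_a, modes_b):
    """Root mean square inner product of two orthonormal mode sets.

    Assumed definition (not taken from the abstract): 1 for identical
    subspaces, 0 for orthogonal ones.
    """
    overlaps = modes_a.T @ modes_b            # all pairwise dot products
    return float(np.sqrt(np.sum(overlaps ** 2) / modes_a.shape[1]))

# Synthetic stand-ins for two trajectories of the same 10-atom system.
rng = np.random.default_rng(0)
traj_a = rng.normal(size=(200, 10, 3))
traj_b = rng.normal(size=(200, 10, 3))

pcs_a = principal_components(traj_a, n_modes=5)
pcs_b = principal_components(traj_b, n_modes=5)
score = rmsip(pcs_a, pcs_b)
print(f"subspace overlap (RMSIP) between the two trajectories: {score:.2f}")
```

With real data, the two coordinate arrays would hold the same atoms (for example C-alpha atoms) from simulations of one protein under two forcefields, fitted to the same reference before the covariance matrices are built.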
We have assessed the flexibility of the proteins using normalized root mean square fluctuations (nRMSF), which for a residue is the ratio of the RMSF from simulation to that from the crystal structure. A threshold on nRMSF, selected on the basis of secondary structure analysis, was used to indicate the mobile regions in a protein. Based on this nRMSF threshold and on conformational properties (deviation in the dihedral angles), we classified the residues and evaluated the properties of rigid hinge residues and the corresponding mobile residues in terms of residue propensity, secondary structure preference and accessible surface area ranges. Since the rigid dynamic residues represent the inherent mobility, they might be important for function. We therefore assessed functional relevance by comparing the dynamic mobile residues from each protein in each forcefield simulation with the residues important for function (taken from the literature and databases). Some residues found to be mobile in the simulations match the experimentally identified ones, although in many cases the number of mobile residues is higher than the number of experimental ones. In summary, an analysis of protein simulation trajectories using three forcefields on a set of monomeric proteins has shown that the gross structural properties and secondary structures of many proteins remain similar, but there are differences, as seen from the flexibility index. However, the correlation in parameters between the CHARMM and AMBER forcefields is better than in the other two combinations. The differences seen in some of the structural properties may arise mainly from the loss of a few tertiary interactions, as indicated by the fraction of native hydrogen bonds and salt bridges. Based on the nRMSF, mobile segments were identified from the simulations, and some of these segments match the functionally important residues known from experiment.
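The nRMSF ratio defined above can be sketched as follows. The abstract does not state how the crystal-structure RMSF was obtained; this sketch assumes it is derived from crystallographic B-factors through the standard isotropic relation B = (8π²/3)⟨u²⟩. The function names, the toy coordinates and the threshold value are hypothetical, and the threshold actually used in the work is not given in the abstract.

```python
import numpy as np

def rmsf_from_trajectory(coords):
    """Per-residue RMSF in the units of coords.

    coords: array of shape (n_frames, n_residues, 3), e.g. superposed
    C-alpha positions, one atom per residue.
    """
    mean_pos = coords.mean(axis=0)                       # average structure
    sq_dev = np.sum((coords - mean_pos) ** 2, axis=2)    # (n_frames, n_residues)
    return np.sqrt(sq_dev.mean(axis=0))

def rmsf_from_bfactors(bfactors):
    """Convert isotropic B-factors (A^2) to RMSF (A): B = (8*pi^2/3) * <u^2>."""
    b = np.asarray(bfactors, dtype=float)
    return np.sqrt(3.0 * b / (8.0 * np.pi ** 2))

def normalized_rmsf(coords, bfactors):
    """nRMSF: simulation RMSF divided by crystal-derived RMSF, per residue."""
    return rmsf_from_trajectory(coords) / rmsf_from_bfactors(bfactors)

def mobile_residues(nrmsf, threshold):
    """Zero-based indices of residues whose nRMSF exceeds the threshold."""
    return np.flatnonzero(nrmsf > threshold)

# Toy example: residue 0 is static, residue 1 oscillates by 1 A along x.
toy_coords = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
])
# B-factors chosen so that the crystal-derived RMSF is exactly 1 A.
toy_bfactors = np.full(2, 8.0 * np.pi ** 2 / 3.0)

toy_nrmsf = normalized_rmsf(toy_coords, toy_bfactors)
toy_mobile = mobile_residues(toy_nrmsf, threshold=0.5)  # hypothetical threshold
print("nRMSF per residue:", toy_nrmsf)
print("mobile residue indices:", toy_mobile)
```

In practice the residue order in the trajectory and in the B-factor array must match, and residues with missing or zero B-factors would need to be excluded before taking the ratio.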
Our work shows that there are still some differences in the properties obtained from simulations with different forcefields, so care must be exercised when choosing a forcefield, especially when assessing functionally relevant residues from the simulations.