Intrinsically disordered proteins (IDPs) play crucial roles in biology and human disease, so a comprehensive understanding of their structure, dynamics, and interactions is needed. Atomistic simulations have emerged as a key tool for resolving molecular details and establishing mechanistic insight into how these proteins carry out diverse biological functions. Accurate simulations, however, require both a protein force field capable of describing the energy landscape of functionally relevant IDP conformations and sufficient conformational sampling to capture the free energy landscape of IDP dynamics. I first conducted explicit solvent simulations to assess how well two state-of-the-art protein force fields, CHARMM36m and a99SB-disp, capture the stability of small protein-protein interactions. To evaluate their accuracy, I selected a set of 46 amino acid backbone and side chain pairs with representative configurations and computed the free energy profiles of their interactions. The results showed that CHARMM36m consistently predicted stronger protein-protein interactions than a99SB-disp. The largest overestimation in CHARMM36m occurred in charged pairs involving Arg and Glu side chains, reaching up to 2.9 kcal/mol. Through free energy decomposition analysis, I determined that these overestimations were driven primarily by protein-water electrostatic interactions rather than van der Waals (vdW) interactions. These findings suggest that careful rebalancing of electrostatic interactions should be considered in further optimization of protein force fields.
To enhance the conformational sampling of IDPs, I developed an integrated approach that combines an improved implicit solvent model, Generalized Born with molecular volume and solvent accessible surface area (GBMV2/SA), with a multiscale enhanced sampling (MSES) technique. To make this approach more efficient, I implemented it as a standalone OpenMM plugin for Graphics Processing Units (GPUs). The GPU-GBMV2/SA model was numerically equivalent to the original CPU-GBMV2/SA model while providing a ~60x speedup on a single NVIDIA TITAN X (Pascal) graphics card for molecular dynamics simulations of both folded and unstructured proteins. This acceleration greatly facilitated the application of the approach in biomolecular simulations. In addition, I evaluated the reliability of the GBMV2/SA model in simulating both folded and unfolded proteins. The results revealed that the GBMV2/SA model accurately describes small proteins, but its applicability is limited for larger proteins such as KID and p53-TAD. This limitation can be attributed to the absence of long-range solute-solvent dispersion interactions in the model. To address this issue, I introduced a more comprehensive treatment of the nonpolar solvation free energy, called the GBMV2/NP model. Unfortunately, the GBMV2/NP model destabilized well-folded proteins, particularly larger ones, because the repulsive solvent accessible surface area (SASA) term is represented inaccurately when an unphysical van der Waals volume is used. This observation highlights the need for further improvement in the description of the nonpolar term in the model.