To perform molecular dynamics (MD) simulations, it is first necessary to select a model that can best address the scientific question at hand. Choosing an appropriate model can be challenging for new users, as models vary both in resolution (e.g. coarse-grained or all-atom) and in energetic detail. Hence, user-friendly guides that summarize model capabilities and limitations are valuable tools for increasing the accessibility of molecular modeling and simulation techniques.
In our recently published book chapter, we address the key features of the SMOG 2 package (Noel et al., PLOS Comput. Biol., 2016), a convenient interface that generates structure-based models (SBMs) for studying biomolecular structure-function relationships. Our step-by-step guide introduces new users to the process of preparing and simulating a SMOG model. To allow flexibility when designing SBMs, SMOG 2 uses XML-formatted templates to define model features. These templates contain the model definitions, such as resolution and interaction types, and are used by SMOG 2 to produce the structure-based force field. The files generated by SMOG 2 can then be used with a number of popular MD engines (such as GROMACS, NAMD and OpenMM). The templates allow users to easily change the model resolution, introduce charges on specific atoms, add new or modified residues, and modify the interaction types between different residues or atoms. With this approach, a variety of SBMs can be created through simple modifications that do not require coding. Additionally, users can make their templates available themselves or through our website, http://smog-server.org, to minimize the effort needed to reproduce and expand on the collective knowledge of the community.
SMOG models have their roots in a class of models called “Gō models” that have been applied to the study of protein folding. The models are “structure-based” in the sense that their potential energy is defined using experimentally obtained structures. The approach stems from the Principle of Minimal Frustration, as applied initially to protein folding. According to this principle, the interactions that dominate the energy landscape of protein folding are those that are formed in the native (folded) conformation. Hence, SMOG models are typically built so that the potential energy minimum corresponds to the native configuration.
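To make the idea of a native-centered potential concrete, the following minimal sketch builds a contact energy whose minimum sits exactly at the distances found in a reference structure. It is an illustration only: the 12-6 functional form, the three-bead "structure" and the function names are assumptions chosen for simplicity, and the sketch omits the bonded and excluded-volume terms that a full SMOG force field would include.

```python
import math

def contact_energy(r, r_native, epsilon=1.0):
    """12-6 contact term (assumed form) with minimum -epsilon at r = r_native."""
    ratio = r_native / r
    return epsilon * (ratio**12 - 2.0 * ratio**6)

def total_contact_energy(coords, native_contacts, epsilon=1.0):
    """Sum the contact term over all native pairs (i, j, r_native)."""
    energy = 0.0
    for i, j, r_native in native_contacts:
        r = math.dist(coords[i], coords[j])
        energy += contact_energy(r, r_native, epsilon)
    return energy

# Hypothetical three-bead "native" structure (coordinates in nm).
native = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.5, 0.6, 0.0)]

# Native distances are read directly from the reference structure.
pairs = [(0, 1), (1, 2), (0, 2)]
contacts = [(i, j, math.dist(native[i], native[j])) for i, j in pairs]

# A distorted configuration of the same beads.
stretched = [(0.0, 0.0, 0.0), (0.7, 0.0, 0.0), (0.7, 0.9, 0.0)]

e_native = total_contact_energy(native, contacts)
e_stretched = total_contact_energy(stretched, contacts)
print(f"native: {e_native:.3f}  stretched: {e_stretched:.3f}")
```

Because every pair distance in the reference structure equals its own native distance, each term contributes its minimum value there, and any distortion raises the energy.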
The successful application of SMOG models to protein folding inspired their extension to other biological processes. SMOG models were later developed at all-atom resolution, which made it possible to study the role of non-specific interactions. Examples include determining how structural considerations (i.e. excluded volume interactions) can dictate dynamics, characterizing RNA-RNA electrostatics in the ribosome, and investigating protein-DNA interactions. The versatility of SMOG models provides the opportunity to explore a wide range of molecular systems and to pinpoint the key principles that guide biological dynamics.
One example of a SMOG model applied to a molecular assembly is the characterization of subunit rotation in the ribosome (Levi et al., Biophys. J., 2017). By utilizing a coarse-grained SMOG model of the ribosome, it was possible to rationalize apparently contradictory observations from single-molecule FRET measurements that used different labeling sites. The SMOG model was designed to allow spontaneous rotation to occur while remaining consistent with known measures of molecular flexibility. The simulations demonstrated that the apparent dynamics along different probes can depend strongly on molecular flexibility. The same model was also used to identify reaction coordinates that are less dependent on molecular flexibility and more strongly correlated with subunit rotation (Levi & Whitford, JPCB, 2019). The insights from this study provide a foundation for the design of next-generation experimental techniques that aim to probe the free energy landscape of the ribosome.
SMOG 2 also provides additional tools to ease the preparation of a simulation. We walk the user through the preprocessing of a small protein example (using the “smog_adjustPDB” tool) before running SMOG 2. We then generate a SMOG force field and simulate the protein using GROMACS 5. Additionally, we show how to truncate a subset of the system using the “smog_extract” tool and discuss how to avoid artificial boundary effects that may arise in such truncated systems.
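The order of these steps can be sketched as a short script that assembles the command lines without running them. This is a hedged outline rather than the chapter's protocol: the SMOG flags shown here ("-i", "-o", "-AA"), the default output names "smog.gro" and "smog.top", and the file names "protein.pdb" and "run.mdp" are assumptions that should be checked against the SMOG 2 manual for the installed version.

```python
import shlex

def build_workflow(pdb_file, adjusted="adjusted.pdb",
                   mdp_file="run.mdp", tpr_file="run.tpr"):
    """Return the preprocessing, force field and simulation commands, in order.

    Flags and default output names are assumptions; verify them against the
    SMOG 2 and GROMACS documentation before use.
    """
    return [
        # 1. Make residue and atom naming consistent with the SMOG templates.
        ["smog_adjustPDB", "-i", pdb_file, "-o", adjusted],
        # 2. Generate an all-atom structure-based force field.
        ["smog2", "-i", adjusted, "-AA"],
        # 3. Combine coordinates, topology and run settings into a run input file.
        ["gmx", "grompp", "-f", mdp_file, "-c", "smog.gro",
         "-p", "smog.top", "-o", tpr_file],
        # 4. Run the simulation.
        ["gmx", "mdrun", "-s", tpr_file],
    ]

commands = build_workflow("protein.pdb")
for command in commands:
    # Print only; pass each list to subprocess.run(command, check=True) to execute.
    print(shlex.join(command))
```

Keeping each command as a list of arguments avoids shell-quoting problems if the file names contain spaces, and makes it easy to swap in a different template set or MD engine.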
While the default SMOG 2 templates provide the common definitions of amino acids and nucleic acids, many applications require the inclusion of modified or completely new residue types. For this purpose, we provide a detailed example of extending the templates to include such non-standard residues. In addition, we describe how to assign charges within the SMOG 2 framework and discuss strategies for modeling monovalent and divalent ions.
Finally, our chapter includes troubleshooting and implementation tips. We provide examples of common preprocessing errors, as well as tips on choosing the simulation box size, energy scaling, reduced units and temperature calibration. These guidelines can serve as an initial checklist when setting up common types of simulations.
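As a small illustration of the reduced-unit bookkeeping, the sketch below converts a reduced temperature into the value, in kelvin, that an MD engine working in kJ/mol would expect. It assumes that the reduced energy scale is 1 kJ/mol, so that a reduced temperature of 1 corresponds to roughly 120 K in engine units; the function name and the example temperatures are hypothetical.

```python
# Molar gas constant in kJ/(mol K), i.e. the Boltzmann constant per mole.
KB_KJ_PER_MOL_K = 0.0083144626

def reduced_to_engine_temperature(t_reduced, epsilon_kj_per_mol=1.0):
    """Convert a reduced temperature (units of epsilon/k_B) to kelvin.

    Assumes the reduced energy scale epsilon is expressed in kJ/mol.
    """
    return t_reduced * epsilon_kj_per_mol / KB_KJ_PER_MOL_K

# Hypothetical scan of reduced temperatures for a calibration run.
for t_reduced in (0.5, 1.0, 1.2):
    t_engine = reduced_to_engine_temperature(t_reduced)
    print(f"reduced T = {t_reduced:.2f} -> engine T = {t_engine:.2f}")
```

The converted number is a bookkeeping value and not a physical temperature, which is why the temperature scale of a SMOG model still needs to be calibrated against a reference such as the folding temperature.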
Computational investigations with SMOG models can provide many insights into the relationship between molecular structure, energetics and dynamics, making them increasingly useful for addressing a wide range of physical questions. With our guide, we aim to provide an entry-level protocol for developing and using SMOG models.