Single-particle cryo-electron microscopy (cryo-EM) has emerged as a dominant technique in structural biology. Until recently, X-ray crystallography was the predominant technique for determining protein structures. However, advances in hardware (TEMs with stable optics, direct electron detectors) and software (improved data collection capabilities, increasingly powerful and intuitive image processing programs) have allowed cryo-EM to “catch up” with and, in many cases, surpass X-ray crystallography for structural studies. This is especially true for large macromolecular assemblies, membrane proteins, and proteins that have low expression levels, are unstable, and/or are recalcitrant to crystallization. Samples prepared for cryo-EM are applied to a holey carbon support grid and rapidly frozen in vitreous ice, allowing the proteins to be imaged in multiple conformations in a frozen-hydrated state. Structures of proteins and assemblies can therefore be determined in their native state, unhindered by the constraints of crystal formation. For these reasons, a significant number of structures have been solved by cryo-EM in recent years, and that number is projected to increase steadily as new technologies are developed.
Despite these recent successes, significant challenges remain. Data quality and attainable resolution depend strongly on sample purity and homogeneity. Proteins that are difficult to express and purify through traditional bacterial or eukaryotic expression systems remain a bottleneck, making their structure determination difficult or impossible. This is especially true of membrane proteins and membrane-bound protein complexes. Purifying these proteins often requires optimizing the type and concentration of detergent used to extract them from the cellular lipid bilayer. These extraction methods often yield samples of insufficient purity and homogeneity for cryo-EM structure analysis, leaving the researcher back at square one.
To that end, we have developed a “Build and Retrieve” (BaR) methodology to solve cryo-EM structures of proteins from heterogeneous, impure samples. BaR is a bottom-up, systems-level structural approach that simultaneously solves the structures of multiple proteins (both soluble and membrane-associated) from raw samples, including crude membranes and cell lysates, to near-atomic resolution [1].
After sample preparation and standard cryo-EM imaging and processing, the BaR methodology is applied. It consists of the following stages:
Initial 2D classification: The data set is purified in silico through multiple rounds of 2D classification. All classes with distinct features are used for further processing.
Preliminary 3D classification: The full particle set is sorted using 2D and 3D ab initio classifications to divide the complete particle set into subsets based upon similar features. Each unique subset is further analyzed separately.
Initial map building: Subsets are rigorously cleaned and re-sorted through several rounds of 2D classification. Only those with distinct high-resolution features are processed further. Initial maps are solved using non-uniform refinement with C1 symmetry.
Retrieval of full particle set: The built maps generated during initial processing are used as templates for the 3D heterogeneous classification of the initial 2D cleaned particles. Each subset is treated individually for final structural refinement.
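The "build" and "retrieve" stages above can be illustrated with a deliberately simplified toy sketch. It is not the BaR implementation and does not use any cryo-EM software: particles are stand-in feature vectors, templates are normalized averages of small clean subsets, and the similarity score, the `min_score` threshold, and all function names are hypothetical choices made only for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length so dot products behave like cosine scores."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def similarity(a, b):
    """Cosine similarity of two unit vectors (toy stand-in for a map/particle score)."""
    return sum(x * y for x, y in zip(a, b))


def build_template(subset):
    """'Build': average a small, clean subset of particles into one template."""
    dim = len(subset[0])
    mean = [sum(p[i] for p in subset) / len(subset) for i in range(dim)]
    return normalize(mean)


def retrieve(particles, templates, min_score=0.5):
    """'Retrieve': assign every particle in the full set to its best-matching template.

    Particles whose best score is below min_score (a hypothetical cutoff)
    are left unassigned.
    """
    subsets = {name: [] for name in templates}
    for idx, p in enumerate(particles):
        best_name, best_score = max(
            ((name, similarity(p, t)) for name, t in templates.items()),
            key=lambda item: item[1],
        )
        if best_score >= min_score:
            subsets[best_name].append(idx)
    return subsets


def make_particles(signal, n, noise, rng):
    """Simulate n noisy observations of one underlying signal."""
    return [normalize([s + rng.gauss(0, noise) for s in signal]) for _ in range(n)]


rng = random.Random(0)
# Two synthetic "proteins" mixed in one data set: 200 copies of A, 50 of B.
particles = (
    make_particles([1.0, 0.0, 0.0, 0.0], 200, 0.1, rng)
    + make_particles([0.0, 1.0, 0.0, 0.0], 50, 0.1, rng)
)

# Build templates from small seed subsets, then retrieve from the full set.
templates = {
    "protein_A": build_template(particles[:20]),
    "protein_B": build_template(particles[200:210]),
}
subsets = retrieve(particles, templates)
print({name: len(idx) for name, idx in subsets.items()})
```

The point of the sketch is the order of operations: templates are built from small, well-behaved subsets, and only then is the full cleaned particle set re-sorted against them, so that particles missed by the initial classification can be recovered for each target.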
The combined data sets are then used for final refinement and protein identification using structural programs such as PHENIX [2] and Coot [3], with the identities confirmed by native mass spectrometry and/or proteomics. The full procedure, from sample preparation to final refinement, allows structural data for multiple proteins and protein complexes to be obtained from crude samples previously thought to be too impure for use. BaR enables the structure determination of multiple small, indistinct, and difficult-to-purify proteins from a single cryo-EM experiment using entirely ab initio methods.
This does not suggest that the purity and abundance of the protein/complex of interest can be ignored. While BaR is able to enrich data sets to help produce high resolution maps, it is still critical to build a quality initial 3D model that allows successful retrieval of additional data from the raw images. Initial classifications are combinations of proteins with similar yet distinct 2D views. A suitable 3D model allows BaR to extract additional data and minimal views that would otherwise be lost due to heterogeneous initial classifications. Therefore, samples that have increased purity and abundance allow BaR to effectively maximize data retrieval to obtain higher resolution structures.
Elucidating protein-protein interaction partners and mapping the network of signal transduction are exciting areas in the field of systems biology. In the future, with continued advances in cryo-EM, it is possible that the BaR method will allow the study of systems structural proteomics, both at the cellular level and from individual tissue specimens. It may one day be possible to detect proteome-wide structural changes of proteins and protein complexes when comparing healthy and pathological patient samples. BaR would help to unravel the mechanistic basis of diseases by looking at differences in protein expression levels, misfolded proteins and disruption of complex formation, all of which could cause abnormal cell signaling, membrane homeostasis and ion and nutrient influx/efflux, paving the way for therapeutic interventions.
- Su, C.-C. et al. A ‘Build and Retrieve’ methodology to simultaneously solve cryo-EM structures of membrane proteins. Nat. Methods 18, 69-75 (2020).
- Adams, P.D. et al. PHENIX: building new software for automated crystallographic structure determination. Acta Crystallogr. D Biol. Crystallogr. 58, 1948-1954 (2002).
- Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126-2132 (2004).