By Isabell Bludau and Ruedi Aebersold:
Protein complexes are the molecular machines that perform and orchestrate almost all catalytic, structural and regulatory processes of the cell. To gain a deeper understanding of a cell’s functional landscape, it is therefore of major interest to detect and quantify protein complexes on a proteome-wide scale. Until recently, most research in the field focused on discovering and characterising novel protein-protein interactions and complexes in a given experiment. In our recent work, we changed the perspective and instead asked:
Out of all the protein complexes we already know exist, which ones are present in the sample and how do they change between samples?
We termed this approach complex-centric proteome profiling. This blog post focusses on the targeted, complex-centric analysis concept and how it builds upon prior notions in the larger field of proteomics.
The most common strategy to comprehensively study the cellular proteome is called bottom-up proteomics, where proteins are first digested into smaller peptides which are subsequently analysed by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS). Traditionally, data is measured in data-dependent acquisition (DDA) mode, meaning that the top-N most abundant peptides at a given chromatographic elution time point are selected for fragmentation and spectrum acquisition. The identities of the selected peptides are subsequently determined by searching the acquired spectra against a database containing all theoretical spectra of the analysed species. While this approach has been successfully applied in many biological and clinical studies, it has the major drawback that the same spectrum needs to be identified again and again in each separate sample. Due to variable sample quality, machine performance and the heuristic of peptide selection, this leads to high numbers of missing values in heterogeneous datasets and results in a scenario where the focus often lies on identification rather than quantification.
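To make the missing-value problem concrete, here is a minimal sketch of a top-N selection heuristic applied to two samples. The peptide names, the intensities and the `select_top_n` helper are all made up for illustration; real instruments apply further rules (such as dynamic exclusion) that are not modelled here.

```python
def select_top_n(precursors, n):
    """Return the names of the n most intense precursors at one elution time point."""
    ranked = sorted(precursors.items(), key=lambda item: item[1], reverse=True)
    return {name for name, _ in ranked[:n]}


# Hypothetical precursor intensities for the same four peptides in two samples.
sample_a = {"PEPTIDEA": 9.0e6, "PEPTIDEB": 7.5e6, "PEPTIDEC": 2.0e6, "PEPTIDED": 1.8e6}
sample_b = {"PEPTIDEA": 8.8e6, "PEPTIDEB": 7.9e6, "PEPTIDEC": 1.7e6, "PEPTIDED": 2.1e6}

selected_a = select_top_n(sample_a, n=3)
selected_b = select_top_n(sample_b, n=3)

# Peptides fragmented in one sample but not in the other become missing values,
# even though all four peptides are present in both samples.
missing_in_b = selected_a - selected_b
missing_in_a = selected_b - selected_a
print(sorted(missing_in_b), sorted(missing_in_a))
```

In this toy case a small change in relative intensity is enough to swap which low-abundance peptide makes the cut, so each sample ends up with one peptide the other never sampled.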
An alternative data acquisition strategy is to concurrently fragment multiple peptides, thus generating highly multiplexed spectra in which a standard database search becomes increasingly difficult. In 2012, the Aebersold lab introduced a new combined data acquisition and analysis concept, called SWATH-MS. Its core idea is to first fragment all peptides in a given mass range concurrently and then, in a targeted analysis step, detect and quantify peptides in the highly multiplexed spectra by using prior knowledge about peptide characteristics observed in previous DDA experiments. Specifically, the data analysis strategy uses the chromatographic dimension by assuming that fragment ions derived from the same precursor show precisely overlapping chromatographic traces (Figure 1) (1). Today, the concept is more broadly known as data-independent acquisition (DIA) coupled to targeted, peptide-centric data analysis using spectral libraries (2). Over recent years, this approach has become increasingly popular and has been demonstrated to provide more consistent peptide and protein detection and quantification across large datasets than classical DDA. It is therefore a preferred technique when the proteomes of a large number of samples are being compared.
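The co-elution assumption can be illustrated with a small sketch that scores a group of fragment ion traces by their mean pairwise Pearson correlation. This is only one simple way to quantify overlapping traces, not the scoring used by any particular SWATH-MS software; the traces and the `coelution_score` helper are invented for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long intensity traces."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coelution_score(traces):
    """Mean pairwise correlation across all traces in a group."""
    pairs = list(combinations(traces, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical fragment ion traces over seven retention time points.
# Fragments of one precursor: same peak shape, different intensities.
true_group = [
    [0, 10, 50, 100, 50, 10, 0],
    [0, 5, 25, 50, 25, 5, 0],
    [0, 2, 10, 20, 10, 2, 0],
]
# Same group, but the third trace is replaced by an unrelated interfering signal.
interfered_group = [
    [0, 10, 50, 100, 50, 10, 0],
    [0, 5, 25, 50, 25, 5, 0],
    [40, 20, 5, 0, 5, 20, 40],
]

score_true = coelution_score(true_group)
score_interfered = coelution_score(interfered_group)
print(round(score_true, 3), round(score_interfered, 3))
```

Fragments that truly derive from the same precursor rise and fall together and score close to 1, whereas an interfering signal with a different elution profile pulls the group score down.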
Now your question might be: Why is this standard proteomics exposé important for understanding how we analyse protein complexes? Essentially, complex-centric proteome profiling extends the targeted proteomics rationale of SWATH-MS to the level of protein complexes. Instead of aiming to identify novel protein complexes in a given sample, we take advantage of prior knowledge stored in protein complex databases to detect and quantify these complexes in our dataset. The experimental data is based on a highly optimised workflow, combining size-exclusion chromatography (SEC) to separate protein complexes from native cell lysates according to their molecular weight, and SWATH-MS to detect and quantify the peptides and proteins along the SEC dimension. Looking at the resulting SEC-SWATH-MS data structure, you can immediately see the parallels to multiplexed SWATH-MS data (Figure 2). These similarities in data structure also made it possible to extend the use of target/decoy models to estimate false discovery rates from the level of protein identification to the level of complex identification.
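As a rough illustration of the target/decoy idea at the complex level, the sketch below estimates the false discovery rate at a score cutoff as the number of decoy complexes passing the cutoff divided by the number of target complexes passing it. It is a generic estimate under the assumption of equally many target and decoy queries, not the procedure implemented in CCprofiler; the scores and function names are hypothetical.

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimated FDR among targets scoring at or above the threshold.

    Assumes equally many target and decoy complex queries, so the count of
    passing decoys approximates the count of false positive targets.
    """
    n_targets = sum(score >= threshold for score in target_scores)
    n_decoys = sum(score >= threshold for score in decoy_scores)
    if n_targets == 0:
        return 0.0
    return min(1.0, n_decoys / n_targets)


def lowest_threshold_at_fdr(target_scores, decoy_scores, max_fdr):
    """Lowest observed score cutoff whose estimated FDR does not exceed max_fdr."""
    for threshold in sorted(set(target_scores) | set(decoy_scores)):
        if estimate_fdr(target_scores, decoy_scores, threshold) <= max_fdr:
            return threshold
    return None


# Hypothetical co-elution scores for ten target and ten decoy complex queries.
target_scores = [0.95, 0.92, 0.90, 0.88, 0.85, 0.80, 0.60, 0.40, 0.35, 0.20]
decoy_scores = [0.82, 0.45, 0.38, 0.30, 0.25, 0.22, 0.18, 0.15, 0.10, 0.05]

fdr_at_half = estimate_fdr(target_scores, decoy_scores, 0.5)
cutoff = lowest_threshold_at_fdr(target_scores, decoy_scores, max_fdr=0.05)
print(round(fdr_at_half, 3), cutoff)
```

With these made-up scores, seven targets and one decoy pass a cutoff of 0.5, giving an estimated FDR of about 14%. Only cutoffs above the best-scoring decoy bring the estimate below 5%.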
In our primary study, we demonstrated that the benefits of targeted DIA/SWATH-MS analysis also apply to the targeted, complex-centric analysis of SEC-SWATH-MS data (3). We could confidently detect over 500 protein complexes annotated in the CORUM complex database and were able to further investigate sub-complexes and the relative contribution of subunits to different assemblies. In our recent protocol (4), we describe the entire workflow, from sample preparation and SEC-SWATH-MS all the way to the complex-centric analysis that is implemented in our customised R package called CCprofiler (https://github.com/CCprofiler/CCprofiler).
We envision that the targeted, complex-centric analysis concept will shift the perspective in the interactomics field from complex identification to a more quantitative focus on the sensitive, parallel and consistent detection and quantification of protein assemblies across multiple samples. This approach necessitates the availability of high-quality and increasingly comprehensive protein complex databases as a resource of prior knowledge. Future developments will further focus on scaling up the experimental and computational workflow to enable large comparative studies with a focus on protein complex rearrangements between multiple conditions and potentially also in a clinical context.
1. Gillet LC, Navarro P, Tate S, et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol Cell Proteomics. 2012;11(6):O111.016717. https://doi.org/10.1074/mcp.O111.016717
2. Ting YS, Egertson JD, Payne SH, et al. Peptide-Centric Proteome Analysis: An Alternative Strategy for the Analysis of Tandem Mass Spectrometry Data. Mol Cell Proteomics. 2015;14(9):2301-2307. https://doi.org/10.1074/mcp.O114.047035
3. Heusel M, Bludau I, Rosenberger G, et al. Complex-centric proteome profiling by SEC-SWATH-MS. Mol Syst Biol. 2019;15(1):e8438. https://doi.org/10.15252/msb.20188438
4. Bludau I, Heusel M, Frank M, et al. Complex-centric proteome profiling by SEC-SWATH-MS for the parallel detection of hundreds of protein complexes. Nat Protoc. 2020;15(8):2341-2386. https://doi.org/10.1038/s41596-020-0332-6