Our June issue, which went live online yesterday, includes an Analysis paper describing the results of a large-scale study aimed at pinpointing the root causes of irreproducibility in mass spectrometry-based proteomics. Despite the novel and valuable biological applications made possible by proteomics, and continuing impressive technological advances in mass spectrometry, the technology has been unable to fully shed its reputation for poor reproducibility.
To pinpoint the sources of irreproducibility, John Bergeron and colleagues, as part of a Human Proteome Organization (HUPO) effort, sent a test sample consisting of 20 purified proteins at equal concentrations to 27 different proteomics labs. The study designers asked these labs to identify the 20 proteins using whatever mass spectrometry instrumentation and workflows they routinely employed. Initially, only 7 labs correctly reported all 20 proteins! However, when the study designers re-analyzed the data from the labs that had failed the task, they found that almost all of them actually had mass spectra for all 20 proteins in hand. Most of the problems therefore stemmed from the database-searching approaches used to go from raw spectra to protein identifications. Many of the labs also reported ‘false positives’ – proteins that were not actually in the test sample. It turned out, however, that many of these false positives were genuine detections: they were contaminants introduced during sample handling.
The study reaches several interesting conclusions. First, and reassuringly, the authors found that the mass spectrometry technology itself is reproducible. However, because many complicated steps are required to go from an unknown sample to a protein identification, success varied widely among the groups, demonstrating the need for careful sample handling and proper training. The authors also state that improvements in database search engines, and in the proteomic databases themselves, are urgently needed.
This work also demonstrates the value of examining the reproducibility of new technologies and methods on a large scale, especially across labs, using carefully prepared test samples. Such studies can be expensive and time-consuming, but they are highly beneficial. Broad guidelines for Analysis papers are provided in our April 2008 editorial, and authors interested in submitting such studies are encouraged to contact the editors beforehand.