The Road to HPLC2018 Part IV: The Grand Challenge of Whole Proteins
The intact protein separations crucial for top-down mass spectrometry continue to thwart chromatographers – will we see progress in 2018?
John E. Wiktorowicz, Neil Kelleher
The essential questions of how and to what purpose we investigate proteomics are becoming ever more pressing as we unravel the complexity of the human genome and its relationship to the proteome. Recent analyses suggest that the human genome contains roughly 20,700 genes (1). However, the complexity of the human proteome also reflects multiple splice variants (2), which yield an estimated 205,000 protein-coding transcripts (3). With over 400 different types of post-translational modifications currently known, and without even contemplating the vast combinatorial universe that implies, there are at least 1,000,000 distinct protein forms within a given human cell (4).
The huge diversity of proteoforms and their post-translational modifications leads us to question what their functional role is. Could their dysfunction underlie many human diseases? Essentially, the entire spectrum of human cellular biology can be traced to protein-level post-translational modifications, but compared with the genome, the proteome is dramatically under-mapped.
Post-translational modifications can decorate proteins at multiple sites simultaneously, and the resulting complexity presents an unmet challenge for current proteomic approaches. The difficulty lies not only in the sheer number of proteoforms: post-translational modifications alter protein properties such as size, net charge, and hydrophobicity, all of which are exploited to achieve separations, and they also affect mass spectrometric sensitivity and peak capacity.
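To give a sense of scale for this combinatorial problem, the minimal Python sketch below counts the modification patterns available to a single protein sequence. It assumes that every site varies independently of the others, which is a simplification, and the site counts are hypothetical values chosen for illustration rather than figures from any dataset.

```python
from math import prod


def count_modification_patterns(states_per_site):
    """Count the distinct modification patterns for one protein sequence.

    Each entry in states_per_site is the number of states a site can take,
    including the unmodified state. Sites are assumed to vary independently
    (an illustrative simplification, not a claim about real proteins).
    """
    return prod(states_per_site)


# Hypothetical protein with 10 sites, each either modified or unmodified.
binary_sites = [2] * 10
print(count_modification_patterns(binary_sites))  # 2**10 = 1024

# Hypothetical protein with 5 sites, each unmodified or carrying one of 3 marks.
multi_state_sites = [4] * 5
print(count_modification_patterns(multi_state_sites))  # 4**5 = 1024
```

Even these small hypothetical cases yield over a thousand patterns per protein, which illustrates how quickly the number of possible proteoforms can grow with the number of modifiable sites.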
Among the new tools being developed, top-down mass spectrometry (TDMS) analyzes intact proteins, and in doing so is best suited to preserve the complexity of multiply modified proteoforms. However, intact proteins and their complexes from 10 kDa to >5 MDa present significant challenges to TDMS runs in both denaturing and native modes. As a consequence, deep proteomics by TDMS is extraordinarily reliant on highly resolving protein separations, their peak capacities, and their ability to quantitatively recover the proteins that are applied. Moreover, most separation systems historically used for conventional proteomic MS are better suited to separations of peptides, not intact proteins. Hence, there is a growing demand for efficient, rapid, and quantitative separations for intact proteins and TDMS.
No single separation technology can meet the demands that proteomics places on selectivity, orthogonality, and quantitative recovery. Systems that combine multiple orthogonal dimensions with as little protein loss as possible are therefore the most likely to succeed. Ideally, these would involve liquid-based, non-adsorptive (requiring no elution) formats, exploiting isoelectric point (pI), molecular size (radius of gyration), hydrophobicity, hydrophilicity, and ultimately mass by MS.
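One common way to reason about the appeal of multiple orthogonal dimensions is the product rule, under which the theoretical peak capacity of a fully orthogonal multidimensional separation is the product of the peak capacities of its individual dimensions. The Python sketch below applies that idealized rule and, alongside it, multiplies per-step recoveries to show how protein loss compounds. All numbers are hypothetical, and real systems fall short of the ideal because dimensions are never perfectly orthogonal.

```python
from math import prod


def ideal_peak_capacity(dimension_capacities):
    """Upper-bound peak capacity for fully orthogonal dimensions (product rule)."""
    return prod(dimension_capacities)


def overall_recovery(step_recoveries):
    """Fraction of protein remaining after sequential steps.

    Each entry is the fraction recovered in one step (0.0 to 1.0); losses are
    assumed to be independent between steps.
    """
    return prod(step_recoveries)


# Hypothetical three-dimensional workflow (illustrative values only).
capacities = [20, 50, 100]      # peak capacity of each dimension
recoveries = [0.8, 0.8, 0.8]    # fraction of protein recovered in each step

print(ideal_peak_capacity(capacities))        # 20 * 50 * 100 = 100000
print(round(overall_recovery(recoveries), 3))  # about half the protein remains
```

The trade-off is visible in the two results: each added dimension multiplies the theoretical resolving power, but it also multiplies the losses, which is why formats that need no elution step are attractive.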
Many approaches have been devised that exploit one or more of the parameters described above, either alone or in combination (for recent reviews, see (5–7)); however, none has yet demonstrated the ability to tackle the full complexity of cellular proteoforms.
The separation science community declared whole proteins a “Grand Challenge” at the HPLC conference in 2012. As a result, increased innovation and creativity are being brought to bear from both academia and industry in this growing sub-sector of proteomics. This effort will be critical for both genomics and proteomics to achieve their full potential in addressing the central issues in human health, and to realize the promise of precision medicine.
1. C Southan, “Last rolls of the yoyo: Assessing the human canonical protein count”, F1000Res 6, 448 (2017).
2. ML Uhlen et al., “Proteomics. Tissue-based map of the human proteome”, Science 347, 1260419 (2015).
3. Z Hu et al., “Revealing missing human protein isoforms based on ab initio prediction, RNA-seq and proteomics”, Sci Rep 5, 10940 (2015).
4. LM Smith, NL Kelleher, & The Consortium for Top Down Proteomics, “Proteoform: a single term describing protein complexity”, Nat Methods 10, 186–187 (2013).
5. S Stepanova & V Kasicka, “Analysis of proteins and peptides by electromigration methods in microchips”, J Sep Sci 40, 228–250 (2017).
6. KK Tetala & MA Vijayalakshmi, “A review on recent developments for biomolecule separation at analytical scale using microfluidic devices”, Anal Chim Acta 906, 7–21 (2016).
7. TK Toby, L Fornelli & NL Kelleher, “Progress in top-down proteomics and the analysis of proteoforms”, Annu Rev Anal Chem (Palo Alto Calif) 9, 499–519 (2016).
John E. Wiktorowicz is a Professor at the University of Texas Medical Branch at Galveston, Texas, USA.
Neil Kelleher is a Professor at Northwestern University, Evanston, Illinois, USA.