Three Gurus of Proteomics

On the Major Challenges in Proteomics

Christian Huber: As in all analytical areas, coverage and reliable quantification are major issues for proteomics. It is currently impossible to cover the complete proteome, either qualitatively or quantitatively, because of the very large number of proteins present in proteomics samples and their wide dynamic range. The result is potential blind spots involving proteins that are of real interest. Moreover, repeated proteome measurements rarely detect exactly the same set of proteins, so finding the important parameters in all samples is not a straightforward task.

Another issue is obtaining and preparing representative biological/medical samples. The observed differences in proteomic measurements may reflect biological variability between individuals rather than real differences in the biological processes under observation.

These problems will likely be addressed by the very rapid ongoing development of methods and instruments. Nevertheless, we may have to continue to employ a toolbox of different approaches to tackle proteomic samples of high informational content, which unfortunately adds significantly to cost and effort. What is needed are completely new workflows, devised by teams with integrated expertise. These workflows should start with the selection of samples that represent the biological state to be measured; continue with experimental design, sample preparation, identification and quantification measurements, and bioinformatic data analysis; and finish with biological interpretation of the data.

The biggest challenge of all, in my view, is the treatment of the data. In a typical experiment, information has to be extracted from tens of thousands of mass spectra; to do this, we rely on appropriate software tools that do the job automatically. But automation carries the risk of identifying artifacts, which can lead to false conclusions. In addition, the translation of analytical data into biological information remains difficult and can only be addressed through interaction among analytical chemists, bioinformaticians, biologists and clinicians.
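One standard safeguard against such automated artifacts is the target-decoy approach, in which spectra are also searched against a reversed ("decoy") sequence database so that the false discovery rate (FDR) of the identifications can be estimated. The sketch below is illustrative only; the function name and the scores are hypothetical, not the API of any real search engine:

```python
def estimate_fdr(target_scores, decoy_scores, threshold):
    """Target-decoy FDR estimate: the number of decoy matches passing
    the score threshold approximates the number of false target matches."""
    targets = sum(s >= threshold for s in target_scores)
    decoys = sum(s >= threshold for s in decoy_scores)
    return decoys / targets if targets else 0.0

# Hypothetical peptide-spectrum match scores
target = [45.0, 38.2, 12.1, 9.8, 33.5, 7.2]
decoy = [11.0, 8.5, 6.3, 9.1, 5.0, 7.7]
print(estimate_fdr(target, decoy, threshold=10.0))  # → 0.25
```

In practice the threshold is tuned so that the estimated FDR stays below a chosen level (commonly 1%), which is one way pipelines keep automated identification from propagating artifacts into the biology.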

John Yates: Some of the issues that Christian raises are being addressed, others less so. Actually, I feel that proteomics is pretty well evolved at this point. It is having (and has had) a huge impact on the biological sciences. However, most of that happens behind the scenes because the “front page” stories are all about the discovery and not how it was accomplished. Certainly, bottlenecks persist at each of the stages of proteomics – sample preparation, separations, mass spectrometry (MS) analysis, and data analysis – but that is not to say that these issues are serious impediments, rather that improvements in these areas will advance the state of the art.

Funding is an issue. As I see it, when funders face shrinking budgets, their priorities narrow, which adds a burden to proteomics research. In tight money times, the focus is on the core mission; getting a technology-oriented grant funded requires a strong argument that advancing technology is key to a better understanding of a disease.

Industrial research efforts often follow the money (or where they think the money will be) rather than identifying and solving the technical limitations and challenges. MS manufacturers, however, have been fairly strategic in their instrument development over the last decade and, as a result, have driven many of the advances in proteomics.

Finally, I should add that we certainly need better methods for top-down proteomics (the analysis of intact proteins), which remains a huge challenge with many technical issues.

Barry Karger: It is remarkable how much the field of proteomics has advanced in the past five years. At the same time, it is clear that we are limited in the problems that we can solve. In bottom-up proteomics, for example, we measure peptides, not proteins. Let’s say we quantitate five or more peptides: the result we get will be an average over all the proteoforms of that protein; an average of the post-translational modifications (PTMs), mutations, splice variants, and clipped forms. But it may well be that only specific forms are functionally active. On top of this, even when modified peptides (phosphorylated peptides, for example) are studied, multiple sites of modification mean that we cannot tell which combinations of modifications constitute specific protein molecules. Thus, one major issue is the analysis of intact molecules (top-down proteomics) using liquid chromatography (LC)- or capillary electrophoresis (CE)-MS. Full quantitative analysis of all proteoforms of a protein is not currently possible, except for simple, low-molecular-weight proteins; improvements in MS instrumentation, fragmentation processes, data analysis, and separations are required. Since this is an area of current interest in the field, we can certainly anticipate advances over the next few years.

A second major challenge is refining proteomic analysis of trace-level biological samples down to hundreds or tens of cells and, ultimately, down to single cells. There are many instances where this capacity would be a crucial advance: to analyze cells following a needle biopsy, for instance, or in the analysis of stem cells or circulating tumor cells, where vanishingly small numbers of cells can be obtained. Here, integration of effective cell capture, sample preparation with minimal sample loss, narrow-bore LC with sub-25 nL/min flow rates, high-sensitivity/high-resolution MS, and data processing is required. As with top-down MS, work is ongoing in ultra-trace analysis, but further advances are necessary.

On Driving Proteomics Progress

CH: Because of the complexity of proteomic samples, separation is key in proteome analysis. Two-dimensional gel electrophoresis is the gold standard, although chromatographic technologies and multidimensional combinations thereof represent real alternatives in terms of separation selectivity and peak capacity. MS will continue to be the key technology for high-throughput identification of the peptides derived from proteins. Here, the newest technologies, including (ultra-)high-resolution Orbitrap or time-of-flight mass analyzers, usually implemented in hybrid instruments, are enabling protein identification at unprecedented speed, sensitivity, and dynamic range.

However, as I mentioned earlier, incomplete proteome coverage will remain the most challenging issue, and can only be resolved by a combination of greater resolving power of the separation methods; higher scanning speed and lower detection limits of mass spectrometers; and increased identification success rates for database-searching algorithms.

JY: Big advances in proteomics have come from improvements in MS and LC. Faster-scanning, more sensitive instruments have worked well with ultra-high-pressure LC (UHPLC), for instance. Data analysis is pretty robust, but the bottleneck lies primarily in translating data into knowledge; interfacing the results of proteomic experiments with the other knowledge sources available on the Internet will be key to solving this problem. I expect smaller gains will come from better sample preparation methods, but these will be necessary for proteomics to be truly comprehensive.

Are There Mature Areas of Proteomics?

BK: I would not use the term mature; there are, however, some areas that are more advanced than others. An example is the use of two-dimensional LC separations coupled to MS, an approach demonstrated in many publications. Some state-of-the-art labs can identify and quantitate close to 10,000 proteins with this approach, and many can do the same for at least 4,000 – but only if enough sample is available. Data processing is becoming easier and better with the emergence of spectral libraries. Faster mass spectrometers will also increase the number of peptides and proteins that can be identified.

A second advanced area is multiple reaction monitoring (MRM) for quantitation of specific peptides, and thus of proteins. Triple quadrupoles continue to improve in sensitivity, and the availability of isotopically labeled standards means that the approach is widely used.
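The quantitation principle behind such stable-isotope dilution experiments fits in a few lines: the ratio of the endogenous ("light") peptide's peak area to that of a spiked, isotopically labeled ("heavy") standard of known amount gives the absolute quantity. A minimal sketch with hypothetical peak areas and spike amount:

```python
def quantify_peptide(light_area, heavy_area, heavy_spike_fmol):
    """Stable-isotope dilution: the light/heavy peak-area ratio
    scaled by the known spike amount yields the endogenous amount."""
    return (light_area / heavy_area) * heavy_spike_fmol

# Hypothetical MRM transition peak areas
amount = quantify_peptide(light_area=2.4e6, heavy_area=1.2e6, heavy_spike_fmol=50.0)
print(f"{amount:.1f} fmol")  # → 100.0 fmol
```

Because the heavy standard co-elutes with, and ionizes identically to, its light counterpart, the ratio cancels out most run-to-run variability, which is why triple-quadrupole MRM has become so widely used for targeted quantitation.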

CH: I believe that all areas are still in rapid development and that none can be regarded as mature. The more classical approaches of high-throughput shotgun protein identification and differential quantification are now being complemented by high-throughput targeted and untargeted absolute quantification using selected reaction monitoring or data-independent fragmentation. Proteomic analyses that include or target post-translational modifications, such as phosphorylation, N-terminal acetylation, or glycosylation, are gaining importance.

On the Role of Separation Science in Proteomics

BK: Sometimes I feel that researchers focusing on separations and those studying mass spectrometry are in two different universes. Many in the separations field consider only the separation system – the column type, optimized LC or CE conditions, and so on. Their studies often involve only UV or fluorescence detection. In parallel, MS specialists focus mainly on instrumentation or data processing and consider the separation system to be a plug-in device. Since our goal in proteomics is to solve a problem, we need to view the whole platform as an integrated system. For example, the flow rate from the LC or CE column can dramatically affect the ionization efficiency of electrospray, and thus the signal. If high sensitivity is required, then low flow rates are important. On the other hand, the duty cycle of the MS must be considered along with the speed of separation. Beyond separation-MS coupling, sample preparation ahead of injection must be integrated into the analytical scheme. Sample preparation can so often be the Achilles’ heel of the analysis.

Of course, optimized separation systems will be different for bottom-up or middle/top-down analysis. While chromatographic columns are excellent, and getting better, for peptide (bottom-up) separations, the need for improved separation of intact proteins in complex mixtures is all too clear. Perhaps CE-MS will turn out to be an effective tool, given that the separation efficiency and recovery from the open tube column should be high. Even for individual proteins, the number of proteoforms can be in the hundreds, and some of them will be isobaric. If one pulls down a protein complex, the numbers will again be high. Some researchers are attempting top-down analysis of full proteomes. Complexity here will be extreme, so the separation challenges are huge in the top-down arena. Multi-dimensional separations, along with ion mobility separations, will be necessary. In fact, top-down proteomics offers major challenges and, therefore, opportunities for separation science.

CH: The primary goal of sufficient separation of all compounds in the sample mixture remains valid for proteome analysis. However, sample complexity in proteome analysis is usually so high that a full separation is impossible; MS must aid in deconvoluting sample complexity. Increasing separation efficiency and peak capacity are high priority goals. Since ionization, especially electrospray, is influenced by sample complexity, I cannot imagine successful proteome analysis workflows without an integrated separation step.

On Directions for Mass Spectrometry

CH: MS will remain the priority platform for protein identification. However, I can foresee a shift away from the peptide-oriented methods to top-down proteomics.

JY: MS has been a central driving force in proteomics for two decades. Without the technological advances of the last decade (the Orbitrap, for example), proteomics would not have moved as rapidly as it has. Areas where mass spectrometers have improved and need to continue to improve are scan speed, sensitivity, and dynamic range. Mass resolution and mass accuracy have probably gone as far as they need to go for effective peptide analysis.

BK: Most workers would agree that operating an MS system is very complicated: it requires expertise in the selection of MS conditions, the variety of fragmentation approaches, the interpretation of the data (both manual and automatic), quantitation, and reproducibility. At leading proteomics laboratories, an entire team of specialists gets involved. Add to this the fact that at each American Society for Mass Spectrometry (ASMS) meeting, new instrumentation is introduced that changes the performance to such a dramatic extent that one continually feels the need to purchase new equipment. And yet, it is obvious that MS must be simplified in the near term as broader applications emerge, for example, biomarker quantitation for the clinical lab, protein quantitation as a substitute for Western blotting, and lot release in biotechnology. It is interesting that the recent Human Proteome Organization (HUPO) meeting in Japan also emphasized this point. There has been some effort in simplification of the MS front end, such as microfluidic devices for separation that feature plug-and-play designs, but mass spectrometers themselves are, if anything, getting more complex. I hope that the instrument manufacturers will devote greater resources to the difficult task of simplifying MS, to allow non-specialists to operate the instruments on their own.

On Future Directions

BK: Ion mobility will be used in the complex mixture analyses of proteomics: we will undoubtedly see higher resolution ion mobility systems and improved sensitivity. Note that one-dimensional LC-ion mobility-MS is actually a three-dimensional separation system – all on-line. One or more off-line separations could be added to this system as required. The number of separation steps will depend on the complexity of separation and the speed (duty cycle) of the MS. We will see more on-the-fly MS with feedback to select particular ions for further study; at the moment we are in the early stages of this kind of analysis.
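To a first approximation, the peak capacities of fully orthogonal on-line dimensions multiply, which is what makes inserting ion mobility between LC and MS so attractive. A back-of-the-envelope sketch, using hypothetical per-dimension peak capacities chosen purely for illustration:

```python
from math import prod

def combined_peak_capacity(capacities):
    """Ideal total peak capacity of a multidimensional separation,
    assuming fully orthogonal dimensions: the product of the parts."""
    return prod(capacities)

# Hypothetical values: LC ~300, ion mobility ~50, MS m/z dimension ~1000
print(combined_peak_capacity([300, 50, 1000]))  # → 15000000
```

Real systems fall well short of this ideal because the dimensions are never fully orthogonal, but the multiplicative scaling is why LC-ion mobility-MS counts as a three-dimensional separation system.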

From a data analysis point of view, we need new bioinformatic approaches to combine the proteomic data we generate with other omics data, particularly genomic, but also, metabolomic and lipidomic. The integration of all omics information will provide a much more detailed picture of biological processes, a necessary advance for better understanding specific diseases. We must always remember that proteomics, while essential, is but one piece of the puzzle.

The years ahead will be exciting ones. The whole community will play a role in the many forthcoming advances.

JY: Personally, I’m watching one method and working with another that has promise… I think CE has great potential for the analysis of intact proteins with MS. There are still some technical issues, but I’ve been impressed with the separations and detection limits. A method I’m watching carefully is UV photodissociation (UVPD) of peptides and proteins. Progress in this area has been staggering. When I was in graduate school, the Hunt laboratory (people.virginia.edu/~dfh/) was trying UVPD on a Fourier transform MS instrument and it was producing awful results. I didn’t think it would go anywhere. However, recent progress in the Brodbelt (brodbelt.cm.utexas.edu/research), Reilly (reilly.chem.indiana.edu), and Julian (www.faculty.ucr.edu/~ryanj/) laboratories has gotten me excited about the technique – it looks like it has a great future.

CH: When proteins are digested, a lot of analytical information is lost because of incomplete sequence coverage. A dramatic shift towards intact protein analysis is inevitable – after all, the intact molecules are the ones providing biological function. Proteins are more individual than peptides, so the analytical methods being developed are moving away from the more generic shotgun approaches that work for peptide analysis. I believe that more effort will be required to tune and optimize these methods for specific proteomic problems.

About the Authors
Christian Huber

Fascinated by the diversity of biological macromolecules, Christian Huber strove to find biomedical applications for the analytical methods he developed. After becoming chair of analytical chemistry at Saarland University, he moved into proteome analysis. More recently, he joined the University of Salzburg’s molecular biology department in Austria. “It is fascinating how state-of-the-art analytical technologies, most of all chromatography and mass spectrometry, can be used to study the molecular biology of cancer, nanotoxicity, or drug integrity,” he says.


Barry Karger

Barry Karger is Director of the Barnett Institute of Chemical and Biological Analysis and James L. Waters Chair in Analytical Chemistry at Northeastern University, Massachusetts, USA. “I find science exciting, and I am interested not only in analytical chemistry but in the problems it must address.” Barry integrates various technologies into separations and detection, especially LC- and CE-MS. “I try to understand what’s important in the areas where we apply our technology, for example, biotechnology or clinical research.”


John Yates

John Yates is Ernest W. Hahn Professor of Chemical Physiology and Molecular and Cellular Neurobiology at The Scripps Research Institute, La Jolla, California, USA. He was recently named Editor of the Journal of Proteome Research.
