Conquering the Human Proteome
The Human Proteoform Project needs a US$1.3 billion moonshot to transform our understanding of protein-based disease
Margot Lespade | 7 min read | Interview
Neil Kelleher has previously discussed the many merits of top-down proteomics and how they will enable “the earlier and more precise detection of all human disease.” He is currently the Walter and Mary Elizabeth Glass Professor of Molecular Biosciences, Professor of Chemistry, and Professor of Medicine (Hematology & Oncology) at Northwestern University, Evanston, Illinois, USA. He is also the Director of the Chemistry of Life Processes Institute at Northwestern and Director of Northwestern Proteomics. Here, he answers our questions about challenges facing the proteomics field, the Human Proteoform Project and which areas will benefit most from this project.
Tell us a little about the current state of proteomics…
First, I think it’s important to mention the Human Genome Project – a great collaboration between labs across the world that resulted in the mapping and sequencing of all the genes in the human genome. Essentially, it allowed us to decode the instructions for life in the world of DNA.
However, in the world of proteins – for reasons I’ll go into later – we haven’t quite reached that level. The biggest difference between these two worlds is that one has had a “moonshot” project, whereas the other has not. In my opinion, the time has come for us to focus on protein biology. I hope that, within the next eight to 12 years, the ability, scale, and economic efficiency of our technology will meet the challenge our biology presents at the protein level.
What is the most exciting thing happening in proteomics today?
Single-cell proteomics is definitely up there! It means you can perform analysis via direct cell imaging. You can take an organ or a part of a tissue at microscopic levels, image it at single-cell resolution, and examine the proteins found there. It’s now possible to interrogate dozens of different proteins using antibody-based imaging technologies, so that’s one really cool development!
Other exciting things that come to mind include the Human Protein Atlas, which is a Sweden-based program that aims to map all human proteins, and the US$600 million Human Biomolecular Atlas Program (HuBMAP) out of the National Institutes of Health. HuBMAP is now approaching the mapping of 30 tissues in the human body at single-cell resolution. What makes this so exciting? It means we’re only a few years away from being over the hump and on the downward slide toward completion!
Single-cell proteoform analysis is also becoming possible in a few labs. Notice the wording here – proteoforms, not proteins! That is also going to become very important, because we are getting another level of molecular precision that will help to bring order to the world of human proteins.
What major challenges is the proteomics field facing?
I would say one of the major issues we face at the moment is funding. Essentially, we want to devise technologies that put genomics and proteomics on par with one another. In genomics, we can now do easy, inexpensive single-molecule DNA sequencing – and that’s because of the “moonshot” Human Genome Project. That is the level of investment we need in proteomics.
There has been a big push in the private sector in the last two years, with a particular focus on investment in single-molecule proteoform sequencing (SMPS). In fact, a few billion dollars have been promised over the next couple of years. The big question is – will there be a breakthrough in protein sequencing or single-molecule proteoform analysis in the next few years? Right now, there are 10 to 12 companies that have been stood up with significant or early-stage venture capital investment, many of which have come out of the genomics world wanting to nail down the world of proteins – so I think the future looks promising.
Tell us about the Human Proteome Project and the Human Proteoform Project. How do they differ?
The variety of words – proteins, proteome, proteomics, proteoforms – can be challenging, but there is an important distinction between these projects.
The Human Proteome Project was conceived all the way back in 2002 and found a home in the Human Proteome Organization (HUPO), which is very similar to the Human Genome Organization. HUPO has articulated two major versions of the Human Proteome Project over the years since. The first was rearticulated in 2010 and relaunched aggressively in 2012. Then, two groups with back-to-back Nature papers in 2014 completed a large part of that project, covering about three-quarters of all human proteins using bottom-up proteomics with peptides as the unit of measurement. In some sense, the Human Proteome Project is a little out of date today; it is, essentially, a low-resolution draft of the human proteome.
On the other hand, the Human Proteoform Project is more specific and uses top-down proteomics with different measurement philosophies and units of measurement. The project states that we must systematically discover proteoforms as they exist in all human cell types and body fluids, providing complete and absolute molecular specificity. The aim is to create a map of the human proteome at proteoform-level specificity.
What is your role within the Human Proteoform Project?
Well, you know when you have a wedding and the dance floor is open… I was one of the first people to start dancing. You could say my role is chief instigator! I’m also the founding president of the Consortium for Top-Down Proteomics.
Where is the project at the moment?
The Human Proteoform Project is not currently funded. At the base, we would probably need about $1.3 billion over 10 years; $130 million a year would allow us to begin aggressively and then bring in disruptive technologies later. There are three phases to the project and we haven’t even launched the first. But we’ve articulated the project and we’re actively seeking a worldwide consortium of like-minded scientists to participate.
Our paper framing the project came out in November 2021, making the science case and outlining the general terms (1). At its core, this project is a foundational project and the obvious next step after the Human Genome and Proteome Projects. We have the low-resolution draft; all we need now is the complete, high-resolution version!
Thankfully, receptivity today is quite different from what it was 10 to 15 years ago. A lot of investors in the private sector will realize that we can sequence proteins and, most importantly, they are going to begin to understand that our biology is proteoform-based. However, the Consortium for Top-Down Proteomics’ position is that this is a government play, just like it was for the Genome Project. It can absolutely be funded through established agency frameworks. The government can then be a catalyst to provide the definitive human reference proteome – just like it did in 2002 with the reference genome – and then everybody can feast on that information. That includes patients and advocacy groups, as well as companies and academic institutions discovering disease mechanisms and biomarkers. The project would vastly improve efficiency for basic and translational biomedical research.
So that’s the case for spending $130 million a year. The arguments in favor have always existed, but I think it’s important to think about receptivity, timing, the arc of history, and when the technology can be put in place to do it.
Is there a particular application area that you think might benefit most from the Human Proteoform Project?
Proteins and proteoforms are involved in all human disease; there is not a single human disease that does not involve proteins. So it’s that fundamental.
That said, I think neurodegeneration will be a key area to benefit from this project. We have yet to apply the planet’s best technologies to protein-based diseases such as Alzheimer’s, Parkinson’s, amyotrophic lateral sclerosis (ALS), upper motor neuron disease, and frontotemporal dementia. All of these disorders involve proteins going bad.
When we eventually have the right technology, regenerative medicine will also benefit greatly from the Human Proteoform Project. Unfortunately, we don’t currently have the tools to meet the scale of the analytical challenge. So, to me, the question is – how are we going to get there? Are we going to do it slowly? Or are we going to invest in a moonshot to get it done in the next decade?
1. LM Smith et al., “The human proteoform project: defining the human proteome,” Sci Adv, 7, eabk0734 (2021). DOI: 10.1126/sciadv.abk0734.