Getting to the Heart of the Proteoforms
How do we overcome the challenge of characterizing cardiac proteoforms?
Jonathan James | Quick Read
Heart disease is the number one global killer – and recent evidence suggests that proteoform analysis is important to understanding cardiac dysfunction (1). Mass spectrometry (MS) can be used to characterize the proteoforms found in cardiac disease states, but the approach suffers from a number of problems, such as a signal-to-noise ratio that decreases as molecular weight rises. Adapting open-source software, researchers at the University of Wisconsin-Madison have applied MS to identify a novel set of large proteoforms in human heart tissue (2). We asked the team (Lloyd Smith, Ying Ge, Leah Schaffer, Trisha Tucholski, and Michael Shortreed) to tell us more about the work.
What are proteoforms – and how do we identify them?
Different protein forms arise from biochemical processes – RNA processing, post-translational modifications and genetic variability being chief among them. Proteoforms can be identified by their intact mass and fragmentation data. A previous study by the Ge lab introduced a top-down proteomics platform using MS-compatible, serial size-exclusion chromatography to fractionate proteins extracted from human heart tissue – enabling a 15-fold increase in proteoform observations over 60 kDa (3). However, no proteoforms above 60 kDa were identified because of the difficulty in obtaining high-quality MS/MS data on a chromatographic timescale.
What can you tell us about your new workflow?
We previously pioneered an open-source software program – Proteoform Suite – that is capable of identifying proteoforms by intact mass alone, grouping them into distinct families. In this study, we augmented the software to allow us to determine candidate identifications for large proteoforms based on average mass and LC retention time. The large proteoform candidates that we selected informed our interpretation of previously acquired large proteoform fragmentation data, which – until now – had not provided positive identifications.
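As a rough illustration of candidate selection by average mass and LC retention time, the sketch below filters a list of theoretical candidates against one observed feature. It is not Proteoform Suite's implementation: the `Candidate` type, the `match_candidates` function, the tolerance defaults, and the idea of storing an expected retention time per candidate are all assumptions made for this example, and the numbers are placeholders rather than measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical theoretical proteoform used only for this sketch."""
    name: str
    average_mass_da: float      # theoretical average mass in Da
    retention_time_min: float   # assumed expected LC retention time in minutes


def match_candidates(observed_mass_da, observed_rt_min, candidates,
                     mass_tol_da=50.0, rt_tol_min=2.0):
    """Return candidates within both tolerances, closest mass first."""
    hits = [
        c for c in candidates
        if abs(c.average_mass_da - observed_mass_da) <= mass_tol_da
        and abs(c.retention_time_min - observed_rt_min) <= rt_tol_min
    ]
    return sorted(hits, key=lambda c: abs(c.average_mass_da - observed_mass_da))


if __name__ == "__main__":
    # Placeholder values chosen for illustration, not experimental data.
    database = [
        Candidate("candidate_A", 72000.0, 30.0),
        Candidate("candidate_B", 72030.0, 31.0),
        Candidate("candidate_C", 72020.0, 45.0),  # mass fits, retention time does not
        Candidate("candidate_D", 65000.0, 30.0),  # retention time fits, mass does not
    ]
    for hit in match_candidates(72010.0, 30.5, database):
        print(hit.name, hit.average_mass_da)
```

Requiring agreement on two independent measurements narrows the list, but with a wide mass window several candidates can still survive, which is why fragmentation data are needed to confirm an identification.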
We’ve been able to identify a number of important large heart proteoforms – notably, 72 kDa lamin A and 65 kDa lamin C, which were co-isolated and produced a complex fragmentation spectrum. We also identified endogenous 140 kDa myosin-binding protein C for the first time; proteoforms of this type are associated with various heart diseases.
What main challenges did you face?
We were unable to isotopically resolve the observed proteoforms because of their high molecular weight and the resolving power limitations associated with quadrupole-time-of-flight mass analysis. As a result, we used average mass and a wide search space to determine candidates, but this also generated many false positives and therefore required manual analysis of our data.
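A back-of-envelope calculation shows why isotopic resolution becomes hard at high mass. Neighboring isotopic peaks of an ion with charge z sit about 1.003355/z apart on the m/z axis, so the resolving power needed just to separate them, (m/z) divided by that spacing, comes out close to the proteoform's mass in daltons whatever the charge state. The sketch below works this through; the function names are hypothetical, and the criterion is a simplification, since real requirements depend on how resolving power is defined (for example at FWHM) and on peak shape.

```python
PROTON_MASS_DA = 1.007276       # mass of a proton
ISOTOPE_SPACING_DA = 1.003355   # mass difference between 13C and 12C


def mz(mass_da, charge):
    """m/z of a positive ion carrying `charge` protons."""
    return (mass_da + charge * PROTON_MASS_DA) / charge


def isotope_spacing_mz(charge):
    """Distance between adjacent isotopic peaks on the m/z axis."""
    return ISOTOPE_SPACING_DA / charge


def resolving_power_needed(mass_da, charge):
    """Approximate resolving power, (m/z) / delta(m/z), to separate adjacent isotopic peaks."""
    return mz(mass_da, charge) / isotope_spacing_mz(charge)


if __name__ == "__main__":
    # Charge states are arbitrary examples, not values from the study.
    for mass_da in (10_000.0, 72_000.0, 140_000.0):
        for charge in (30, 60):
            rp = resolving_power_needed(mass_da, charge)
            print(f"{mass_da:>9.0f} Da, z={charge}: m/z={mz(mass_da, charge):8.2f}, "
                  f"resolving power ~{rp:,.0f}")
```

When the instrument cannot reach that figure, the isotopic envelope merges into a single broad peak and only an average mass can be reported, which in turn calls for a wider search window.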
Larger proteoforms do not fragment as efficiently as smaller proteoforms, which makes collecting high-quality data challenging. More fragmentation spectra are required, which is difficult to achieve on an LC timescale. In addition, fragmentation spectra become increasingly complex as proteoforms get larger.
We hope to use intact mass analysis to construct proteoform families that will facilitate the selection of interesting candidates for more targeted data acquisition. A number of observed masses remain unidentified – likely, at least in part, because of the enormous number of uncharacterized proteoforms. In future work, we may also integrate other types of data, such as bottom-up peptide or RNA sequencing data, to create a more comprehensive database.
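One way to picture family construction from intact masses alone is to link any two observed masses whose difference matches a known modification, then collect the linked groups. The sketch below does this with a small union-find; it is an illustration under stated assumptions, not the Proteoform Suite algorithm. The `build_families` function, the 1 Da tolerance, and the three-entry table of average mass shifts are choices made for this example, and the input masses are placeholders.

```python
from itertools import combinations

# Average mass shifts (Da) for a few common modifications; a real table would be far longer.
MOD_DELTAS_DA = {
    "methylation": 14.0266,
    "acetylation": 42.0367,
    "phosphorylation": 79.9799,
}


def build_families(masses, tol_da=1.0):
    """Group masses connected by chains of known modification mass differences."""
    parent = list(range(len(masses)))

    def find(i):
        # Follow parent links to the group's root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(masses)), 2):
        diff = abs(masses[i] - masses[j])
        if any(abs(diff - delta) <= tol_da for delta in MOD_DELTAS_DA.values()):
            parent[find(i)] = find(j)  # merge the two groups

    families = {}
    for i, mass in enumerate(masses):
        families.setdefault(find(i), []).append(mass)
    return sorted(sorted(members) for members in families.values())


if __name__ == "__main__":
    # Placeholder masses for illustration only.
    observed = [65000.0, 65080.0, 65122.0, 72000.0]
    for family in build_families(observed):
        print(family)
```

With average masses of large proteoforms the tolerance has to be generous, so some links will be coincidental; a family built this way is a starting point for targeted acquisition rather than an identification.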
1. W Cai et al., “Top-down proteomics: technology advancements and applications to heart diseases”, Expert Rev Proteomics, 13, 711 (2016). DOI: 10.1080/14789450.2016.1209414
2. LV Schaffer et al., “Intact-mass analysis facilitating the identification of large human heart proteoforms”, Anal Chem, 91, 10937 (2019). DOI: 10.1021/acs.analchem.9b02343
3. W Cai et al., “Top-down proteomics of large proteins up to 223 kDa enabled by serial size exclusion chromatography strategy”, Anal Chem, 89, 5467 (2017). DOI: 10.1021/acs.analchem.7b00380