
MASST 2.0

The Dorrestein Lab introduced the Mass Spectrometry Search Tool (MASST) – a web-enabled mass spectrometry search engine – in 2020. What makes this platform special? It enables users to search a specific tandem mass spectrum, the “fingerprint” of molecules, against spectral data that have been deposited in GNPS/MassIVE – one of the largest public repositories for mass spectrometry data in the world (1). 

But some limitations remain – especially in untargeted metabolomics for microbial analysis. So, the researchers decided to tackle those challenges, while enhancing the performance of MASST, with the development of microbeMASST (2). 

Here, Simone Zuffa, Postdoctoral Scholar at Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, USA, and primary author of the study, discusses the design of their metabolomics platform – and the future of MS-based metabolomics…

What are the current limitations in untargeted metabolomics experiments – specifically when it comes to microbial metabolites?
 

One main limitation of untargeted metabolomics is annotation. Most of the molecules we can capture in a given biological sample are unknown – we have no idea what they are nor do we know what they do. To identify these molecules, we compare their acquired “fingerprints” to specialized reference libraries. Unfortunately, these libraries mainly contain primary metabolites, such as the TCA cycle molecules, or other molecules that can be easily produced and purchased. This approach leaves out thousands or possibly millions of molecules that can only be produced by specific microorganisms – molecules that may play an important role in regulating different ecosystems or human health. 
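To make the idea of comparing "fingerprints" concrete, here is a minimal sketch of a cosine score between two MS/MS peak lists. It is only an illustration: the function name `cosine_score`, the 0.02 m/z tolerance, and the greedy peak matching are assumptions made for this example, and it is not the scoring used by MASST or by any particular reference library.

```python
import math


def cosine_score(query, reference, tolerance=0.02):
    """Cosine similarity between two peak lists of (m/z, intensity) tuples.

    Hypothetical helper for illustration; tolerance is an assumed value.
    """
    # Collect every pair of peaks whose m/z values agree within the tolerance.
    candidates = [
        (q_int * r_int, qi, ri)
        for qi, (q_mz, q_int) in enumerate(query)
        for ri, (r_mz, r_int) in enumerate(reference)
        if abs(q_mz - r_mz) <= tolerance
    ]
    # Greedy matching: take the strongest pairs first, using each peak once.
    candidates.sort(reverse=True)
    used_query, used_reference = set(), set()
    dot = 0.0
    for product, qi, ri in candidates:
        if qi in used_query or ri in used_reference:
            continue
        used_query.add(qi)
        used_reference.add(ri)
        dot += product
    # Normalize by the intensity vector lengths of both spectra.
    norm_q = math.sqrt(sum(intensity ** 2 for _, intensity in query))
    norm_r = math.sqrt(sum(intensity ** 2 for _, intensity in reference))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)


# Made-up peak lists, purely for demonstration.
unknown = [(100.00, 3.0), (150.00, 4.0)]
library_entry = [(100.01, 3.0), (200.00, 4.0)]
score = cosine_score(unknown, library_entry)
```

A score near 1 means the two fragmentation patterns are nearly identical, and a score near 0 means they share almost no peaks. Real tools add further steps, such as precursor mass filtering and intensity scaling, that this sketch leaves out.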

We knew our original MASST platform could tackle those issues, but we also knew that the engine itself presented some limitations. A single search with the original MASST would take around 20 minutes, making it impossible to programmatically search for hundreds or thousands of MS/MS spectra, and the obtained output was not easily interpretable because it was not associated with metadata. So, we developed microbeMASST – an evolved and domain-specific version of MASST.

How exactly does microbeMASST address these challenges?
 

Extracting, purifying, and determining the structure of unknown molecules is a very long and difficult process. People have been working for more than 30 years on a single molecule! Instead of trying to identify molecule by molecule, we decided to flip the question: is the searched molecule of microbial origin? The answer could hopefully help guide the work of many scientists in different fields.

So, we have curated a reference database acquired exclusively from microbial monocultures. It contains more than 100 million MS/MS spectra acquired from more than 60,000 samples of microorganisms belonging to thousands of different species. 

microbeMASST also provides a visual representation of these results in an interactive taxonomic tree – which drastically improves the interpretability of the searches. Its reference database is open source – new and relevant data deposited in GNPS/MassIVE will be routinely added to this resource, which will scale up together with the willingness of researchers around the world to publicly share the data acquired from their experiments.

Did you face any particular obstacles during development? If so, how did you overcome them?
 

There are always obstacles during development! But thanks to co-authors Robin Schmid and Anelize Bauermeister and our collaborators, we overcame them. One specific problem was metadata curation – harmonization is always a tedious and hard process, especially with public data. We had to deal with isolates that did not have NCBI IDs and with inconsistencies between taxonomic names and IDs. We also had to decide whether to generate a phylogenetic or a taxonomic tree; we eventually decided on the latter because most of the samples did not have associated genomic data. 

Coordinating the work of the collaborators was also challenging – we have around 100 authors on the paper who acquired and deposited data for microbeMASST, but some of them were not familiar with the GNPS environment. It was thanks to strong communication and collaboration that we managed to carry out our research as painlessly as possible. 

How user friendly is microbeMASST?
 

We believe that microbeMASST is extremely user friendly and we are very proud of it! No training is required – maybe you just need to have an idea of what mass spectrometry is. But that’s it. If you have an MS/MS spectrum, you just have to copy-paste it and click search. We also have detailed documentation and several YouTube videos showcasing how to use the tool, so that researchers with different expertise levels and scientific backgrounds can use microbeMASST.

What are some of the potential applications of microbeMASST? 
 

In our recent paper, we showcase that microbeMASST is accurate in identifying producers of medically relevant microbial molecules and that it can be used to search for thousands of MS/MS spectra in just a few hours – something that would have previously taken five months. We are confident microbeMASST can be used to search for any molecules of interest and see if they can be found in different microorganisms, speeding up the process of understanding the types of molecules microbes produce. I would like to highlight that we were also able to search for unknown molecules via their MS/MS spectra. No other tools allow you to do this.
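The "five months" figure can be sanity-checked against the roughly 20-minute search time quoted earlier for the original MASST. The sketch below is back-of-the-envelope arithmetic only: the count of 10,000 spectra is a hypothetical stand-in for "thousands", and it assumes searches run strictly one after another with no downtime.

```python
MINUTES_PER_SEARCH = 20  # approximate time per search with the original MASST


def sequential_search_days(n_spectra, minutes_per_search=MINUTES_PER_SEARCH):
    """Days needed to run n_spectra searches back to back."""
    return n_spectra * minutes_per_search / (60 * 24)


n_spectra = 10_000  # hypothetical batch size, not a figure from the paper
days = sequential_search_days(n_spectra)
print(f"{n_spectra} searches: about {days:.0f} days ({days / 30:.1f} months)")
# prints "10000 searches: about 139 days (4.6 months)"
```

Under these assumptions, a batch on the order of ten thousand spectra lands close to the five months mentioned above.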

But there are also several other applications outside of microbial analysis we did not showcase in the paper. One example is strengthening multi-omics observations. Let’s say you have acquired both metabolomics and metagenomic data from stool samples. Then you apply your favorite correlation or co-occurrence analysis and you find out that microorganism “x” is associated with unknown molecule “y.” Is this real or just a spurious association? Now, you can search for the unknown molecule y and maybe observe that it was actually previously detected in a culture of microorganism x. This represents an orthogonal validation which will hopefully help researchers better understand the relationship between microbes and their metabolites.

How do you see microbeMASST evolving?
 

microbeMASST is exclusively based on public data. The more data the mass spec community deposits, the more powerful it will become. Unfortunately, researchers do not always deposit data associated with their work – something that, on the contrary, routinely happens in the genomic field – but this is changing fast. Open science dramatically speeds up innovation and discovery. 

Additionally, Dr. Yasin El Abiead and Dr. Ming Wang are actively working on integrating and harmonizing the other major public mass spectrometry data repositories – MetaboLights and Metabolomics Workbench – so, in the future, no matter where you deposit your data, it will be easily integrated and reusable for large-scale analyses. Right now, we can only connect metabolites to taxonomy but, in the future, we aim to be able to connect them to phylogeny.

Finally, our lab is also actively developing other domain-specific MASSTs to reach communities of interest, such as one based on plant extracts. Personally, I am interested in looking at how bacterial metabolites can shape neurodevelopment and I have a couple of studies that are entering their final stages. I cannot wait to share the findings.


  1. M Wang et al., Nat Biotech, 38, 23–26 (2020). DOI: 10.1038/s41587-019-0375-9. 
  2. S Zuffa et al., Nat Microbiol, 9, 336–345 (2024). DOI: 10.1038/s41564-023-01575-9.
About the Author
Markella Loi

Associate Editor, The Analytical Scientist
