The Mixture Matcher
Arun Moorthy and Edward Sisco share details of the mass spectral search program designed specifically for seized drug analysis using DART-MS
What are the main challenges of mass spec data interpretation in general and in DART-MS?
Mass spectra are indirect measurements of molecular structure, and so data interpretation – which can also be thought of as compound identification or structure elucidation – can be challenging if you don’t know what you’re looking for. Peaks in a mass spectrum characterize the aggregate abundance of ions of specified mass-to-charge ratios, which means each peak could represent multiple substructures from the same molecule.
In traditional chromatography-based mass spectrometry approaches, you can usually be reasonably certain the peaks in a given mass spectrum are substructures of a single molecule. However, with ambient ionization mass spectrometry techniques – like DART-MS – that don’t lean on chromatography, interpretation is more difficult because each peak could be representative of multiple substructures from multiple molecules.
And that’s why you developed a data interpretation tool?
The NIST/NIJ DART-MS data interpretation tool (DIT) is a simple application that allows users working in seized drug analysis to compare their measured mass spectra – which are usually of mixtures – to a library of over 1,000 compounds of interest. What differentiates the DIT from other mass spectral library search programs is that it was specifically designed to work with mixture mass spectra (those collected without chromatography). It was also designed to leverage spectra of the same mixture measured at multiple in-source collision-induced dissociation (IS-CID) energy levels – which enables analysis of both molecular ions and major fragment ions.
To operate this tool, the user loads up to three IS-CID mass spectra (measurements at a low, medium, and high fragmentation level) of their unknown mixture. At the press of a button, the algorithm that underpins the DIT – the inverted library search algorithm – then looks for partial pattern matches between the query mass spectra and the pure compound library mass spectra. The results of the search are then compiled into tables that summarize possible compounds in the mixture with metrics a user can use to make decisions. It is a very focused tool with limited features, but this focus also keeps it simple to use.
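To make the idea of a partial pattern match more concrete, here is a minimal Python sketch of a reverse-style search: each library compound is scored only on how many of its own peaks appear in the query, so extra query peaks contributed by other mixture components do not count against it. This is an illustration under stated assumptions, not the DIT's inverted library search algorithm. The function names, the 0.005 m/z tolerance, the scoring formula, and the toy spectra are all hypothetical.

```python
from typing import Dict, List, Tuple

# A spectrum is modelled as a mapping of m/z -> relative abundance (assumption).
Spectrum = Dict[float, float]


def fraction_of_library_peaks_found(
    query: Spectrum, library: Spectrum, mz_tol: float = 0.005
) -> float:
    """Abundance-weighted fraction of the library compound's peaks
    that have a counterpart in the query within mz_tol."""
    total = sum(library.values())
    if total == 0:
        return 0.0
    matched = 0.0
    for mz, abundance in library.items():
        # Only library peaks are examined; unmatched query peaks are ignored.
        if any(abs(mz - q_mz) <= mz_tol for q_mz in query):
            matched += abundance
    return matched / total


def search(
    query: Spectrum, library_spectra: Dict[str, Spectrum], mz_tol: float = 0.005
) -> List[Tuple[str, float]]:
    """Score every library compound against the query, best score first."""
    scores = [
        (name, fraction_of_library_peaks_found(query, spec, mz_tol))
        for name, spec in library_spectra.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy, made-up data: a "mixture" containing all peaks of compound_A and
# compound_B, plus one stray peak that also happens to occur in compound_C.
library_spectra = {
    "compound_A": {100.0: 100.0, 150.0: 20.0},
    "compound_B": {200.0: 100.0, 120.0: 50.0},
    "compound_C": {300.0: 100.0, 250.0: 100.0},
}
query = {100.0: 80.0, 150.0: 10.0, 200.0: 100.0, 120.0: 30.0, 250.0: 5.0}

results = search(query, library_spectra)
for name, score in results:
    print(f"{name}: {score:.2f}")
```

In this toy case both compounds that are fully present in the mixture score highest, while the compound with only one of its two peaks present scores lower. A real tool would also need to weigh abundances, handle mass calibration error, and combine evidence across the low, medium, and high IS-CID spectra.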
Another unique aspect of the DIT is that we developed it with continuous feedback from several end-users at US Federal, State, and Local forensic laboratories, and so the specific features it does have – like report generation – are really useful to forensic practitioners. The first version of the DIT was released to the public in October 2021, and we have since updated the software, with version 2 released in February 2022. We plan to release version 3 in the upcoming months.
The DIT is available for download as a packaged R Shiny Application from the NIST Public Data Repository (https://data.nist.gov/od/id/mds2-2448); the source code is available at the same link, and we hope other researchers, practitioners, and vendors will look to customize or extend the DIT to meet their needs.
You appear to have simplified something rather complex – presumably implementation was not straightforward…
Funnily enough, the biggest challenge we faced was probably “development-adoption stalemate,” which is when a potentially useful technology is stunted in its development because of an apparent “lack of market” – yet, at the same time, the shortage of users could just as accurately be attributed to a lack of development!
The NIST Mass Spectrometry Data Center regularly builds and evaluates a variety of mass spectral libraries, work that requires a large investment in dedicated instrumentation and staff. In 2013, after a short run, NIST stopped updating the DART-MS Forensic Database because, at the time, there just weren’t enough people using the library to justify the resources required to maintain it. But in late 2019, we suggested that people weren’t using the library because it lacked spectra of some of the newer (and very important) drugs. Additionally, the existing mass spectral data interpretation tools were designed for a very different type of analysis – like compound identification with EI mass spectra – which made them cumbersome to use with DART-MS mass spectra.
And that’s when we decided to combine our technical aptitudes, and simultaneously update the DART-MS Forensic Database with relevant compounds while also creating new algorithms and software specifically for working with IS-CID mass spectra of mixtures. By mid-2020, we had an updated library and prototype search software. We reached out to several forensic labs to see if there was any promise with what we were doing – and the feedback was incredibly positive! We then increased our efforts, acquiring new funding and expanding our team to include personnel dedicated to the development and maintenance of the library and software.
Now, we have a regularly updated DART-MS forensic database, a user-friendly software tool, and a growing number of users. It took a lot of effort, but we are incredibly proud that we were able to break through the stalemate – for now.
Have you considered other applications of the tool?
We have recently begun using the DIT as part of the Rapid Drug Analysis and Research (RaDAR) program at NIST to help public health and public safety officials across the country monitor the illicit drug landscape in near real-time.
But we can see the DIT being useful in any application area where people are using ambient ionization mass spectrometry techniques for compound identification, including and beyond DART-MS. The only real requirement is that there be an appropriately formatted search library with pure compounds of interest to the application space. Though our focus has been illicit drug analysis, we recognize there are many other application areas where this capability would be useful (such as environmental spaces, food safety, or pharmaceuticals) so we are cleaning up our library building and evaluation pipeline to make it user-friendly, with the goal of releasing it to the public within the next year.
We have also been working on incorporating additional search options in the DIT for applications where a user may not be looking to identify mixture components per se, but trying to find complete matches to their mixture mass spectrum in a library of mixture mass spectra – essentially matching mass spectral fingerprints. One application that we’ve been particularly interested in is the identification of wood species in the timber trade.
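For readers unfamiliar with fingerprint-style matching, the sketch below compares a whole query spectrum against whole reference spectra using cosine similarity on binned intensities. Cosine similarity is a common general-purpose measure for comparing spectra and is used here as an assumption; the source does not say which measure the DIT uses for this search option. The function names, the 1.0 m/z bin width, and the toy spectra are hypothetical.

```python
import math
from typing import Dict

# A spectrum is modelled as a mapping of m/z -> relative abundance (assumption).
Spectrum = Dict[float, float]


def bin_spectrum(spectrum: Spectrum, bin_width: float = 1.0) -> Dict[int, float]:
    """Sum abundances into fixed-width m/z bins so spectra can be compared."""
    binned: Dict[int, float] = {}
    for mz, abundance in spectrum.items():
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + abundance
    return binned


def cosine_similarity(a: Spectrum, b: Spectrum, bin_width: float = 1.0) -> float:
    """Cosine of the angle between two binned spectra (0 = no overlap, 1 = same shape)."""
    va, vb = bin_spectrum(a, bin_width), bin_spectrum(b, bin_width)
    dot = sum(value * vb.get(key, 0.0) for key, value in va.items())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy, made-up reference fingerprints and query spectrum.
reference_fingerprints = {
    "sample_X": {101.1: 50.0, 203.2: 100.0, 305.3: 25.0},
    "sample_Y": {150.0: 100.0, 220.1: 40.0},
}
query = {101.1: 45.0, 203.2: 100.0, 305.3: 30.0, 150.0: 5.0}

similarities = {
    name: cosine_similarity(query, spectrum)
    for name, spectrum in reference_fingerprints.items()
}
best_name = max(similarities, key=similarities.get)
print(best_name, round(similarities[best_name], 3))
```

Unlike a search for individual mixture components, every peak in both spectra contributes to this score, which is why the approach suits cases where the whole pattern, rather than any single compound, is the identifying feature.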
Mass spectra are rich measurements with layers of information. Even for experienced mass spectrometrists, having tools that can help peel back that information can aid their mass spectral interpretation. The DIT is one of several tools that are available to help with data interpretation. We tried to streamline the application such that users of ambient ionization mass spectrometry techniques like DART-MS could work with our library (and future libraries) with as few challenges as possible. We look forward to seeing it applied across new areas in the future – to that end, if anyone has questions or would like to collaborate on building new IS-CID mass spectral libraries and extending the DIT, they can contact us at [email protected].
Arun Moorthy is a Research Scientist at the Mass Spectrometry Data Center, Biomolecular Measurement Division, National Institute of Standards and Technology, Gaithersburg, MD, USA.
Edward Sisco is a Research Chemist at the Surface and Trace Chemical Analysis Group, Materials Measurement Science Division, National Institute of Standards and Technology, Gaithersburg, MD, USA.
- E Sisco et al., (2021), NIST/NIJ DART-MS Data Interpretation Tool, National Institute of Standards and Technology, doi.org/10.18434/mds2-2448