Where is the MSI Software Messiah?
Common data file formats are readily available for mass spectrometry imaging (MSI) data. Now, the community must agree on an extensible cross-platform software solution that everyone can use.
Guillaume Robichaud, David C. Muddiman, Kenneth P. Garrard, Jeremy Barry
Once upon a time in the history of proteomics and liquid chromatography-mass spectrometry (LC-MS), instrument manufacturers encoded data in vendor-specific formats and offered proprietary software tools for data analysis, with limited or zero data-exporting capabilities. For a long time, there was no common data file format, making it very difficult to share, compare, and analyze MS data obtained from different platforms. And so, to perform custom data analysis, scientists would often have to reinvent the wheel, spending precious time writing programs for common tasks, such as peak picking, deconvolution, peak area calculation, and data visualization.
Initial efforts from the Seattle Proteome Center (mzXML format) and later from the HUPO Proteomics Standards Initiative (mzML format) were key contributions to the field because they provided standard, vendor-neutral formats for MS data. Most vendors joined the parade and provided support or even tools to convert proprietary data into vendor-neutral formats. A second major contribution to the field was the ProteoWizard project, which provided the community with a robust, validated, modular set of free, open-source tools and libraries to perform analysis of MS and proteomics data. Scientists could finally focus their time and energy on developing novel algorithms and tools that significantly advance the field.
So, what is happening within the field of MSI? More than 15 years after its introduction, there is still no widely accepted tool for viewing and performing basic data processing (peak picking, feature recognition, data extraction, normalization). Lessons learned from the history of software development in proteomics and LC-MS suggest that a common, shareable data file format is one prerequisite for such a tool to exist. The imzML format has now been around for several years and has been accepted by the MSI community as the common data file format for MSI. Free, robust data file converters from vendor formats to imzML are also widely available.
But what about MSI software? After a quick census, we found 20 different MSI tools, eight of which are commercial products. The disconcerting part is that most of the free software was released in the last two years or is currently “in development”. It is safe to assume that most of the time spent coding these 12 free interfaces was not spent developing novel data-processing algorithms, but instead building the user interface and implementing the same basic but necessary MSI tools. On the other hand, brilliant research is focusing on the development of new algorithms for the analysis of MSI data, such as peak picking, automatic feature extraction, data normalization, spatial segmentation, clustering, and resampling, but until there is a common, open-source MSI platform, these algorithms cannot be easily implemented.
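To make "basic but necessary" concrete, here is a minimal Python sketch of two such tasks: extracting an ion image for one m/z value within a ppm tolerance window, and normalizing each pixel to its total ion current (TIC). It is an illustration only, not the implementation used by MSiReader or any other tool. The function name `ion_image`, the in-memory `pixels` layout, and the synthetic two-pixel dataset are all assumptions made for this example; in practice the spectra would be read from an imzML file.

```python
from bisect import bisect_left, bisect_right


def ion_image(pixels, shape, mz_target, tol_ppm=5.0, normalize=True):
    """Sum the intensity inside a ppm window around mz_target at every pixel.

    pixels: hypothetical layout, dict mapping (row, col) -> (mzs, intensities),
            where mzs is sorted in ascending order.
    shape:  (rows, cols) of the image.
    """
    half_width = mz_target * tol_ppm * 1e-6
    lo, hi = mz_target - half_width, mz_target + half_width
    rows, cols = shape
    image = [[0.0] * cols for _ in range(rows)]
    for (r, c), (mzs, intensities) in pixels.items():
        # Binary search for the slice of peaks that falls inside [lo, hi].
        i = bisect_left(mzs, lo)
        j = bisect_right(mzs, hi)
        signal = sum(intensities[i:j])
        if normalize:
            # TIC normalization: divide by the summed intensity of the pixel.
            tic = sum(intensities)
            signal = signal / tic if tic > 0 else 0.0
        image[r][c] = signal
    return image


# Synthetic 1 x 2 dataset (made-up values, for illustration only).
pixels = {
    (0, 0): ([100.0, 500.0000, 900.0], [10.0, 30.0, 60.0]),
    (0, 1): ([100.0, 500.0020, 900.0], [50.0, 50.0, 0.0]),
}
image = ion_image(pixels, (1, 2), mz_target=500.0, tol_ppm=5.0)
raw_narrow = ion_image(pixels, (1, 2), mz_target=500.0, tol_ppm=1.0, normalize=False)
```

With a 5 ppm window both pixels contribute a peak near m/z 500; tightening the window to 1 ppm excludes the peak at 500.0020 in the second pixel. Every one of the free tools counted above has had to write, test, and maintain some version of this routine.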
Earlier this year, our group introduced MSiReader, an open-source interface for viewing and analyzing MSI data. The fully customizable interface can load most common MS/MSI data file formats and process high mass resolving power data, and it comes with a plethora of great data visualization and analysis tools. Currently used by hundreds of scientists in over 50 (academic and industry) laboratories around the world, it already incorporates several important features, such as data export tools, peak extraction, batch processing, spectra viewing, normalization, image co-registration, and baseline correction. Do we think MSiReader is the ultimate solution? Well, from the positive response we’ve received from the community so far, we firmly believe that it is at least a very honest attempt. However, there is still a lot of work to do before MSiReader becomes the definitive tool. In fact, we can already think of a small list of improvements: implementation of a comprehensive statistical analysis toolset, optional data resampling, 3D MSI, and co-registration with more data formats.
So, let’s all work together and make it happen! We see all feedback and user input as essential to the future development and success of MSiReader.
Guillaume Robichaud completed his bachelor of engineering at École Polytechnique de Montréal in 2003. After working five years as a mechanical engineer for General Electric, installing large hydroelectric turbines and generators all around the world, Guillaume attended graduate school to pursue his research interests. He received his master’s degree in mechanical engineering and is now completing a PhD in chemistry at North Carolina State University. “I’ve spent most of my graduate career implementing novel devices and software tools for bioanalytical applications in mass spectrometry,” he says.
David Muddiman received his BS in chemistry from Gannon University (Erie, PA) in 1990 and his PhD in analytical chemistry from the University of Pittsburgh in 1995. David is currently a Distinguished Professor of Chemistry and Founder and Director of the W.M. Keck Fourier Transform Mass Spectrometry Laboratory at North Carolina State University in Raleigh, NC. David’s group has published over 200 peer-reviewed articles on a wide range of topics in the field of mass spectrometry: from studying the fundamentals of novel ionization methods to the application of mass spectrometry to better understand the biology underlying complex molecular processes.
Ken Garrard’s first professional job was in 1978 writing mainframe pharmacokinetic software for Burroughs Wellcome Co. Ken was an undergraduate computer science major with a keen interest in using computers to solve real problems. “After graduate school, I spent several years developing real-time control systems as an independent consultant. When I started working at NCSU 26 years ago my first task was to build a controller for an ultra-precision machine tool. But first we had to build a multiprocessor computer and write an operating system.” For the last 20 years, his work has focused on precision fabrication and metrology of optomechanical systems. “Writing software for imaging mass spectrometry is another metrology problem. I’ve enjoyed learning enough about mass spectrometry to work on tools that help analytic chemists process and visualize large data sets.”
Jeremy Barry attended the University of New Haven in 2004, where he received bachelor’s degrees in chemistry and forensic science. From there he moved south to the University of North Carolina at Greensboro, where he studied the fundamentals of chemical separations under Dr. Brent Dawson with a primary focus on capillary electrophoresis. After graduating with a master’s in analytical chemistry, he went down the road to North Carolina State University to pursue his doctoral degree under Dr. David Muddiman. “At NCSU I’ve been involved in pushing the boundaries of ambient ionization mass spectrometry through applications in pharmaceutical analysis and molecular imaging.”