MIT researchers have developed a physics-informed generative AI tool that can predict a material’s spectrum across different spectroscopy techniques – without requiring direct measurement. Dubbed SpectroGen, the model generates synthetic spectral data that closely matches experimentally acquired results, potentially streamlining materials verification for batteries, semiconductors, and pharmaceuticals.
Published in Matter, the study outlines how SpectroGen uses a variational autoencoder (VAE) framework augmented with distribution-based physical priors – namely, Gaussian and Lorentzian waveform models – to interpret and synthesize spectra. Where traditional methods require detailed chemical-bond knowledge, the model treats spectral signatures as mathematical waveforms.
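To make the "waveform" idea concrete, here is a minimal sketch of the two standard line shapes named above, Gaussian and Lorentzian, each parameterized by its full width at half maximum (FWHM). It is not SpectroGen's code: the function names, the peak list, and the axis range are illustrative assumptions, and the formulas are the textbook definitions of the two profiles.

```python
import math

def gaussian(x, center, fwhm, amplitude=1.0):
    """Textbook Gaussian peak, parameterized by full width at half maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amplitude * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def lorentzian(x, center, fwhm, amplitude=1.0):
    """Textbook Lorentzian peak; its tails decay more slowly than a Gaussian's."""
    gamma = fwhm / 2.0  # half width at half maximum
    return amplitude * gamma ** 2 / ((x - center) ** 2 + gamma ** 2)

def synth_spectrum(axis, peaks, shape):
    """Sum one line shape over a list of (center, fwhm, amplitude) peaks."""
    return [sum(shape(x, c, w, a) for c, w, a in peaks) for x in axis]

# Hypothetical peaks and axis, chosen only for illustration.
peaks = [(400.0, 20.0, 1.0), (650.0, 35.0, 0.6)]
axis = [200.0 + 2.0 * i for i in range(301)]  # 200 to 800 in steps of 2

gauss_spec = synth_spectrum(axis, peaks, gaussian)
lorentz_spec = synth_spectrum(axis, peaks, lorentzian)
```

Both shapes fall to half their height at half the FWHM from the center, but the Lorentzian keeps noticeably more intensity far from the peak, which is why the two are treated as distinct profiles.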
“It’s a physics-savvy generative AI that understands what spectra are,” said Loza Tadesse, assistant professor of mechanical engineering at MIT and co-author of the study, in a press release. “We interpreted spectra not as how it comes about from chemicals and bonds, but that it is actually math – curves and graphs, which an AI tool can understand and interpret.”
The tool was trained on a dataset of over 6,000 mineral samples, many with associated X-ray, infrared, and Raman spectra. By learning correlations across modalities, SpectroGen can accept input from one method – such as infrared – and generate a highly accurate prediction of the same material’s response in another, such as X-ray diffraction. According to the study, the AI-generated spectra match the measured data with 99 percent accuracy and are generated in under a minute.
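The article does not say how that agreement is scored. One common way to compare a generated spectrum with a measured one, shown here purely as an assumption rather than as the study's metric, is the Pearson correlation between the two intensity curves sampled on the same axis. The intensity values below are made up for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length intensity sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)

# Made-up intensities on a shared axis, not data from the study.
measured = [0.0, 0.2, 1.0, 0.3, 0.1]
generated = [0.0, 0.25, 0.95, 0.3, 0.1]

score = pearson(measured, generated)
```

A score near 1 means the two curves rise and fall together. It ignores overall scale, so a real comparison would likely pair it with an error measure on the intensities themselves.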
The team sees SpectroGen as a potential virtual spectrometer for industry. For instance, manufacturers could perform inexpensive infrared scans and use SpectroGen to infer Raman or X-ray spectra – bypassing the need for multiple costly instruments. “We think that you don’t have to do the physical measurements in all the modalities you need, but perhaps just in a single, simple, and cheap modality,” said Tadesse.
The researchers are now exploring use cases in disease diagnostics and agricultural monitoring, and are commercializing the platform via a startup effort. Future directions also include adapting SpectroGen to AI-driven robotic labs and integrated sensing platforms.
