Defying the Data Tsunami
Sitting Down With… Lutgarde Buydens, Professor, Analytical Chemistry: Chemometrics, Radboud University, Nijmegen, The Netherlands.
Your cover feature – “Towards Tsunami-Resistant Chemometrics” – stirred up quite a bit of interest…
Yes, it did. I’ve had a lot of responses, especially at conferences. Now, “tsunami” is used regularly to describe large amounts of data... That’s really quite remarkable.
Everyone is beginning to realize that large amounts of data present problems – the big data issue has exacerbated the situation. Governments and large science organizations see it as one of the major issues that need addressing through “data science”, a new umbrella term that includes chemometrics. In fact, I now tag on “chemical data science” whenever I use the term chemometrics, as it helps those unacquainted with the field. But if we stop using the word chemometrics, we lose part of our identity.
Does increasing commercial interest help boost awareness?
Well, consider any infrared spectroscopy instrument; you wouldn’t expect to see one without PLS software onboard – but that’s only good when people realize it! Other analytical instrument manufacturers offer free downloadable software packages to preprocess and extract data in the correct format. Hopefully, more usage and discussion will raise awareness of our field’s importance.
The data tsunami is a tricky problem…
It’s certainly not easy! But we are beginning to see efforts to develop robust data fusion methods able to handle numerous, very large datasets. The objective is to extract the variations in the data relevant to the problem. But it’s tricky – and chemometricians are publishing more and more papers that try to provide solutions. Data fusion is a really hot area right now.
And these methods provide ways to clean up the data?
Not exactly. Cleaning suggests the majority of your data is relevant, so you’d just remove part of the data. However, when we deal with big data, it’s the opposite – we are faced with finding the needle in the haystack. Indeed, 99.999 percent of the variation is probably not relevant to your problem. Cleaning up the data is something that was very typical in classical chemometrics with normal size datasets where you have irrelevant “noise” that you can ignore. But, with big data, we have to throw away the haystack, which is something quite different to cleaning data. The key may be ‘less is more’.
Where else can chemometrics have an impact?
Chemometrics could be the kickstarter for a whole new range of instrumentation. The widely lauded added value that chemometrics has brought to the spectrometer may be extended to many other fingerprinting methods, such as metal oxide sensing, fluorescence and low-field NMR. It may also be essential to extend the functionality of established methodology, for example, flow cytometry. I believe there is a bright future for chemometrics in instrument development – and Jeroen Jansen, an assistant professor in my department, is very much involved.
The group currently consists of postdocs and PhD, master’s and bachelor’s students. We also host two visiting scientists from industry (MSD and Douwe Egberts) who provide the essential support to keep chemometrics attached to the industrial workflow, which has been the inspiration for so many cornerstone developments in the field. I am constantly amazed by the enthusiasm, creativity and motivation of our bachelor’s and master’s students, who bring in a lot of complementary expertise from areas like biology and informatics. They are a lot of fun to work with, and it is a joy to see them grow in the field as emerging specialists.
What about your own beginnings and mentors?
I actually began in a very different field: pharmacy (in Brussels), but later I embarked on an internship and a PhD with Désiré Luc Massart – one of the founding fathers of chemometrics. The actual logistics of studying in Brussels (I had to change from one campus to another twice daily) meant I couldn’t do proper experiments; nor was I particularly gifted at experimental work! Instead, I became fascinated with chemometric techniques such as QSAR (quantitative structure–activity relationships), which is related to pharmacy. I loved being able to look at the data and predict outcomes.
So given my newfound interest, I was lucky to later find myself under the careful watch of Massart and also Leo Kaufman, a renowned statistician. I learnt a lot from them both – new methods, such as pattern recognition, PLS (partial least squares), and all kinds of multivariate regressions. In time, I was able to extend my work to all kinds of other data and it has continued from there.
You have strong views on collaboration...
That’s right; I have to collaborate to be able to do research! One example is ESPRIT (expert systems for chemical analysis) – a project that brought together people from the University of Brussels and University of Nijmegen. In fact, that’s how I ended up in The Netherlands. There’s also COAST, which is a private/public consortium that brings together major chemical companies and academia to develop research programs in the field of analytical chemistry. I really like the fact it makes a strong link between chemometrics and analytical chemistry. Industrial partners provide us with datasets that exemplify specific problems and we use them for fundamental research, which ultimately provides a solution for their practical problem. Who says you can’t have fundamental and applied science working together? The one cannot exist without the other!
“During my PhD on drug activity, I realized that much more information is hidden in experiments than is obvious at first sight,” says Lutgarde Buydens. As a student of Désiré Luc Massart, one of the founders of chemometrics, data analysis and interpretation quickly became her main focus. After a postdoc in the US, Buydens joined Radboud University in the Netherlands, where she is now chair in Analytical Chemistry – Chemometrics. Her interest in better data analysis and interpretation is as strong as ever. “To help discover new knowledge in so many different fields, from food science, medical sciences to industrial processes still excites me every day,” she says.