
How to Score with Penalties

Calibration for an analyte using spectroscopic techniques requires a model: the mathematical relationship between, for example, the analyte concentration and the instrumental signal. Once a model is obtained, it can be used to predict future samples. A spectral calibration model can be obtained by univariate regression, if an appropriate sensor (for example, a wavelength) can be identified, or by multivariate regression. The univariate model is based on the common least squares (LS) criterion, minimizing the sum of the squared residuals (the degree of fit). This is the trendline command in Excel that many are familiar with. The same measure is used in multivariate regression methods, such as partial least squares (PLS); however, with PLS, projections of the measured data are used instead of the measured data themselves. An upshot of the PLS projections is that the regression vector shrinks in size, lowering the variance relative to the ordinary LS (OLS) solution. The magnitude of the regression vector depends on the number of latent variables (LVs) used in the projection, which can be considered the PLS discrete tuning parameter.
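To make the OLS/PLS contrast concrete, here is an illustrative Python sketch (simulated data; a minimal NIPALS implementation of single-response PLS). It shows that the PLS regression vector grows toward the OLS solution as more LVs are included, so using fewer LVs yields a smaller, lower-variance regression vector:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 8
X = rng.normal(size=(n, p))                      # simulated spectra
y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)  # simulated reference values

def pls1(X, y, n_lv):
    """PLS1 regression vector via NIPALS on mean-centered data."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xc.T @ yc
        w = w / np.linalg.norm(w)                # weight vector
        t = Xc @ w                               # score vector (projection)
        p_k = Xc.T @ t / (t @ t)                 # X loading
        q_k = yc @ t / (t @ t)                   # y loading
        Xc = Xc - np.outer(t, p_k)               # deflate X
        yc = yc - q_k * t                        # deflate y
        W.append(w); P.append(p_k); q.append(q_k)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)       # regression vector (centered data)

b_ols = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)[0]
for a in (1, 2, 4):
    print(a, np.linalg.norm(pls1(X, y, a)), "vs OLS:", np.linalg.norm(b_ols))
```

The printed norms increase with the number of LVs and are bounded by the OLS norm, illustrating the shrinkage that the tuning parameter controls.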


The multivariate penalty regression method known as ridge regression (RR) takes a different approach. In addition to minimizing the LS criterion, it also specifically minimizes a penalty on the size of the regression vector. The tuning parameter for RR is continuous and weights this penalty. A greater weight results in a smaller regression vector. Because of the large number of possible models to select from, RR has not seen the popularity that PLS has. However, there are now powerful fusion methods available that can select an acceptable RR model as well as a PLS model for a given set of calibration samples.
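The RR idea can be sketched in a few lines of illustrative Python (simulated data). The closed-form solution minimizes the LS criterion plus a weighted penalty on the squared size of the regression vector; increasing the weight shrinks the vector:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 25, 10
X = rng.normal(size=(n, p))                      # simulated spectra
y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)
Xc, yc = X - X.mean(axis=0), y - y.mean()        # mean-center for calibration

def ridge(X, y, lam):
    """Closed-form RR solution: minimizes ||y - Xb||^2 + lam * ||b||^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# A greater penalty weight gives a smaller regression vector.
for lam in (0.01, 1.0, 100.0):
    print(lam, np.linalg.norm(ridge(Xc, yc, lam)))
```

Because the tuning parameter `lam` is continuous, any positive value defines a distinct model, which is the source of the large model-selection burden mentioned above.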

Moving beyond RR, many other penalties of various types have recently been studied – and more are being actively proposed and investigated with different purposes in mind. With a variety of penalties, the regression model can be targeted to desirable solutions as well as away from undesirable solutions. Each penalty is accompanied by its respective tuning parameter that needs to be optimized.
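As one common way to optimize a tuning parameter, the following illustrative Python sketch (simulated data, with RR as the example penalty) selects the penalty weight by k-fold cross-validation over a grid:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 40, 12
X = rng.normal(size=(n, p))                      # simulated spectra
y = X @ rng.normal(size=p) + 0.5 * rng.normal(size=n)

def ridge(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def cv_rmse(X, y, lam, k=5):
    """k-fold cross-validated RMSE for one tuning-parameter value."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    sq_errs = []
    for idx in folds:
        mask = np.ones(n, dtype=bool); mask[idx] = False
        Xt, yt = X[mask], y[mask]
        mx, my = Xt.mean(axis=0), yt.mean()      # center on training fold only
        b = ridge(Xt - mx, yt - my, lam)
        pred = (X[idx] - mx) @ b + my
        sq_errs.append((y[idx] - pred) ** 2)
    return np.sqrt(np.mean(np.concatenate(sq_errs)))

grid = np.logspace(-3, 3, 13)                    # candidate tuning-parameter values
scores = np.array([cv_rmse(X, y, lam) for lam in grid])
best = grid[scores.argmin()]
print("selected tuning parameter:", best)
```

Cross-validation is only one criterion; the fusion methods mentioned above combine several such model-quality measures.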

Penalties can also be tailored for specific purposes. An area seeing a large increase in novel uses of penalties is calibration maintenance (model updating or, more generally, domain adaptation), as seen in recent papers published by Google. In analytical chemistry, model updating generally means that a model has first been formed from a set of calibration samples based on an inherent set of primary measurement conditions (physical and chemical matrix effects, instrument, environment, and so on). What if we need this primary model to predict samples measured under new (secondary) conditions? For example, a model may have been formed to predict soil organic content for a specific geographic region or soil type using a handheld near-infrared spectrometer, but it may be necessary for the model to predict samples from a different geographic region: the secondary conditions, or new domain. Such model updating problems can be solved using penalty regression methods.
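One simple penalty of this kind can be sketched as follows (illustrative Python with simulated data; the setup and names such as `b_primary` are my own assumptions, not a specific published method). The LS fit to a few labeled secondary-condition samples is augmented with a Tikhonov-style penalty on the distance from the primary regression vector, so the updated model stays anchored to the primary one:

```python
import numpy as np

rng = np.random.default_rng(3)
p = 10
# Primary regression vector, assumed already built from primary-condition data.
b_primary = rng.normal(size=p)

# A few labeled samples measured under the secondary conditions (new domain).
X2 = rng.normal(size=(8, p))
y2 = X2 @ (b_primary + 0.3 * rng.normal(size=p)) + 0.05 * rng.normal(size=8)

def update(X2, y2, b_primary, lam):
    """Minimizes ||y2 - X2 b||^2 + lam * ||b - b_primary||^2 in closed form,
    pulling the updated regression vector toward the primary model."""
    return np.linalg.solve(X2.T @ X2 + lam * np.eye(X2.shape[1]),
                           X2.T @ y2 + lam * b_primary)

# A large weight keeps the primary model; a small weight fits only the new samples.
for lam in (0.01, 1.0, 1e6):
    b = update(X2, y2, b_primary, lam)
    print(lam, np.linalg.norm(b - b_primary))
```

Note that only eight secondary samples are used here; the penalty supplies the information that those few samples cannot.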

As noted, various penalties have been or are being studied, including penalties that induce sparseness in the regression vector (in effect, wavelength selection). New approaches are also being proposed to perform model updating with only unlabeled data from the secondary conditions (spectra with no reference values).
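A sparsity-inducing (L1, lasso-type) penalty can be sketched as follows (illustrative Python with simulated data, using a basic iterative soft-thresholding solver). The soft-threshold step sets coefficients exactly to zero, which amounts to wavelength selection:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 40, 15
X = rng.normal(size=(n, p))                      # simulated spectra
beta_true = np.zeros(p)
beta_true[:3] = (2.0, -1.5, 1.0)                 # only 3 informative "wavelengths"
y = X @ beta_true + 0.1 * rng.normal(size=n)
Xc, yc = X - X.mean(axis=0), y - y.mean()

def lasso_ista(X, y, lam, n_iter=2000):
    """L1-penalized LS by ISTA: minimizes 0.5 * ||y - Xb||^2 + lam * ||b||_1.
    Soft-thresholding zeroes coefficients exactly, selecting wavelengths."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2       # 1 / Lipschitz constant
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = b - step * X.T @ (X @ b - y)         # gradient step on the LS term
        b = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # soft-threshold
    return b

b = lasso_ista(Xc, yc, lam=5.0)
print("zero coefficients:", int(np.sum(b == 0)), "of", p)
```

The uninformative coefficients come out exactly zero, unlike RR, whose quadratic penalty only shrinks them.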


An upcoming issue of the Journal of Chemometrics, due around May next year, is devoted to penalty methods. It includes papers on unique penalties for novel model updating approaches in a variety of spectral situations, greater image resolution for fluorescence microscopy, a tuning-parameter-independent support vector machine, experimental design and improved sparsity approaches, and enhanced 3D spectral data analysis.

In my view, penalty methods are the future and will make spectroscopic model updating a reality.


About the Author
John H. Kalivas

John H. Kalivas completed his doctorate in chemistry with analytical chemistry and chemometric focuses at the University of Washington in 1982 under the direction of Professor Bruce R. Kowalski. He joined the Department of Chemistry at Idaho State University in 1985 as an Assistant Professor and was promoted to Professor in 1994. He is author or coauthor of over 100 professional publications including papers, book chapters, and books. He serves as an Editor for the Journal of Chemometrics and an Associate Editor for Applied Spectroscopy. He also serves on the Editorial Boards of Analytical Letters and Talanta.
