GLM II: Gamma Regression

The overlooked potential of Generalized Linear Models in astronomy II: Gamma regression and photometric redshifts

Machine learning techniques offer a valuable toolbox for solving astronomical problems that involve so-called big data. They provide a means of making accurate predictions about a system without prior knowledge of the physical processes underlying the data.

In this article, and in the companion papers of this series, we present the set of Generalized Linear Models (GLMs) as a fast alternative method for tackling general astronomical problems, including those related to the machine learning paradigm. To demonstrate the applicability of GLMs to inherently positive and continuous physical observables, we explore their use in estimating the photometric redshifts of galaxies from their multi-wavelength photometry. Using the gamma family with a log link function, we predict redshifts from the PHoto-z Accuracy Testing simulated catalogue and from a subset of Data Release 10 of the Sloan Digital Sky Survey.
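
As a rough illustration of what a gamma GLM with a log link involves, the following minimal sketch fits one by iteratively reweighted least squares (IRLS). It is not the software described in this paper: the function names, the synthetic data, and the coefficient values are all assumptions made for the example, and the columns of X merely stand in for photometric features. For the gamma family with a log link the IRLS working weights are constant, so each iteration reduces to an ordinary least-squares fit.

```python
import numpy as np


def fit_gamma_log_glm(X, y, n_iter=50, tol=1e-10):
    """Fit a gamma GLM with log link by IRLS (illustrative sketch only)."""
    X = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from a constant model
    for _ in range(n_iter):
        eta = X @ beta           # linear predictor
        mu = np.exp(eta)         # inverse of the log link
        z = eta + (y - mu) / mu  # working response
        # Gamma variance mu^2 and d(mu)/d(eta) = mu give unit working weights,
        # so the weighted step is a plain least-squares solve.
        beta_new, *_ = np.linalg.lstsq(X, z, rcond=None)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


def predict(beta, X):
    """Predicted mean of the positive response for new feature rows."""
    X = np.column_stack([np.ones(X.shape[0]), X])
    return np.exp(X @ beta)


# Synthetic stand-in data (hypothetical, not PHAT or SDSS).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
true_beta = np.array([-0.5, 0.3, -0.2, 0.1])
mu = np.exp(true_beta[0] + X @ true_beta[1:])
shape = 20.0  # gamma shape parameter; the mean is shape * scale
y = rng.gamma(shape, mu / shape)

beta_hat = fit_gamma_log_glm(X, y)
z_pred = predict(beta_hat, X)
print(beta_hat)
```

Because the log link is inverted with an exponential, every predicted value is strictly positive, which is the property that makes this family a natural fit for redshifts.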

We obtain fits that result in catastrophic outlier rates as low as ~1% for simulated and ~2% for real data. Moreover, we can easily obtain such levels of precision within a matter of seconds on a normal desktop computer and with training sets that contain merely thousands of galaxies.
Our software is made publicly available as a user-friendly package developed in Python and R, and via an interactive web application.
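
The catastrophic outlier rate quoted above can be computed in a few lines. This abstract does not state the criterion, so the sketch below assumes a convention common in the photometric redshift literature: a galaxy counts as a catastrophic outlier when |z_phot - z_spec| / (1 + z_spec) exceeds 0.15. The function name, the threshold default, and the toy redshift values are assumptions made for the example.

```python
import numpy as np


def catastrophic_outlier_rate(z_phot, z_spec, threshold=0.15):
    """Fraction of objects whose scaled redshift error exceeds the threshold.

    The 0.15 default is an assumed convention, not taken from this abstract.
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    scaled_error = np.abs(z_phot - z_spec) / (1.0 + z_spec)
    return float(np.mean(scaled_error > threshold))


# Toy values for illustration only: the third galaxy is badly mis-estimated.
z_spec = np.array([0.10, 0.50, 1.00, 0.30])
z_phot = np.array([0.12, 0.48, 1.50, 0.31])
rate = catastrophic_outlier_rate(z_phot, z_spec)
print(rate)
```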

This software allows users to apply a set of GLMs to their own photometric catalogues and generates publication-quality plots with minimal effort from the user. By making GLMs easier for the astronomical community to use, this paper series aims to make them widely known and to encourage their implementation in future large-scale projects, such as the Large Synoptic Survey Telescope.

Dawn of Stars

Dawn of Stars tells the story of how stars are formed. Most stars are born in groups which are truly