Introduction to robust transformations in linear regression


The analysis of many regression datasets can be improved by working with a transformation of the response rather than the original response. More specifically, the transformation may improve the approximate normality of the errors or the homogeneity of their variance. In many examples there are physical reasons why a transformation might be expected to be helpful. For instance, a response that is a non-negative variable cannot be subject to additive errors of constant variance.

In this part of the toolbox we consider the parametric family of power transformations introduced by Box and Cox (1964). A full discussion is given by Atkinson and Riani (2000). Because the estimated transformation parameter and the related test statistic may be sensitive to the presence of one or several outliers, we use the forward search to see how the estimates and statistics evolve as we move through the ordered data. As the user will see, influential observations may be evident only for some transformations of the data. Since observations that appear outlying in untransformed data may not be outlying once the data have been transformed, and vice versa, we run the forward search on data subject to various transformations as well as on untransformed data.
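
To make the Box and Cox family concrete, the following is a minimal sketch in Python with NumPy. It is not the toolbox's own code. It assumes the normalized form of the transformation, z(lambda) = (y^lambda - 1) / (lambda * g^(lambda - 1)) for lambda different from 0 and z(0) = g * log(y), where g is the geometric mean of the responses. The function names boxcox_normalized, profile_rss and estimate_lambda are hypothetical, introduced only for illustration, and the five-point grid of lambda values is likewise an assumption. The sketch estimates lambda on all the data at once and does not implement the forward search, so it has none of the robustness to outliers discussed above.

```python
import numpy as np


def boxcox_normalized(y, lam):
    """Normalized Box-Cox transformation of a strictly positive response.

    Hypothetical helper for illustration only.
    """
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("the Box-Cox transformation needs y > 0")
    gm = np.exp(np.mean(np.log(y)))  # geometric mean of the responses
    if abs(lam) < 1e-12:
        # Limiting case lambda = 0: scaled logarithm
        return gm * np.log(y)
    return (y**lam - 1.0) / (lam * gm ** (lam - 1.0))


def profile_rss(X, y, lam):
    """Residual sum of squares from regressing z(lambda) on X.

    With the normalized transformation, residual sums of squares are
    comparable across lambda, so a smaller value means a better fit.
    """
    z = boxcox_normalized(y, lam)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    return float(resid @ resid)


def estimate_lambda(X, y, grid=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Return the grid value of lambda with the smallest residual sum of squares."""
    return min(grid, key=lambda lam: profile_rss(X, y, lam))


# Simulated data (an assumption for the example): log(y) is linear in x,
# so the logarithmic transformation (lambda = 0) should be selected.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
y = np.exp(1.0 + 0.5 * x + rng.normal(scale=0.02, size=x.size))

lam_hat = estimate_lambda(X, y)
print("selected lambda:", lam_hat)
```

Dividing by the power of the geometric mean is what allows fits for different values of lambda to be compared on a common scale. A single gross outlier added to y can change the selected value, which is the sensitivity that motivates monitoring the estimate along the forward search.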