The “fan plot” displays the values of the score test statistic during the forward search for different values of the Box-Cox transformation parameter λ. If the size of the data set is small (i.e. smaller than 100) generally it is enough to consider the five most common values of λ : −1,−0.5, 0, 0.5 and 1, otherwise a finer grid of values of λ may be needed.

In the fan plot we perform a separate search for each value of λ which is tested. The data are transformed and a starting point is found for each forward search, which then proceeds independently for each values of λ using the transformed data.

The fan plot allows assessment of the proportion of the data supporting a particular transformation, information not available from other methods of analysis.

Suppose you want to produce the fan plot for the wool data.

% Load wool data: 27 observations and 3 explanatory variables XX=load('wool.txt'); y=XX(:,end); X=XX(:,1:end-1); % Function FSRfan stores the score test statistic. %In this case the five most common values of lambda are considered [out]=FSRfan(y,X); % Produce a fan plot and display it on the screen fanplot(out);

The fan plot appears as follows

Initially
there is no evidence against any transformation. When the subset size *m*
is equal to
15 (56% of the data), λ = 1 starts to be rejected. The next rejections are λ =
0.5 at 67% and −1 at 74%. The value of λ = 0 is supported not only
by all the data, but also by our sequence of subsets. The observations
added during the search depend on the transformation. In general, if the
data require transformation and are not transformed, or are insufficiently
transformed, large observations will appear as outliers. Conversely, if the
data are overtransformed, small observations will appear as outliers. This
is exactly what happens here. For λ = 1 and λ = 0.5, working back from
m = 27, the last cases to enter the subset are 19, 20 and 21, which are
the three largest observations. Conversely, for λ = −1 and λ = −0.5 case
9 is the last to enter, preceded by 8 and 7, which are the three smallest
observations. Since the data are in standard order for a 33 factorial, the
patterns of these numbers indicate a systematic failure of the model. For
the log transformation, which produces normal errors, there is no particular
pattern to the order in which the observations enter the forward search.

Notice that using the option `datatooltip` it is possible to see which
units enter the search for each particular step. For example, with the code

fanplot(out,'datatooltip',1);

once the user clicks on step *m*=25 in the search for λ = 1 the tooltip
shows the following information.

Usng the option databrush it is immediately possible to see the position of the brushed units in the scatter plot matrix and theire resiudual in the monitoring residuals plots (see code and Figure below).

fanplot(out,'databrush',1);

In conclusion: every time a brushing action is performed on the fan plot, it is possible to display in an automatic way also the information about the position of the brushed units in the scatter diagram of y against the required explanatory variable(s) and to immediately visualize the residuals in the monitoring residuals plot.

Note In order to label the
residuals which are brushed in the fan plot, the following
code can be used:
fanplot(out,'databrush',{'Label' 'on' 'RemoveLabels' 'off'}); |