resindexplot plots the residuals from a regression analysis versus index number or any other variable
resindexplot(residuals, Name, Value)
Compare OLS residuals with robust residuals for the stack loss data.
    load('stack_loss.txt');
    y=stack_loss(:,4);
    X=stack_loss(:,1:3);
    % Define confidence level
    conflev=[0.95,0.99];
    figure;
    h1=subplot(2,1,1);
    % Compute studentized residuals (deletion residuals)
    stats=regstats(y,X,'linear',{'standres','studres'});
    resindexplot(stats.studres,'h',h1,'conflev',conflev,'labx','Index number','laby','Deletion residuals');
    % Compute robust residuals
    [out]=LXS(y,X,'nsamp',0,'rew',1,'lms',0);
    h2=subplot(2,1,2);
    resindexplot(out.residuals,'h',h2,'conflev',conflev,'labx','Index number','laby','Robust LTS reweighted residuals');
    load('stack_loss.txt');
    y=stack_loss(:,4);
    X=stack_loss(:,1:3);
    [out]=LXS(y,X,'nsamp',0,'rew',1,'lms',0);
    bonfconf = 1-0.01/size(y,1);    % 99% Bonferronised
    resindexplot(out.residuals,'conflev',[0.95,0.99,bonfconf],'labx','Index number','laby','Robust LTS reweighted residuals');
    databrush=struct;
    databrush.selectionmode='Brush'; % Brush selection
    databrush.persist='on'; % Enable repeated mouse selections
    databrush.Label='on'; % Write labels of the units while selecting
    databrush.RemoveLabels='on'; % Remove labels after selection
    databrush.RemoveTool = 'on'; % Remove yellow tool after selection
    databrush.RemoveFlagged = 'on'; % Remove filled red color for selected points after selection
    load('stack_loss.txt');
    y=stack_loss(:,4);
    X=stack_loss(:,1:3);
    [out]=LXS(y,X,'rew',1,'lms',0,'yxsave',1);
    resindexplot(out,'databrush',databrush)
    [outFS]=FSReda(y,X,out.bs);
    resfwdplot(outFS,'databrush',databrush)
Write the row number for the units which have the 3 largest residuals (in absolute value)
    load('stack_loss.txt');
    y=stack_loss(:,4);
    X=stack_loss(:,1:3);
    [out]=LXS(y,X,'nsamp',1000);
    resindexplot(out.residuals,'numlab',{3});
In this case we control the FontSize of the associated labels.
    numlab=struct;
    % Set a font size for the labels equal to 20
    numlab.FontSize=20;
    resindexplot(randn(100,1),'numlab',numlab)
In this case we control both the number of units to label and also the FontSize of the associated labels.
    numlab=struct;
    % Show just the two most important residuals.
    numlab.numlab={2};
    % Set a font size for the labels equal to 20
    numlab.FontSize=20;
    resindexplot(randn(100,1),'numlab',numlab)
residuals
— residuals to plot. Numeric vector or structure. If residuals is a vector, it contains the n residuals.
If residuals is a structure, it contains the following fields:
Value | Description |
---|---|
residuals | vector of residuals (compulsory field) |
y | response (compulsory field if interactive brushing is used) |
X | n-by-p matrix containing explanatory variables (compulsory field if interactive brushing is used) |
Data Types: single|double
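As an illustration of the structure form, the sketch below builds the three fields listed in the table from simulated data and an ordinary least squares fit done in base MATLAB. The data, the variable names (`Z`, `beta`, `e`, `s`, `in`) and the scaling by the residual standard error are assumptions made for this example; whether resindexplot needs further fields in a hand-built structure is not stated here, so treat it as a sketch rather than a guaranteed recipe.

```matlab
% Simulate a small regression dataset (illustrative data only)
rng(1);
n = 50;
X = randn(n,2);
y = 1 + X*[2; -1] + randn(n,1);

% Ordinary least squares fit with intercept, using base MATLAB
Z = [ones(n,1) X];
beta = Z\y;
e = y - Z*beta;
s = sqrt(sum(e.^2)/(n - size(Z,2)));   % residual standard error

% Structure with the fields listed in the table
in = struct;
in.residuals = e/s;   % scaled residuals (compulsory field)
in.y = y;             % needed only if interactive brushing is used
in.X = X;             % needed only if interactive brushing is used

% Call the plot only if FSDA is on the MATLAB path
if exist('resindexplot','file') == 2
    resindexplot(in);
end
```

When the residuals come from LXS, the first databrush example on this page obtains the same fields by passing 'yxsave',1.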
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'conflev',[0.95,0.99,0.999], 'databrush',1, 'FontSize',10, 'h',h1 (where h1=subplot(2,1,1)), 'labx','row index', 'laby','scaled residuals', 'lwdenv',2, 'MarkerSize',10, 'MarkerFaceColor','b', 'nameX',{'Age','Income','Married','Profession'}, 'namey','response', 'numlab',[3,10,35], 'SizeAxesNum',10, 'tag','indexPlot', 'title','scaled residuals', 'x',1:100, 'xlimx',[-5 5], 'ylimy',[-5 5]
conflev
— confidence levels for the horizontal bands. Numeric vector. It can be a vector of different confidence levels.
Remark: the confidence interval is based on the chi^2 distribution.
Example: 'conflev',[0.95,0.99,0.999]
Data Types: double
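To give a feel for what different levels imply, the sketch below computes the half-width of a band under the assumption that a band at level conflev sits at plus or minus the square root of the chi^2 quantile with one degree of freedom. The choice of one degree of freedom is an assumption of this example, not something stated in the remark. Under it the half-width equals sqrt(2)*erfinv(conflev), which needs only base MATLAB.

```matlab
% Confidence levels, including a 99% Bonferroni-adjusted level
n = 21;                            % number of units (e.g. the stack loss data)
conflev = [0.95, 0.99, 1-0.01/n];

% Assumption: half-width = sqrt of the chi^2 quantile with 1 degree of freedom.
% With 1 d.o.f. this is the two-sided normal quantile sqrt(2)*erfinv(conflev).
halfwidth = sqrt(2)*erfinv(conflev);

% Print each level with the corresponding pair of horizontal bands
for i = 1:numel(conflev)
    fprintf('level %.5f: bands at -%.3f and +%.3f\n', ...
        conflev(i), halfwidth(i), halfwidth(i));
end
```

The Bonferroni-adjusted level gives a wider band than the plain 99% level, because it is a level closer to 1.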
databrush
— interactive mouse brushing. Empty value, scalar | structure. If databrush is an empty value (default), no brushing is done. The activation of this option (databrush is a scalar or a structure) enables the user to select a set of units in the current plot and to see them highlighted in the y|X plot, i.e. a matrix of scatter plots of y against each column of X, grouped according to the selection(s) done by brushing. If the plot y|X does not exist, it is automatically created. Please note that the window style of the other figures is set equal to that of the figure which contains the residual plot. In other words, if the residual plot is docked, all the other figures will be docked too.
DATABRUSH IS A SCALAR. If databrush is a scalar, the default selection tool is a rectangular brush and it is possible to brush only once (that is persist='').
DATABRUSH IS A STRUCTURE. If databrush is a structure, it is possible to use all optional arguments of function selectdataFS and the following fields:
Value | Description |
---|---|
persist | Repeated brushing enabled. persist is an empty value or a character string equal to 'on' or 'off'. The default value of persist is '', that is, brushing is allowed only once. If persist is 'on' or 'off', brushing can be done as many times as the user requires. If persist='on', the unit(s) currently brushed are added to those previously brushed, and every time a new brushing is done it is possible to use a different color for the brushed units. If persist='off', every time a new brushing is performed the units previously brushed are removed. |
labeladd | Add labels of brushed units. Character: '' (default) or '1'. If databrush.labeladd='1', we label the units of the last selected group with the unit row index in matrices X and y. The default value is labeladd='', i.e. no label is added. |
bivarfit | This option adds one or more least squares lines based on SIMPLE REGRESSION to the plots of y|X, depending on the selected groups. bivarfit = '' is the default: no line is fitted. bivarfit = '1' fits a single OLS line to all points of each bivariate plot in the scatter matrix y|X. bivarfit = '2' fits two OLS lines: one to all points and another to the last selected group. This is useful when there are only two groups, one of which refers to a set of potential outliers. bivarfit = '0' fits one OLS line for each selected group. This is useful for the purpose of fitting mixtures of regression lines. bivarfit = 'i1' or 'i2' or 'i3' etc. fits an OLS line to a specific group, the one with index 'i' equal to 1, 2, 3 etc. |
multivarfit | This option adds one or more least squares lines, based on MULTIVARIATE REGRESSION of y on X, to the plots of y|Xi. multivarfit = '' is the default: no line is fitted. multivarfit = '1' fits a single OLS line to all points of each bivariate plot in the scatter matrix y|X. The line added to the scatter plot y|Xi is avconst + Ci*Xi, where Ci is the coefficient of Xi in the multivariate regression and avconst is the effect of all the other explanatory variables different from Xi evaluated at their centroid (that is overline{y}'C). multivarfit = '2' is the same as multivarfit = '1', but this time we add the line based on the group of unselected observations. |
Example: 'databrush',1
Data Types: single | double | struct
FontSize
— scalar which controls the font size of the labels of the axes. Default value is 12.
Example: 'FontSize',10
Data Types: double
h
— the axis handle of the figure to which the resindexplot is sent. This can be used to host the resindexplot in a subplot of a complex figure formed by different panels (for example, a panel with residuals from a classical OLS estimator and another with residuals from a robust regression, as in the stack loss example on this page).
Example: 'h',h1 where h1=subplot(2,1,1)
Data Types: Axes object (supplied as a scalar)
laby
— a label for the y-axis. Character. (default: '')
Example: 'laby','scaled residuals'
Data Types: char
lwdenv
— width of the lines associated with the envelopes. Scalar. Default is lwdenv=1.
Example: 'lwdenv',2
Data Types: double
MarkerSize
— size of the marker in points. Scalar. The default value for MarkerSize is 6 points (1 point = 1/72 inch).
Example: 'MarkerSize',10
Data Types: double
MarkerFaceColor
— marker fill color. 'none' | 'auto' | RGB triplet | color string. Fill color for markers that are closed shapes (circle, square, diamond, pentagram, hexagram, and the four triangles).
Example: 'MarkerFaceColor','b'
Data Types: char
nameX
— regressor labels. Cell array of strings of length p containing the labels of the variables of the regression dataset. If it is empty (default), the sequence X1, ..., Xp will be created automatically.
Example: 'nameX',{'Age','Income','Married','Profession'}
Data Types: cell
namey
— response label. Character containing the label of the response. If it is empty (default), the label 'y' will be used.
Example: 'namey','response'
Data Types: char
numlab
— number of points to be identified in plots. [] | cell ({5} default) | numeric vector | structure.
NUMLAB IS A CELL.
If numlab is a cell containing scalar k, the units with the k largest residuals are labelled in the plots.
The default value of numlab is {5}, that is, the units with the 5 largest residuals are labelled.
For no labelling leave it empty.
NUMLAB IS A VECTOR.
If numlab is a vector, the units inside vector numlab are labelled in the plots.
NUMLAB IS A STRUCTURE.
If numlab is a structure, it is also possible to control the font size of the labels of the identified points. It contains the following fields:
Value | Description |
---|---|
numlab | number of points to be identified (cell or vector, see above) |
FontSize | font size of the labels of the points. The default value is 12. |
Example: 'numlab',[3,10,35]
Data Types: double
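The cell and vector forms can be related with a short sketch: it ranks simulated residuals by absolute value, which is how "largest" is read in the stack loss example on this page, and passes the resulting row numbers as a numeric vector. The data and the names `r`, `k`, `order` and `units` are illustrative assumptions. The sketch is not a description of how resindexplot selects units internally.

```matlab
rng(2);
r = randn(100,1);      % illustrative residuals
k = 3;                 % number of units to label

% Rank units by absolute residual, largest first
[~, order] = sort(abs(r), 'descend');
units = order(1:k)';   % row numbers of the k most extreme residuals
disp(units);

% Passing a numeric vector labels exactly these units
% (the call is skipped if FSDA is not on the MATLAB path)
if exist('resindexplot','file') == 2
    resindexplot(r, 'numlab', units);
end
```

The vector form is handy when the units of interest come from another analysis and are not the ones with the most extreme residuals.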
SizeAxesNum
— scalar which controls the font size of the numbers of the axes. Default value is 10.
Example: 'SizeAxesNum',10
Data Types: double
tag
— figure tag. Character. Tag of the figure which will host the resindexplot. The default tag is pl_resindex. This implies that if you call function resindexplot twice without specifying the tag, the second plot will overwrite the first. In order to have two figures with resindexplot, please use option tag.
Example: 'tag','indexPlot'
Data Types: character
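A minimal sketch of keeping two index plots open at the same time, assuming FSDA is on the MATLAB path. The residuals are simulated and the tag names 'pl_first' and 'pl_second' are arbitrary choices made for this example.

```matlab
rng(3);
r1 = randn(60,1);      % first set of illustrative residuals
r2 = randn(60,1);      % second set of illustrative residuals

% Distinct tags, so that the second call does not overwrite the first figure
% (the calls are skipped if FSDA is not on the MATLAB path)
if exist('resindexplot','file') == 2
    resindexplot(r1, 'tag', 'pl_first',  'title', 'First set of residuals');
    resindexplot(r2, 'tag', 'pl_second', 'title', 'Second set of residuals');
end
```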
title
— a label containing the title of the plot. Character. Default value is 'Index plot of residuals'.
Example: 'title','scaled residuals'
Data Types: char
x
— the vector to be plotted on the x-axis. Numeric vector. By default the sequence 1:length(residuals) will be used.
Example: 'x',1:100
Data Types: double
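The option x is what allows residuals to be plotted against a variable other than the index number. The sketch below uses simulated data, and the variable `xvar` and the axis labels are assumptions made for illustration. The options 'labx' and 'laby' are used as in the examples at the top of this page.

```matlab
rng(4);
n = 100;
xvar = 10*rand(n,1);   % hypothetical variable to put on the x-axis
r = randn(n,1);        % illustrative residuals

% Residuals versus xvar instead of versus 1:n
% (the call is skipped if FSDA is not on the MATLAB path)
if exist('resindexplot','file') == 2
    resindexplot(r, 'x', xvar, 'labx', 'Explanatory variable', 'laby', 'Residuals');
end
```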
xlimx
— vector with two elements controlling minimum and maximum on the x-axis. Default value is '' (automatic scale).
Example: 'xlimx',[-5 5]
Data Types: double
ylimy
— vector with two elements controlling minimum and maximum on the y-axis. Default value is '' (automatic scale).
Example: 'ylimy',[-5 5]
Data Types: double
Rousseeuw P.J., Leroy A.M. (1987), "Robust regression and outlier detection", Wiley.