resindexplot

resindexplot plots the residuals from a regression analysis versus index number or any other variable.

Syntax

• resindexplot(residuals)
• resindexplot(residuals,Name,Value)

Description

resindexplot(residuals) Residual plot of 100 random numbers.

resindexplot(residuals, Name, Value) Compare OLS residuals with robust residuals for the stack loss data.

Examples

Residual plot of 100 random numbers.

resindexplot(randn(100,1))

Compare OLS residuals with robust residuals for the stack loss data.

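% stack_loss is assumed to be in the workspace as a numeric matrix
% (response in column 4, regressors in columns 1 to 3)
y=stack_loss(:,4);
X=stack_loss(:,1:3);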
% Define confidence level
conflev=[0.95,0.99];
figure;
h1=subplot(2,1,1);
% Compute studentized residuals (deletion residuals)
stats=regstats(y,X,'linear',{'standres','studres'});
resindexplot(stats.studres,'h',h1,'conflev',conflev,'labx','Index number','laby','Deletion residuals');
% Compute robust residuals
[out]=LXS(y,X,'nsamp',0,'rew',1,'lms',0);
h2=subplot(2,1,2);
resindexplot(out.residuals,'h',h2,'conflev',conflev,'labx','Index number','laby','Robust LTS reweighted residuals');

Related Examples

Just plot robust residuals.

y=stack_loss(:,4);
X=stack_loss(:,1:3);
[out]=LXS(y,X,'nsamp',0,'rew',1,'lms',0);
bonfconf = 1-0.01/size(y,1);    % 99% Bonferronised
resindexplot(out.residuals,'conflev',[0.95,0.99,bonfconf],'labx','Index number','laby','Robust LTS reweighted residuals');
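
The Bonferroni level used in this example can be checked by hand. The lines below are a minimal sketch, assuming n = 21 (the number of observations in the stack loss data) and an overall level of 1%; the variable name alpha is introduced here only for illustration.

```matlab
n = 21;                    % number of observations in the stack loss data
alpha = 0.01;              % overall (simultaneous) significance level
bonfconf = 1 - alpha/n;    % per-unit confidence level after Bonferroni correction
fprintf('Bonferroni confidence level for n=%d: %.6f\n', n, bonfconf);
```

Because the overall level is split across the n units, the resulting band is wider than the nominal 99% band.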

Interactive example 1.

databrush=struct;
databrush.selectionmode='Brush'; % Brush selection
databrush.persist='on'; % Enable repeated mouse selections
databrush.Label='on'; % Write labels of the units while selecting
databrush.RemoveLabels='on'; % Remove labels after selection
databrush.RemoveTool    = 'on'; % Remove yellow tool after selection
databrush.RemoveFlagged = 'on'; % Remove filled red color for selected points after selection
y=stack_loss(:,4);
X=stack_loss(:,1:3);
[out]=LXS(y,X,'rew',1,'lms',0,'yxsave',1);
resindexplot(out,'databrush',databrush)
[outFS]=FSReda(y,X,out.bs);
resfwdplot(outFS,'databrush',databrush)

Example of usage of option numlab.

Write the row number of the units which have the 3 largest residuals (in absolute value).

y=stack_loss(:,4);
X=stack_loss(:,1:3);
[out]=LXS(y,X,'nsamp',1000);
resindexplot(out.residuals,'numlab',{3});

First example in which numlab is passed as structure.

In this case we control the FontSize of the associated labels.

numlab=struct;
% Set a font size for the labels equal to 20
numlab.FontSize=20;
resindexplot(randn(100,1),'numlab',numlab)

Second example in which numlab is passed as structure.

In this case we control both the number of units to label and also the FontSize of the associated labels.

numlab=struct;
% Show just the two most important residuals.
numlab.numlab={2};
% Set a font size for the labels equal to 20
numlab.FontSize=20;
resindexplot(randn(100,1),'numlab',numlab)

Input Arguments

residuals — residuals to plot. Numeric vector or structure.

If residuals is a vector it contains the n residuals.

If residuals is a structure, it contains the following fields:

Field        Description
residuals    vector of residuals (compulsory field)
y            response (compulsory field if interactive brushing is used)
X            n-by-p matrix containing explanatory variables (compulsory field if interactive brushing is used)

Data Types: single | double | struct
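
As an illustration of the structure input, the sketch below builds the three fields by hand from simulated data instead of taking them from a fitting function. The data, the OLS fit and the name in are illustrative assumptions; in particular, this page does not say whether X should include a column of ones, so the regressors are passed here without it.

```matlab
% Simulated regression data (illustrative only)
rng(1);
n = 50;
X = randn(n,3);
y = 1 + X*[1;2;3] + randn(n,1);

% OLS fit with intercept; residuals scaled by the residual standard error
Xc = [ones(n,1) X];
beta = Xc\y;
res = y - Xc*beta;
s = sqrt(sum(res.^2)/(n-size(Xc,2)));

% Structure input: residuals is compulsory, y and X are needed only for brushing
in = struct;
in.residuals = res/s;
in.y = y;
in.X = X;   % assumption: regressors without the column of ones

resindexplot(in)
```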

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'h',h1 where h1=subplot(2,1,1), 'x',1:100, 'labx','row index', 'laby','scaled residuals', 'title','scaled residuals', 'numlab',[3,10,35], 'conflev',[0.95,0.99,0.999], 'FontSize',10, 'SizeAxesNum',10, 'ylimy',[-5 5], 'xlimx',[-5 5], 'lwdenv',2, 'MarkerSize',10, 'MarkerFaceColor','b', 'databrush',1, 'nameX',{'Age','Income','Married','Profession'}, 'namey','response', 'tag','indexPlot'

h — the axes handle of the figure that will host the resindexplot. Axes object.

This can be used to host the resindexplot in a subplot of a complex figure formed by different panels (for example, one panel with residuals from a classical OLS estimator and another with residuals from a robust regression: see the stack loss example in the Examples section).

Example: 'h',h1 where h1=subplot(2,1,1)

Data Types: Axes object (supplied as a scalar)

x — the vector to be plotted on the x-axis. Numeric vector.

By default, the sequence 1:length(residuals) is used.

Example: 'x',1:100

Data Types: double
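
The 'x' option is what allows the residuals to be plotted against a variable other than the index number. The sketch below plots scaled OLS residuals against the fitted values; the simulated data and the variable names are illustrative assumptions, while 'x', 'labx' and 'laby' are the options documented on this page.

```matlab
% Simulated regression data (illustrative only)
rng(2);
n = 100;
X = randn(n,2);
y = 1 + X*[2;-1] + randn(n,1);

% OLS fit with intercept, fitted values and scaled residuals
Xc = [ones(n,1) X];
beta = Xc\y;
fitted = Xc*beta;
res = y - fitted;
scaledres = res/sqrt(sum(res.^2)/(n-size(Xc,2)));

% Residuals versus fitted values instead of versus index number
resindexplot(scaledres,'x',fitted,'labx','Fitted values','laby','Scaled residuals')
```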

labx — a label for the x-axis. Character.

(default: '')

Example: 'labx','row index'

Data Types: char

laby — a label for the y-axis. Character.

(default: '')

Example: 'laby','scaled residuals'

Data Types: char

title — the title of the plot. Character.

The default value is 'Index plot of residuals'.

Example: 'title','scaled residuals'

Data Types: char

numlab — number of points to be identified in plots. [] | cell ({5} default) | numeric vector | structure.

NUMLAB IS A CELL.

If numlab is a cell containing the scalar k, the units with the k largest residuals (in absolute value) are labelled in the plot.

The default value of numlab is {5}, that is the units with the 5 largest residuals are labelled.

For no labelling, leave it empty.

NUMLAB IS A VECTOR.

If numlab is a vector, the units whose row numbers are listed in numlab are labelled in the plot.

NUMLAB IS A STRUCTURE.

If numlab is a structure, it is also possible to control the font size of the labels of the identified points. It contains the following fields:

Field       Description
numlab      number of points to be identified (cell or vector, see above)
FontSize    font size of the labels of the points. The default value is 12.

Example: 'numlab',[3,10,35]

Data Types: double | cell | struct
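
To make the cell form concrete, the sketch below shows which rows would be labelled for k = 3 under the rule stated above (largest residuals in absolute value). It does not call resindexplot and is not taken from its source code; the residual values are made up for illustration.

```matlab
res = [0.3; -2.5; 1.1; 4.0; -0.2; -3.1];   % illustrative residuals
k = 3;                                     % as in 'numlab',{3}

% Rank the units by absolute residual, largest first
[~, order] = sort(abs(res), 'descend');
labelled = order(1:k);                     % row numbers of the k largest |residuals|

disp(labelled')                            % displays 4 6 2
```

Passing the same row numbers as a numeric vector, 'numlab',[4 6 2], corresponds to the vector form of the option.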

conflev — confidence levels for the horizontal bands. Numeric vector.

It can be a vector containing several confidence levels.

Remark: the confidence bands are based on the chi^2 distribution.

Example: 'conflev',[0.95,0.99,0.999]

Data Types: double
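
The remark about the chi^2 distribution can be illustrated numerically. The sketch below assumes that the bands are drawn at plus and minus the square root of the chi^2 quantile with 1 degree of freedom; the number of degrees of freedom is an assumption, since this page does not state it. Under that assumption the bands coincide with the two-sided standard normal quantiles. Only base functions (gammaincinv, erfinv) are used, so no additional toolbox is needed.

```matlab
conflev = [0.95 0.99 0.999];

% chi^2 quantile with 1 degree of freedom: chi2inv(p,1) = 2*gammaincinv(p,1/2)
bandChi2 = sqrt(2*gammaincinv(conflev, 0.5));

% Two-sided standard normal quantile: norminv((1+p)/2) = sqrt(2)*erfinv(p)
bandNormal = sqrt(2)*erfinv(conflev);

% The two rows agree; each column is the half-width of one band
disp([bandChi2; bandNormal])
```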

FontSize — font size of the labels of the axes. Scalar.

The default value is 12.

Example: 'FontSize',10

Data Types: double

SizeAxesNum — font size of the numbers of the axes. Scalar.

The default value is 10.

Example: 'SizeAxesNum',10

Data Types: double

ylimy — minimum and maximum of the y-axis. Vector with two elements.

The default is '' (automatic scale).

Example: 'ylimy',[-5 5]

Data Types: double

xlimx — minimum and maximum of the x-axis. Vector with two elements.

The default is '' (automatic scale).

Example: 'xlimx',[-5 5]

Data Types: double

lwdenv — width of the lines associated with the envelopes. Scalar.

The default is lwdenv=1.

Example: 'lwdenv',2

Data Types: double

MarkerSize — size of the marker in points. Scalar.

The default value for MarkerSize is 6 points (1 point = 1/72 inch).

Example: 'MarkerSize',10

Data Types: double

MarkerFaceColor — marker fill color. 'none' | 'auto' | RGB triplet | color string.

Fill color for markers that are closed shapes (circle, square, diamond, pentagram, hexagram, and the four triangles).

Example: 'MarkerFaceColor','b'

Data Types: char

databrush — interactive mouse brushing. Empty value | scalar | structure.

If databrush is an empty value (default), no brushing is done. Activating this option (databrush is a scalar or a structure) enables the user to select a set of units in the current plot and to see them highlighted in the y|X plot, i.e. a matrix of scatter plots of y against each column of X, grouped according to the selection(s) done by brushing. If the y|X plot does not exist, it is created automatically. Note that the window style of the other figures is set equal to that of the figure which contains the residual plot. In other words, if the residual plot is docked, all the other figures will be docked too.

DATABRUSH IS A SCALAR. If databrush is a scalar, the default selection tool is a rectangular brush and it is possible to brush only once (that is persist='').

DATABRUSH IS A STRUCTURE. If databrush is a structure, it is possible to use all optional arguments of function selectdataFS and the following fields:

persist

Repeated brushing enabled. persist is an empty value or one of the strings 'on' or 'off'.

The default value of persist is '', that is brushing is allowed only once.

If persist is 'on' or 'off', brushing can be done as many times as the user requires.

If persist='on', the unit(s) currently brushed are added to those previously brushed. Every time a new brushing is done, it is possible to use a different color for the brushed units.

If persist='off', every time a new brushing is performed the units previously brushed are removed.

labeladd

Character. [] (default) | '1'.

If databrush.labeladd='1', we label the units of the last selected group with the unit row index in matrices X and y. The default value is labeladd='', i.e. no label is added.

bivarfit

This option adds one or more least squares lines, based on SIMPLE REGRESSION, to the plots of y|X, depending on the selected groups.

bivarfit = '' is the default: no line is fitted.

bivarfit = '1' fits a single OLS line to all points of each bivariate plot in the scatter matrix y|X.

bivarfit = '2' fits two OLS lines: one to all points and another to the last selected group. This is useful when there are only two groups, one of which refers to a set of potential outliers.

bivarfit = '0' fits one OLS line for each selected group. This is useful for the purpose of fitting mixtures of regression lines.

bivarfit = 'i1' or 'i2' or 'i3' etc. fits an OLS line to a specific group, the one with index 'i' equal to 1, 2, 3 etc.

multivarfit

This option adds one or more least squares lines, based on MULTIVARIATE REGRESSION of y on X, to the plots of y|Xi.

multivarfit = '' is the default: no line is fitted.

multivarfit = '1' fits a single OLS line to all points of each bivariate plot in the scatter matrix y|X. The line added to the scatter plot y|Xi is avconst + Ci*Xi, where Ci is the coefficient of Xi in the multivariate regression and avconst is the effect of all the other explanatory variables different from Xi evaluated at their centroid (that is, \overline{y}'C).

multivarfit = '2' is exactly equal to multivarfit = '1', but this time we add the line based on the group of unselected observations.

Example: 'databrush',1

Data Types: single | double | struct

nameX — regressor labels. Cell array of strings of length p containing the labels of the variables of the regression dataset.

If it is empty (default) the sequence X1, ..., Xp will be created automatically.

Example: 'nameX',{'Age','Income','Married','Profession'}

Data Types: cell

namey — response label. Character.

Character containing the label of the response. If it is empty (default) label 'y' will be used.

Example: 'namey','response'

Data Types: char

tag — figure tag. Character.

Tag of the figure which will host the resindexplot. The default tag is pl_resindex.

Example: 'tag','indexPlot'

Data Types: char

Output Arguments

References

Rousseeuw, P.J. and Leroy, A.M. (1987), "Robust regression and outlier detection", Wiley.