malindexplot

malindexplot plots the Mahalanobis distances versus a selected variable.

Syntax

• MCDenv=malindexplot(md,v)
• MCDenv=malindexplot(md,v,Name,Value)

Description

 MCDenv =malindexplot(md, v) Mahalanobis distance plot of 100 random numbers.

 MCDenv =malindexplot(md, v, Name, Value) Compare traditional md with robust md for the stack loss data.

Examples

Mahalanobis distance plot of 100 random numbers.

The numbers are drawn from the chi-squared distribution with 5 degrees of freedom.

MCDenv=malindexplot(chi2rnd(5,100,1),5);

Compare traditional md with robust md for the stack loss data.

load('stack_loss.txt');
X=stack_loss(:,1:3);
[n,v]=size(X);
% Define confidence level
conflev=[0.95,0.99];
figure;
h1=subplot(2,1,1);
% Compute traditional Mahalanobis distances
mdtrad=mahal(X,X);
malindexplot(mdtrad,v,'h',h1,'conflev',conflev,'labx','Index number','laby','Traditional md');
% Compute robust md
[out]=FSM(X,'init',5,'plots',0);
seq=1:size(X,1);
good=setdiff(seq,out.outliers);
mdrob=mahal(X,X(good,:));
h2=subplot(2,1,2);
malindexplot(mdrob,v,'h',h2,'conflev',conflev,'labx','Index number','laby','Robust md','title','');

Related Examples

Interactive example 1. Index plot Mahalanobis distance with databrush option.

n=200;
v=3;
randn('state', 123456);
Y=randn(n,v);
% Contaminated data
Ycont=Y;
Ycont(1:5,:)=Ycont(1:5,:)+3;
[RAW,REW]=mcd(Ycont);
RAW.Y=Ycont;
malindexplot(RAW,v,'databrush',1)

Interactive example 2. Index plot Mahalanobis distance with personalized databrush option.

n=200;
v=3;
randn('state', 123456);
Y=randn(n,v);
% Contaminated data
Ycont=Y;
Ycont(1:5,:)=Ycont(1:5,:)+3;
[RAW,REW]=mcd(Ycont);
RAW.Y=Ycont;
databrush=struct;
databrush.selectionmode='Brush'; % Brush selection
databrush.persist='on'; % Enable repeated mouse selections
databrush.Label='on'; % Write labels of the units while selecting
databrush.RemoveLabels='on'; % Remove labels after selection
databrush.RemoveTool    = 'off'; % Do not remove yellow tool after selection
databrush.RemoveFlagged = 'off'; % Do not remove filled red color for selected points after selection
databrush.labeladd = '1'; % Write number of selected units in the scatter plot matrix
malindexplot(RAW,v,'databrush',databrush)

Input Arguments

md — Mahalanobis distances. Vector or structure.

Vector of Mahalanobis distances (in squared units) or a structure containing fields md and Y. In this second case md is a structure with the following fields:

• md: contains the Mahalanobis distances (this field is compulsory).

• Y: contains the original data matrix whose Mahalanobis distances have been computed (this field is compulsory if option databrush is used).

• class: this field is not compulsory. If md.class='mcdCorAna', simulated envelopes are used to define the empirical quantiles. Note that if the simulated bands have been precalculated, they can be passed through the second input argument v.

Data Types: single|double
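
As a minimal sketch of the structure form of md, the fields can be filled by hand. The variable names S and Y are illustrative, and the classical distances returned by mahal stand in for whichever estimator actually produced the distances.

```matlab
% Illustrative data: 50 units, 3 variables (names S and Y are hypothetical)
rng(1);
Y = randn(50,3);

S = struct;
S.md = mahal(Y,Y);   % squared Mahalanobis distances (compulsory field)
S.Y  = Y;            % original data (needed only when databrush is used)

% The structure would then take the place of the plain vector, e.g.
% malindexplot(S,size(Y,2),'databrush',1)
```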

v — Number of variables or matrix of size n-by-k containing empirical envelope. Scalar or matrix with length(md) rows.

If v is a scalar, it contains the number of variables of the original data matrix which have been used to compute md. The threshold in this case is based on the Chi^2 distribution with v degrees of freedom. If v is a matrix with size(v,1)=length(md), the precalculated empirical envelopes in v are used to obtain the confidence bands.

Data Types: single|double
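
When v is a scalar, the thresholds can be reproduced outside the plot as chi-squared quantiles. The sketch below assumes the bands are plain chi2inv quantiles at the requested confidence levels, which is consistent with the description above but has not been checked against the function's source. The names thresh and exceed are illustrative.

```matlab
v = 5;                          % number of variables (illustrative value)
conflev = [0.95 0.99];          % confidence levels
thresh = chi2inv(conflev,v);    % chi-squared quantiles with v degrees of freedom

md = chi2rnd(v,100,1);          % simulated squared distances, as in the first example
exceed = find(md > thresh(1));  % indices of units above the 95% band
```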

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as  Name1,Value1,...,NameN,ValueN.

Example:  'h',gca , 'x',1:100 , 'labx','unit number' , 'laby','MD' , 'title','Index plot of MD' , 'numlab',{3} , 'conflev',0.99 , 'FontSize',12 , 'SizeAxesNum',12 , 'ylimy',[-3 3] , 'xlimx',[1 30] , 'lwdenv',4 , 'MarkerSize',4 , 'MarkerFaceColor','b' , 'tag','indexPlot' , 'databrush',1 , 'nameY',{'Y_1' 'Y_2'} , 'label',{'UK' ... 'IT'} 

h — Where to plot. Axis handle.

The axis handle of the figure that will host the malindexplot. This can be used to place the malindexplot in a subplot of a complex figure formed by different panels (e.g. one panel with the malindexplot from a classical MLE estimator and another with Mahalanobis distances from a robust analysis, as in the stack loss example).

Example:  'h',gca 

Data Types: graphics handle

x — x-axis index. Vector.

The vector to be plotted on the x-axis.

Default is the sequence 1:length(md).

Example:  'x',1:100 

Data Types: numeric

labx — x label. Character.

A label for the x-axis (default: '').

Example:  'labx','unit number' 

Data Types: character

laby — y label. Character.

A label for the y-axis (default: '').

Example:  'laby','MD' 

Data Types: character

title — Plot title. Character.

A label containing the title of the plot.

Default is 'Index plot of Mahalanobis distances'.

Example:  'title','Index plot of MD' 

Data Types: character

numlab — Number of points to be labelled in the plot. Vector | cell.

If numlab is a cell containing scalar k, the units with the k largest md are labelled in the plots.

If numlab is a vector, the units indexed by the vector are labelled in the plot.

Default is numlab={5}, that is units with the 5 largest md are labelled.

Use numlab='' for no labelling.

Example:  'numlab',{3} 

Data Types: numeric vector or cell.
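
The cell form of numlab amounts to picking the units with the k largest distances. A minimal sketch of that selection, using made-up distances and the hypothetical variable names idx and labelled:

```matlab
md = [2.1 9.7 0.4 15.2 3.3 7.8]';  % made-up squared distances
k = 3;                              % corresponds to numlab={3}
[~,idx] = sort(md,'descend');       % order units from largest to smallest md
labelled = idx(1:k);                % units that would carry a label: 4, 2, 6
```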

conflev — Confidence levels for the horizontal bands. Vector.

It can be a vector of different confidence levels, e.g. [0.95,0.99,0.999]. The bands are based on the chi^2 distribution.

Example:  'conflev',0.99 

Data Types: numeric

FontSize — Labels font size. Scalar.

Scalar which controls the font size of the labels of the axes.

Default value is 12.

Example:  'FontSize',12 

Data Types: numeric

SizeAxesNum — Numbers font size. Scalar.

Scalar which controls the fontsize of the numbers of the axes.

Default value is 10.

Example:  'SizeAxesNum',12 

Data Types: numeric

ylimy — y limits. Vector.

Vector with two elements controlling minimum and maximum value of the y axis.

Default is '' (automatic scale).

Example:  'ylimy',[-3 3] 

Data Types: numeric

xlimx — x limits. Vector.

Vector with two elements controlling minimum and maximum value of the x axis.

Default is '' (automatic scale).

Example:  'xlimx',[1 30] 

Data Types: numeric

lwdenv — Envelope line width. Scalar.

Scalar which controls the width of the lines associated with the envelopes.

Default is lwdenv=1.

Example:  'lwdenv',4 

Data Types: numeric

MarkerSize — Marker size of points. Scalar.

Scalar specifying the size of the marker in points (1 point = 1/72 inch).

Default is MarkerSize = 6.

Example:  'MarkerSize',4 

Data Types: numeric

MarkerFaceColor — Marker fill color of points. Character | RGB numeric vector of length 3.

The fill color for markers that are closed shapes (circle, square, diamond, pentagram, hexagram, and the four triangles).

Example:  'MarkerFaceColor','b' 

Data Types: numeric | character

tag — Figure tag. Character.

Tag of the figure which will host the malindexplot.

The default tag is pl_malindex.

Example:  'tag','indexPlot' 

Data Types: character

databrush — Interactive mouse brushing. Empty value | scalar | structure.

If databrush is an empty value (default), no brushing is done. Activating this option (databrush is a scalar or a structure) lets the user select a set of points in the current plot and see them highlighted in the scatter plot matrix (spm). If the spm does not exist, it is created automatically.

DATABRUSH IS A SCALAR.

If databrush is a scalar the default selection tool is a rectangular brush and it is possible to brush only once (that is persist='').

DATABRUSH IS A STRUCTURE.

If databrush is a structure, it is possible to use all optional arguments of function selectdataFS and the following optional arguments:

databrush.persist = persistent brushing.

persist is an empty value or one of the strings 'on' or 'off'.

The default value of persist is '', that is brushing is allowed only once.

If persist is 'on' or 'off', brushing can be done as many times as the user requires.

If persist='on', the unit(s) currently brushed are added to those previously brushed. Every time a new brushing is done, it is possible to use a different color for the brushed units.

If persist='off' every time a new brush is performed units previously brushed are removed.

databrush.labeladd = add labels. If this option is '1', we label in the scatter plot matrix the units of the last selected group with the unit row index in matrix Y. The default value is labeladd='', i.e. no label is added.

REMARK: the options which follow work in connection with previous option databrush and produce their effect on the scatter plot matrix of the original data.

Example:  'databrush',1 

Data Types: single | double | struct

nameY — Variable labels of the original data matrix. Cell.

Cell array of strings containing the labels of the variables. By default, the labels which are added are Y1, ..., Yv. This option is used only if the previous option databrush is not empty.

Example:  'nameY',{'Y_1' 'Y_2'} 

Data Types: character

label — Row labels. Cell.

Cell of length n containing the labels of the rows.

Example:  'label',{'UK' ... 'IT'} 

Data Types: cell

Output Arguments

MCDenv — Empirical envelopes. Array.

Matrix of size n-by-length(conflev) which contains the empirical confidence envelopes, or vector of length length(conflev) containing the quantiles of the reference distribution.

References

Rousseeuw P.J., Leroy A.M. (1987), "Robust regression and outlier detection", Wiley.

This page has been automatically generated by our routine publishFS