malindexplot

malindexplot plots the Mahalanobis distances versus a selected variable.

Syntax

  • MCDenv=malindexplot(md,v)
  • MCDenv=malindexplot(md,v,Name,Value)

Description

MCDenv=malindexplot(md, v) Mahalanobis distance plot of 100 random numbers.

MCDenv=malindexplot(md, v, Name, Value) Compare traditional md with robust md for the stack loss data.

Examples

  • Mahalanobis distance plot of 100 random numbers.
  • The numbers are drawn from the chi2 distribution with 5 degrees of freedom.

    MCDenv=malindexplot(chi2rnd(5,100,1),5);

  • Compare traditional md with robust md for the stack loss data.

    load('stack_loss.txt');
    X=stack_loss(:,1:3);
    [n,v]=size(X);
    % Define confidence level
    conflev=[0.95,0.99];
    figure;
    h1=subplot(2,1,1);
    % Compute traditional Mahalanobis distances
    mdtrad=mahal(X,X);
    malindexplot(mdtrad,v,'h',h1,'conflev',conflev,'labx','Index number','laby','Traditional md');
    % Compute robust md
    [out]=FSM(X,'init',5,'plots',0);
    seq=1:size(X,1);
    good=setdiff(seq,out.outliers);
    mdrob=mahal(X,X(good,:));
    h2=subplot(2,1,2);
    malindexplot(mdrob,v,'h',h2,'conflev',conflev,'labx','Index number','laby','Robust md','title','');

Related Examples

  • Interactive example 1. Index plot Mahalanobis distance with databrush option.

    n=200;
    v=3;
    rng(123456); % Seed the random number generator for reproducibility
    Y=randn(n,v);
    % Contaminated data
    Ycont=Y;
    Ycont(1:5,:)=Ycont(1:5,:)+3;
    [RAW,REW]=mcd(Ycont);
    RAW.Y=Ycont;
    malindexplot(RAW,v,'databrush',1)

  • Interactive example 2. Index plot Mahalanobis distance with personalized databrush option.

    n=200;
    v=3;
    rng(123456); % Seed the random number generator for reproducibility
    Y=randn(n,v);
    % Contaminated data
    Ycont=Y;
    Ycont(1:5,:)=Ycont(1:5,:)+3;
    [RAW,REW]=mcd(Ycont);
    RAW.Y=Ycont;
    databrush=struct;
    databrush.selectionmode='Brush'; % Brush selection
    databrush.persist='on'; % Enable repeated mouse selections
    databrush.Label='on'; % Write labels of the units while selecting
    databrush.RemoveLabels='on'; % Remove labels after selection
    databrush.RemoveTool    = 'off'; % Do not remove yellow tool after selection
    databrush.RemoveFlagged = 'off'; % Do not remove filled red color for selected points after selection
    databrush.labeladd = '1'; % Label the selected units with their row index in the scatter plot matrix
    malindexplot(RAW,v,'databrush',databrush)

Input Arguments

    md — Mahalanobis distances. Vector or structure.

    Vector of Mahalanobis distances (in squared units) or a structure containing fields md and Y. In this second case md is a structure with the following fields:

    Field   Description
    md      Mahalanobis distances (this field is compulsory).
    Y       Original data matrix whose Mahalanobis distances have been computed (this field is compulsory if option databrush is used).
    class   Optional field. If md.class='mcdCorAna', simulated envelopes are used to define the empirical quantiles. If the simulated bands have been precalculated, they can be passed through the second input argument v.

    Data Types: single|double
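
    As a minimal sketch of the structure form of md, the snippet below builds the two fields described above from simulated data. The data matrix Y and the variable name mdStruct are hypothetical, and mahal (Statistics Toolbox) is used only as one possible way to obtain squared distances. The call to malindexplot is left commented out because it requires FSDA on the path.

```matlab
% Illustrative sketch: build the structure form of the md input.
rng(0);                      % reproducible random data
Y = randn(50, 3);            % hypothetical data matrix: 50 units, 3 variables
mdStruct = struct;
mdStruct.md = mahal(Y, Y);   % squared Mahalanobis distances (compulsory field)
mdStruct.Y  = Y;             % original data, needed only if databrush is used
% With FSDA on the path, the structure can be passed as the first argument:
% malindexplot(mdStruct, size(Y, 2));
```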

    v — Number of variables or matrix of size n-by-k containing empirical envelopes. Scalar or matrix with length(md) rows.

    If v is a scalar, it contains the number of variables of the original data matrix which have been used to compute md. The threshold in this case is based on the Chi^2 distribution with v degrees of freedom. If v is a matrix with size(v,1)=length(md), the precalculated empirical envelopes in v are used to obtain the confidence bands.

    Data Types: single|double
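
    To make the scalar case concrete, the sketch below shows how cut-offs based on the Chi^2 distribution with v degrees of freedom can be computed with chi2inv (Statistics Toolbox). It illustrates the idea only and is not necessarily the exact computation performed inside malindexplot. The variable names thresh and aboveBand are hypothetical.

```matlab
% Illustrative sketch: Chi^2 cut-offs for squared Mahalanobis distances.
v = 5;                          % number of variables used to compute md
conflev = [0.95 0.99];          % confidence levels of the horizontal bands
thresh = chi2inv(conflev, v);   % one cut-off per confidence level
rng(1);                         % reproducible random data
md = chi2rnd(v, 100, 1);        % simulated squared distances, 100-by-1
% Compare every distance with every cut-off (implicit expansion, R2016b+):
aboveBand = md > thresh;        % 100-by-2 logical matrix
nAbove = sum(aboveBand, 1);     % number of units above each band
```

    With v=5, the two cut-offs are approximately 11.07 (95%) and 15.09 (99%).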

Name-Value Pair Arguments

    Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

    Example: 'h',gca , 'x',1:100 , 'labx','unit number' , 'laby','MD' , 'title','Index plot of MD' , 'numlab',{3} , 'conflev',0.99 , 'FontSize',12 , 'SizeAxesNum',12 , 'ylimy',[-3 3] , 'xlimx',[1 30] , 'lwdenv',4 , 'MarkerSize',4 , 'MarkerFaceColor','b' , 'tag','indexPlot' , 'databrush',1 , 'nameY',{'Y_1' 'Y_2'} , 'label',{'UK' ... 'IT'}

    h — Where to plot. Axis handle.

    Handle of the axes in which to draw the malindexplot. This can be used to host the malindexplot in a subplot of a complex figure formed by different panels (e.g. one panel with Mahalanobis distances from a classical (maximum likelihood) estimator and another with Mahalanobis distances from a robust analysis, as in the stack loss example).

    Example: 'h',gca

    Data Types: graphics handle

    x — x-axis index. Vector.

    The vector to be plotted on the x-axis.

    Default is the sequence 1:length(md).

    Example: 'x',1:100

    Data Types: numeric

    labx — x label. Character.

    A label for the x-axis (default: '').

    Example: 'labx','unit number'

    Data Types: character

    laby — y label. Character.

    A label for the y-axis (default: '').

    Example: 'laby','MD'

    Data Types: character

    title — Plot title. Character.

    A label containing the title of the plot.

    Default is 'Index plot of Mahalanobis distances'.

    Example: 'title','Index plot of MD'

    Data Types: character

    numlab — Number of points to be labelled in the plot. Vector | cell.

    If numlab is a cell containing scalar k, the units with the k largest md are labelled in the plots.

    If numlab is a vector, the units indexed by the vector are labelled in the plot.

    Default is numlab={5}, that is units with the 5 largest md are labelled.

    Use numlab='' for no labelling.

    Example: 'numlab',{3}

    Data Types: numeric vector or cell.
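
    The rule for the cell form of numlab (label the units with the k largest md) can be sketched in plain MATLAB as follows. The distances and the names order and largest are hypothetical, and the snippet only illustrates which units would be picked, not how malindexplot draws the labels.

```matlab
% Illustrative sketch: find the units with the k largest distances.
md = [1.2; 9.8; 3.1; 7.4; 5.0];     % hypothetical squared distances
k = 2;                              % corresponds to 'numlab',{2}
[~, order] = sort(md, 'descend');   % unit indices from largest to smallest md
largest = order(1:k);               % indices of the k units to be labelled
```

    Here largest contains units 2 and 4, the two with the largest distances.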

    conflev — Confidence levels for the horizontal bands. Vector.

    It can be a vector of different confidence level values, e.g. [0.95,0.99,0.999]. The bands are based on the chi^2 distribution.

    Example: 'conflev',0.99

    Data Types: numeric

    FontSize — Labels font size. Scalar.

    Scalar which controls the font size of the labels of the axes.

    Default value is 12.

    Example: 'FontSize',12

    Data Types: numeric

    SizeAxesNum — Numbers font size. Scalar.

    Scalar which controls the font size of the numbers of the axes.

    Default value is 10.

    Example: 'SizeAxesNum',12

    Data Types: numeric

    ylimy — y limits. Vector.

    Vector with two elements controlling minimum and maximum value of the y axis.

    Default is '' (automatic scale).

    Example: 'ylimy',[-3 3]

    Data Types: numeric

    xlimx — x limits. Vector.

    Vector with two elements controlling minimum and maximum value of the x axis.

    Default is '' (automatic scale).

    Example: 'xlimx',[1 30]

    Data Types: numeric

    lwdenv — Envelope line width. Scalar.

    Scalar which controls the width of the lines associated with the envelopes.

    Default is lwdenv=1.

    Example: 'lwdenv',4

    Data Types: numeric

    MarkerSize — Marker size of points. Scalar.

    Scalar specifying the size of the marker in points (1 point = 1/72 inch).

    Default is MarkerSize = 6.

    Example: 'MarkerSize',4

    Data Types: numeric

    MarkerFaceColor — Marker fill color of points. Character | 3-element RGB numeric vector.

    The fill color for markers that are closed shapes (circle, square, diamond, pentagram, hexagram, and the four triangles).

    Example: 'MarkerFaceColor','b'

    Data Types: numeric | character

    tag — Figure tag. Character.

    Tag of the figure which will host the malindexplot.

    The default tag is pl_malindex.

    Example: 'tag','indexPlot'

    Data Types: character

    databrush — Interactive mouse brushing. Empty value | scalar | structure.

    If databrush is an empty value (default), no brushing is done. Activating this option (databrush is a scalar or a structure) enables the user to select a set of points in the current plot and to see them highlighted in the scatter plot matrix (spm). If the spm does not exist, it is created automatically.

    DATABRUSH IS A SCALAR.

    If databrush is a scalar the default selection tool is a rectangular brush and it is possible to brush only once (that is persist='').

    DATABRUSH IS A STRUCTURE.

    If databrush is a structure, it is possible to use all optional arguments of function selectdataFS and the following optional arguments:

    databrush.persist = persistent brushing.

    persist is an empty value or one of the strings 'on' or 'off'.

    The default value of persist is '', that is brushing is allowed only once.

    If persist is 'on' or 'off', brushing can be done as many times as the user requires.

    If persist='on', the unit(s) currently brushed are added to those previously brushed. Every time a new brushing is done, it is possible to use a different color for the brushed units.

    If persist='off' every time a new brush is performed units previously brushed are removed.

    databrush.labeladd = add labels. If this option is '1', we label in the scatter plot matrix the units of the last selected group with the unit row index in matrix Y. The default value is labeladd='', i.e. no label is added.

    REMARK: the options which follow work in connection with previous option databrush and produce their effect on the scatter plot matrix of the original data.

    Example: 'databrush',1

    Data Types: single | double | struct

    nameY — Variable labels of the original data matrix. Cell.

    Cell array of strings containing the labels of the variables. By default, the labels which are added are Y1, ..., Yv. This option is used only if option databrush is not empty.

    Example: 'nameY',{'Y_1' 'Y_2'}

    Data Types: character

    label — Row labels. Cell.

    Cell of length n containing the labels of the rows.

    Example: 'label',{'UK' ... 'IT'}

    Data Types: cell

Output Arguments

    MCDenv — Empirical envelopes. Array.

    Matrix of size n-by-length(conflev) which contains the empirical confidence envelopes, or vector of length length(conflev) containing the quantiles of the reference distribution.

References

    Rousseeuw P.J., Leroy A.M. (1987), "Robust regression and outlier detection", Wiley.