FSCorAna

FSCorAna performs automatic outlier detection based on the forward search in correspondence analysis

Syntax

Description

out = FSCorAna(N) runs FSCorAna with all default options (the input is the output of mcdCorAna).

out = FSCorAna(N, Name, Value) runs FSCorAna on an input contingency table, with options given as Name, Value pairs.

Examples

  • FSCorAna with all default options (input is output from mcdCorAna).
  • Generate contingency table of size 20-by-5 with total sum of n_ij=5000.

    I=20;
    J=5;
    n=5000;
    % nrowt = column vector containing row marginal totals
    nrowt=(n/I)*ones(I,1);
    % ncolt = row vector containing column marginal totals
    ncolt=(n/J)*ones(1,J);
    out1=rcontFS(I,J,nrowt,ncolt);
    N=out1.m144;
    RAW=mcdCorAna(N,'plots',0);
    ini=round(sum(sum(RAW.N))/4);
    out=FSCorAna(RAW);

  • FSCorAna with input contingency table.
  • Generate contingency table of size 50-by-5 with total sum of n_ij=2000.

    I=50;
    J=5;
    n=2000;
    % nrowt = column vector containing row marginal totals
    nrowt=(n/I)*ones(I,1);
    % ncolt = row vector containing column marginal totals
    ncolt=(n/J)*ones(1,J);
    out1=rcontFS(I,J,nrowt,ncolt);
    N=out1.m144;
    out=FSCorAna(N,'plots',1);

    Related Examples

  • Use pregenerated contingency tables to find envelopes for mmd.
  • load clothes.mat
    % Now FSCorAna uses the pregenerated tables coming from mcdCorAna.
    % Example of findEmpiricalEnvelope passed as a struct
    findEmp=struct;
    % Generate nsimul contingency tables
    findEmp.nsimul=1000;
    % Under the null hypothesis of independence
    findEmp.underH0=true;
    % Store the nsimul robust distance sorted (for each row)
    findEmp.StoreSim=true;
    RAW=mcdCorAna(clothes,'plots',0,'findEmpiricalEnvelope',findEmp);
    out=FSCorAna(RAW);
    Total estimated time to complete MCD:  0.17 seconds 
    Finding empirical bands
    Creating empirical confidence band for minimum (weighted) Mahalanobis distance
    Warning: interchange greater than 1 when m=1921
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1923
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1925
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1927
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1929
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1931
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1933
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1935
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1152
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1169
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1201
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1155
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1820
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1823
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1826
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1829
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1832
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1394
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1168
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1129
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1132
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1135
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1138
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1141
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1144
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1147
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1197
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1201
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1205
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1209
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1213
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1217
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1221
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1225
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1229
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1233
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1211
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1215
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1219
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1223
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1872
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1275
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1107
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1095
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1097
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1099
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1101
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1103
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1105
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1107
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1109
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1111
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1113
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1115
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1117
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1119
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1121
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1123
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1125
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1127
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1129
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1131
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1133
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1135
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1137
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1139
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1141
    Number of units which entered=3
    Warning: interchange greater than 1 when m=2425
    Number of units which entered=3
    Warning: interchange greater than 1 when m=2429
    Number of units which entered=3
    Warning: interchange greater than 1 when m=2433
    Number of units which entered=3
    Warning: interchange greater than 1 when m=2437
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1384
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1804
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1806
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1808
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1810
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1812
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1814
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1816
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1818
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1820
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1822
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1824
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1826
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1828
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1830
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1832
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1834
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1836
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1838
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1840
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1842
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1844
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1846
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1848
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1850
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1854
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1858
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1862
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1866
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1870
    Number of units which entered=4
    Warning: interchange greater than 1 when m=1874
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1878
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1882
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1886
    Number of units which entered=3
    Warning: interchange greater than 1 when m=1124
    Number of units which entered=3
    

    Input Arguments

    N — contingency table or structure. Array or table of size I-by-J or structure.

    If N is a structure it contains the following fields:

    Value Description
    N

    contingency table in array format of size I-by-J.

    Ntable

    optional field containing a table or a timetable. If this field is present, the row labels are taken from RAW.Ntable.Properties.RowTimes (for a timetable) or from RAW.Ntable.Properties.RowNames (for a table).

    loc

    initial location estimate for the matrix of row profiles of the contingency table (row vector of length J).

    weights

    I x 1 vector containing the proportion of the mass of each row of matrix N used in the computation of the MCD estimate of location. For example, N.weights(2)=0.1 means that row 2 of the contingency table contributes 10 per cent of its mass. The initial subset is based on N.weights.

    NsimStore

    array of size I*J-by-nsimul containing in each column one of the nsimul simulated contingency tables.

    simulateUnderH0

    boolean. If it is true the simulated contingency tables have been specified under H0.

    Note that input structure N can be conveniently created by function mcdCorAna.

    If N is not a struct, the rows of the contingency table forming the initial subset can be specified with input option bsb.
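As a rough illustration of the quantities behind the loc and weights fields, the sketch below computes row masses and row profiles from a small contingency table using base MATLAB only. The table values are invented for illustration, and the formulas are the standard correspondence analysis definitions rather than code taken from FSCorAna or mcdCorAna.

```matlab
% Hypothetical 4-by-3 contingency table (values invented for illustration)
N = [20 10 10;
     15 15 10;
      5 25 10;
     30  5  5];
n = sum(N(:));            % total number of units in the table
rowMass = sum(N,2)/n;     % I-by-1 vector of row masses (sums to 1)
% I-by-J matrix of row profiles, each row sums to 1
% (implicit expansion requires MATLAB R2016b or later)
Y = N./sum(N,2);
% Mass-weighted average profile: equals the column masses sum(N,1)/n
avgProfile = rowMass'*Y;
disp(avgProfile)
```

A location estimate such as N.loc lives in the same space as the rows of Y, which is why it is a row vector of length J.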

    Data Types: single|double

    Name-Value Pair Arguments

    Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

    Example: 'bsb',[3 6 8 10 12 14], 'conflev',0.99, 'init',50, 'mmdEnv',[], 'msg',false, 'resc',false, 'label',{'UK' ... 'IT'}, 'plots',0, 'addRowNames',false

    bsb — Initial subset. Vector of positive integers containing the indexes of the rows of the contingency table which have to be used to initialize the forward search.

    If bsb is empty and the required input argument N is a struct, N.loc will be used.

    If bsb is supplied and N is a struct, N.loc is ignored.

    The default value of bsb is empty; if N is not a struct, a random subset containing round(n/5) units will be used.

    Example: 'bsb',[3 6 8 10 12 14]

    Data Types: double

    conflev — Simultaneous confidence level used to declare units as outliers. Scalar inside (0, 1).

    The default value of conflev is 0.99, that is a 99 per cent simultaneous confidence level.

    Confidence levels are based on simulated contingency tables. This input argument is ignored if the optional input argument mmdEnv is not missing.

    Example: 'conflev',0.99

    Data Types: numeric

    init — Point where to start monitoring required diagnostics. Scalar.

    Note that if init is not specified it will be set equal to floor(n*0.6), where n is the total number of units in the contingency table.

    Example: 'init',50

    Data Types: double
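The default starting point can be worked out by hand from the table total. The sketch below does so in base MATLAB for a table with the same dimensions and total as the second example on this page; the equal cell counts are an assumption made only to have a concrete N, and the formula is simply the floor(n*0.6) rule stated above.

```matlab
% Dimensions and total as in the second example above
I = 50; J = 5; n = 2000;
% Hypothetical table with equal cell counts, used only to have a concrete N
N = (n/(I*J))*ones(I,J);
nTot = sum(N(:));               % total number of units in the table
initDefault = floor(nTot*0.6);  % default value of init when it is not supplied
disp(initDefault)
```

Note that n here counts units (the sum of all cells), not rows, so the default init is typically much larger than I.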

    mmdEnv — Matrix which contains the precalculated empirical envelopes of minimum Mahalanobis distance. Matrix | scalar missing value (default).

    If this optional input argument is not missing, the empirical envelopes are taken from it and are not calculated. The columns are:

    1st col = subset size;

    2nd col = 1 per cent simultaneous empirical envelope;

    3rd col = 50 per cent simultaneous empirical envelope;

    4th col = conflev per cent simultaneous empirical envelope, which is used to detect the outliers.

    The default value of mmdEnv is a missing value, that is the envelopes are based on the pregenerated contingency tables in N.NsimStore or, if N.NsimStore is not present, are generated assuming independence between rows and columns.

    Example: 'mmdEnv',[]

    Data Types: double
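One way the four-column layout described above could be obtained is from the mmd field of a previous FSCorAna output, whose columns 1, 3, 4 and 5 hold the search index and the three envelopes. The sketch below uses a small made-up mmd matrix so that it runs in base MATLAB; whether FSCorAna accepts a matrix built this way without further adjustment is an assumption to check against the toolbox.

```matlab
% Hypothetical mmd matrix laid out as documented for out.mmd:
% [step, min MD, 1% envelope, 50% envelope, conflev envelope]
mmd = [1200 2.1 1.5 2.0 2.6;
       1201 2.3 1.5 2.0 2.6;
       1202 2.9 1.6 2.1 2.7];
% Keep the subset size and the three envelopes; drop the observed min MD
mmdEnv = mmd(:,[1 3 4 5]);
disp(size(mmdEnv))
% With FSDA installed, the matrix would then be passed as
% out = FSCorAna(N,'mmdEnv',mmdEnv);
```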

    StoreSim — Store minimum Mahalanobis distance quantiles. Boolean.

    Boolean which specifies whether or not to store, in a field named mmdStore, the simulated envelopes of the minimum Mahalanobis distance monitored along the search.

    Example: 'StoreSim',true

    Data Types: logical

    msg — Whether or not to display messages about envelope creation on the screen. Logical.

    If msg==1 (default), messages are displayed on the screen; otherwise no message is displayed.

    Example: 'msg',false

    Data Types: logical

    resc — Rescale or not the envelopes. Boolean.

    It controls whether or not to rescale the envelopes of min MD when the curve of min MD in the initial part of the search is steadily above or below the 5 and 95 per cent confidence bands. The default value of resc is true.

    Example: 'resc',false

    Data Types: logical

    label — Row labels. Cell | vector of strings.

    Cell or vector of strings of length n containing the labels of the rows. If the input is a table or a timetable, the row labels are automatically taken from the row names.

    Example: 'label',{'UK' ... 'IT'}

    Data Types: cell or characters or vector of strings

    plots — It specifies whether to produce the plot of the monitoring of minMD. Scalar.

    If plots=1, a plot of the monitoring of minMD among the units not belonging to the subset is produced on the screen with 1 per cent, 50 per cent and 99 per cent confidence bands; otherwise (default) all plots are suppressed.

    Example: 'plots',0

    Data Types: double

    addRowNames — Add or not the names of the rows to the plot of min MD. Boolean.

    If this option is true (default), each row is marked in the plot, with its row label or row number, at the step where it is first included in the subset.

    Example: 'addRowNames',false

    Data Types: logical

    Output Arguments

    out — Structure

    Structure which contains the following fields

    Value Description
    outliers

    k x 1 vector containing the list of the units declared as outliers or NaN if the sample is homogeneous

    mmd

    (n-init)-by-5 matrix which contains the monitoring of minimum MD at each step of the forward search.

    1st col = fwd search index (from init to n-1);

    2nd col = minimum MD weighted by row mass;

    3rd col = 1 per cent envelope;

    4th col = 50 per cent envelope;

    5th col = conflev per cent envelope;

    ine

    (n-init)-by-2 matrix which contains the monitoring of inertia at each step of the forward search.

    1st col = fwd search index (from init to n);

    2nd col = inertia;

    Un

    (n-init)-by-11 matrix which contains the unit(s) included in the subset at each step of the fwd search.

    REMARK: in every step the new subset is compared with the old subset. Un contains the unit(s) present in the new subset but not in the old one. For example, Un(1,2) contains the unit included in step init+1 and Un(end,2) contains the unit(s) included in the final step of the search.

    N

    Original contingency table, in array format.

    loc

    1 x v vector containing location of the data.

    md

    n x 1 vector containing the estimates of the robust Mahalanobis distances (in squared units) multiplied by the row masses.

    This vector contains the distances of each observation from the location of the data, relative to the scatter matrix cov.

    thresh

    threshold for minMD with which outliers have been declared

    conflev

    simultaneous confidence level which has been used to declare the outliers.

    simulateUnderH0

    boolean. If it is true the simulated contingency tables have been specified under H0.

    nsimul

    number of simulations which have been used to create the envelopes. This information is taken from size(N.NsimStore,2)

    Y

    array of size I-by-J containing row profiles.

    class

    'FSCorAna'
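The mmd and Un fields described above can be inspected with a few lines of base MATLAB. The sketch below uses small made-up matrices in place of a real FSCorAna output; the NaN padding of Un is an assumption, and the exceedance check is only a plain comparison with the upper envelope, not the rule FSCorAna applies to declare outliers.

```matlab
% Hypothetical mmd matrix with the five documented columns:
% [step, min MD, 1% envelope, 50% envelope, conflev envelope]
mmd = [1200 2.1 1.5 2.0 2.6;
       1201 2.3 1.5 2.0 2.6;
       1202 2.9 1.6 2.1 2.7;
       1203 3.4 1.6 2.1 2.7];
above = mmd(:,2) > mmd(:,5);     % true where min MD exceeds the upper envelope
stepsAbove = mmd(above,1);       % forward search steps with an exceedance

% Hypothetical Un matrix: column 1 is the step, columns 2:11 hold the
% unit(s) that entered, padded with NaN (the padding is an assumption)
Un = NaN(3,11);
Un(:,1) = [1201; 1202; 1203];
Un(1,2) = 7;
Un(2,2:4) = [3 12 9];            % an interchange: three units entered at once
Un(3,2) = 15;
nEntered = sum(~isnan(Un(:,2:end)),2);  % number of units entering at each step
interchangeSteps = Un(nEntered>1,1);    % steps where more than one unit entered
[r,~] = find(Un(:,2:end)==12);          % row of the step where unit 12 entered
stepUnit12 = Un(r,1);
disp(stepsAbove')
disp(interchangeSteps')
```

Steps with more than one entering unit correspond to the "interchange greater than 1" warnings printed during the search.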

    References

    Atkinson, A.C., Riani, M. and Cerioli, A. (2004), "Exploring multivariate data with the forward search", Springer Verlag, New York.
