FSMeda performs a forward search in multivariate analysis for exploratory data analysis purposes.

Run the FS on a simulated dataset by choosing an initial subset formed by the m0=4 observations with the smallest Mahalanobis distance.

    n=100;
    v=3;
    m0=4;
    Y=randn(n,v);
    % Contaminated data
    Ycont=Y;
    Ycont(1:5,:)=Ycont(1:5,:)+3;
    [fre]=unibiv(Y);
    % Create an initial subset with the m0 observations with the lowest
    % Mahalanobis distance
    fre=sortrows(fre,4);
    bs=fre(1:m0,1);
    [out]=FSMeda(Ycont,bs);

Monitoring the evolution of the minimum Mahalanobis distance.

    n=100;
    v=3;
    m0=3;
    Y=randn(n,v);
    % Contaminated data
    Ycont=Y;
    Ycont(1:5,:)=Ycont(1:5,:)+3;
    [fre]=unibiv(Y);
    % Create an initial subset with the 3 observations with the lowest
    % Mahalanobis distance
    fre=sortrows(fre,4);
    bs=fre(1:m0,1);
    [out]=FSMeda(Ycont,bs,'plots',1);

    load('emilia2001')
    Y=emilia2001.data;
    [fre]=unibiv(Y);
    % Create an initial subset with the 30 observations with the lowest
    % Mahalanobis distance
    fre=sortrows(fre,4);
    m0=30;
    bs=fre(1:m0,1);
    [out]=FSMeda(Y,bs,'init',100);
    % Minimum Mahalanobis distance
    % Compare the plot with Figure 1.12 p. 21, ARC (2004)
    mmdplot(out,'ylimy',[6 14])
    % Analysis of the last 16 units to enter the forward search
    % Compare the results with Table 1.3 p. 21
    disp(out.Un(end-15:end,:));

    load('emilia2001')
    Y=emilia2001.data;
    % Replace zeros with min values for variables specified in sel
    sel=[6 10 12 13 19 21];
    for i=sel
        Y(Y(:,i)==0,i)=min(Y(Y(:,i)>0,i));
    end
    % Modify variables y16 y23 y25 y26
    sel=[16 23 25 26];
    % Note: the next assignment overrides the previous one, so only
    % y25 and y26 are actually modified
    sel=[25 26];
    Y1=Y;
    Y1(:,sel)=100-Y1(:,sel);
    la0demo=[0.5,0.25,0,1,0.25,0,0,0.25,0.5];
    la0weal=[0.25,0.5,0.5,1,1,0.5,-1/3,0.25,0.25,-1];
    la0work=[0.25,0,1,0,0,0.25,1,1,1];
    la0C2=[la0demo(1:5) la0work(1:4) la0demo(6:9) la0weal la0work(5:9)];
    Y1tr=normBoxCox(Y1,1:28,la0C2);
    [fre]=unibiv(Y1tr);
    % Create an initial subset with the 30 observations with the lowest
    % Mahalanobis distance
    fre=sortrows(fre,4);
    m0=30;
    bs=fre(1:m0,1);
    [out]=FSMeda(Y1tr,bs,'init',100,'scaled',1);
    % Minimum Mahalanobis distance
    [out]=FSMeda(Y1tr,bs,'init',100);
    mmdplot(out,'ylimy',[5 26])
    standard=struct;
    standard.ylim=[4 17];
    malfwdplot(out,'standard',standard);

`Y`: Input data.
Matrix. n x v data matrix; n observations and v variables. Rows of Y represent observations, and columns represent variables. Missing values (NaN) and infinite values (Inf) are allowed, since observations (rows) with missing or infinite values are automatically excluded from the computations.

**Data Types:** `single | double`
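The row exclusion described for `Y` can be pictured with a few lines of base MATLAB. This is only a sketch of the effect on a small hypothetical matrix; it is not taken from the FSMeda source, and the variable names `keep` and `Yclean` are illustrative assumptions.

```matlab
% Hypothetical 5 x 3 data matrix with one NaN entry and one Inf entry
Y = [1 2 3; 4 NaN 6; 7 8 9; Inf 1 2; 0.5 0.1 0.3];

% A row is kept only if all of its entries are finite
% (isfinite is false for both NaN and Inf)
keep = all(isfinite(Y), 2);
Yclean = Y(keep, :);

fprintf('%d of %d rows retained\n', size(Yclean, 1), size(Y, 1));
```

Pre-filtering in this way before the call makes it easier to map unit indices in the output back to the rows of the original matrix.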

`bsb`: Units forming subset.
Vector. List of units forming the initial subset.

If bsb=0 (default), the procedure starts with v randomly chosen units; otherwise the search starts with m0=length(bsb) units.

**Data Types:** `single | double`
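One way to build an explicit `bsb` without other toolbox functions is to pick the units closest to the centre of the data. The sketch below uses the classical mean and covariance of all units, which is an assumption made for illustration; the examples on this page use `unibiv` instead, and that choice is more robust to outliers.

```matlab
rng(1);                              % reproducible illustration
n = 100; v = 3; m0 = 4;
Y = randn(n, v);

% Squared Mahalanobis distances from the overall mean
Yc = bsxfun(@minus, Y, mean(Y, 1));  % centred data
S  = cov(Y);                         % classical covariance (assumption)
d2 = sum((Yc / S) .* Yc, 2);         % row-wise yc * inv(S) * yc'

% The m0 units with the smallest distances form the initial subset
[~, idx] = sort(d2);
bsb = idx(1:m0);
```

The resulting vector can be passed as the second input argument in the same way as `bs` in the examples above.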

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside single quotes (`' '`). You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

**Example:** `'init',50`, `'plots',0`, `'msg',0`, `'scaled',0`, `'nocheck',1`

`init`: Point at which to start monitoring the required diagnostics.
Scalar. Note that if bsb is supplied, init>=length(bsb). If init is not specified, it is set equal to floor(n*0.6).

**Example:** `'init',50`

**Data Types:** `double`
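The default and the constraint on `init` can be checked before calling the function. The sketch below only restates the rule given above in code; the subset in `bsb` is hypothetical and `nsteps` is an illustrative name for the number of monitored steps from init to n.

```matlab
n = 100;
bsb = [3 17 42 58];            % hypothetical initial subset

init = floor(n * 0.6);         % documented default when init is omitted

% init must not be smaller than the size of the initial subset
if init < length(bsb)
    error('init must be >= length(bsb)');
end

nsteps = n - init + 1;         % steps monitored from init to n
fprintf('init = %d, monitored steps = %d\n', init, nsteps);
```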

`plots`: Whether to produce the plot of the monitoring of minMD.
Scalar. If plots=1, a plot of the monitoring of minMD among the units not belonging to the subset is displayed on screen with 1 per cent, 50 per cent and 99 per cent confidence bands; otherwise (default) all plots are suppressed.

**Example:** `'plots',0`

**Data Types:** `double`
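To make the monitored quantity concrete, the following base MATLAB sketch computes minMD for a single subset and then forms the next subset. It is a simplified illustration under stated assumptions (a hypothetical subset of the first 10 units, classical mean and covariance of the subset); the actual FSMeda implementation may differ in its details.

```matlab
rng(2);
n = 50; v = 2;
Y = randn(n, v);
bsb = (1:10)';                       % hypothetical current subset
m = numel(bsb);

% Location and covariance estimated from the subset only
mu = mean(Y(bsb, :), 1);
S  = cov(Y(bsb, :));

% Mahalanobis distances of all n units from the subset fit
Yc = bsxfun(@minus, Y, mu);
d  = sqrt(sum((Yc / S) .* Yc, 2));

% Minimum distance among the units not in the subset
outside = setdiff((1:n)', bsb);
minMD = min(d(outside));

% (m+1)th ordered distance and the subset for the next step
[dsort, ord] = sort(d);
mp1MD  = dsort(m + 1);
newbsb = ord(1:m + 1);
```

Repeating this step from m0 up to n and storing `minMD` at each subset size gives the curve that the plot displays.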

`msg`: Whether to display on screen messages about great interchange.
Scalar. If msg==1 (default), messages are displayed on screen; otherwise no message is displayed.

**Example:** `'msg',0`

**Data Types:** `double`

`scaled`: Whether to monitor scaled Mahalanobis distances.
Scalar. If scaled=1, the Mahalanobis distances monitored during the search are scaled using the ratio of determinants.

If scaled=2, the Mahalanobis distances monitored during the search are scaled using the asymptotic consistency factor.

The default value is 0, that is, Mahalanobis distances are not scaled.

**Example:** `'scaled',0`

**Data Types:** `double`
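The text above does not give the formula behind the determinant-ratio scaling. The sketch below assumes the form d_i(m) * (|S(m)|/|S(n)|)^(1/(2v)), where S(m) is the covariance estimated from the subset and S(n) the one estimated from all units; this is an assumption for illustration and should be checked against ARC (2004) and the FSMeda source before relying on it.

```matlab
rng(3);
n = 60; v = 3; m = 20;
Y = randn(n, v);
bsb = (1:m)';                        % hypothetical subset of size m

mu = mean(Y(bsb, :), 1);
Sm = cov(Y(bsb, :));                 % covariance from the subset
Sn = cov(Y);                         % covariance from all n units

% Unscaled Mahalanobis distances based on the subset fit
Yc = bsxfun(@minus, Y, mu);
d  = sqrt(sum((Yc / Sm) .* Yc, 2));

% ASSUMED scaling factor based on the ratio of determinants
scalefac = (det(Sm) / det(Sn))^(1 / (2 * v));
dsc = d * scalefac;
```

Under this assumed form the factor equals 1 at the final step, where the subset contains all n units.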

`nocheck`: Whether to perform checks on matrix Y.
Scalar. If nocheck is equal to 1, no check is performed on matrix Y. The default is nocheck=0.

**Example:** `'nocheck',1`

**Data Types:** `double`

`out`: Structure which contains the following fields.

Value | Description |
---|---|
`MAL` | n x (n-init+1) matrix containing the monitoring of Mahalanobis distances. 1st row = distance for first unit; ...; nth row = distance for nth unit. |
`BB` | n x (n-init+1) matrix containing the information about the units belonging to the subset at each step of the forward search. 1st col = indexes of the units forming subset in the initial step; ...; last column = units forming subset in the final step (all units). |
`mmd` | (n-init) x 3 matrix which contains the monitoring of minimum MD or (m+1)th ordered MD at each step of the forward search. 1st col = fwd search index (from init to n-1); 2nd col = minimum MD; 3rd col = (m+1)th ordered MD. |
`msr` | (n-init+1) x 3 matrix which contains the monitoring of maximum MD or mth ordered MD. 1st col = fwd search index (from init to n); 2nd col = maximum MD; 3rd col = mth ordered MD. |
`gap` | (n-init+1) x 3 matrix which contains the monitoring of the gap (difference between min MD outside the subset and max MD inside). 1st col = fwd search index (from init to n); 2nd col = min MD - max MD; 3rd col = (m+1)th ordered MD - mth ordered MD. |
`Loc` | (n-init+1) x (v+1) matrix containing the monitoring of the estimated means for each variable in each step of the forward search. |
`S2cov` | (n-init+1) x (v*(v+1)/2+1) matrix containing the monitoring of the elements of the covariance matrix in each step of the forward search. 1st col = fwd search index (from init to n); 2nd col = monitoring of S(1,1); 3rd col = monitoring of S(1,2); ...; last col = monitoring of S(v,v). |
`detS` | (n-init+1) x 2 matrix containing the monitoring of the determinant of the covariance matrix in each step of the forward search. |
`Un` | (n-init) x 11 matrix which contains the unit(s) included in the subset at each step of the fwd search. REMARK: in every step the new subset is compared with the old subset. Un contains the unit(s) present in the new subset but not in the old one. Un(1,2), for example, contains the unit included in step init+1; Un(end,2) contains the unit included in the final step of the search. |
`Y` | Original data input matrix. |
`class` | 'FSMeda' |
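The field dimensions listed in the table can be checked on a small simulated example. The sketch below requires the FSDA toolbox on the MATLAB path; the initial subset `(1:v+1)'` is a hypothetical choice made only to keep the example short, and the plotting commands are plain MATLAB rather than FSDA plotting functions.

```matlab
% Requires the FSDA toolbox (FSMeda) on the MATLAB path
rng(4);
n = 100; v = 3; init = 60;
Y = randn(n, v);
bsb = (1:v+1)';                      % hypothetical initial subset

out = FSMeda(Y, bsb, 'init', init, 'plots', 0, 'msg', 0);

% Dimensions to compare with the table above
disp(size(out.MAL));                 % documented as n x (n-init+1)
disp(size(out.mmd, 1));              % documented as n-init rows

% Manual plot of the minimum Mahalanobis distance (2nd col of mmd)
plot(out.mmd(:, 1), out.mmd(:, 2));
xlabel('Subset size m');
ylabel('Minimum Mahalanobis distance');
```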

Atkinson, A.C., Riani, M. and Cerioli, A. (2004), "Exploring multivariate data with the forward search", Springer Verlag, New York.