RobRegrSize

RobRegrSize provides the threshold that robust estimators need in order to obtain an empirical size close to the 1 per cent nominal size


Syntax

thresh=RobRegrSize(n,p,robest,rhofunc,bdp,eff,sizesim,Tallis)

Description

thresh = RobRegrSize(n, p, robest, rhofunc, bdp, eff, sizesim, Tallis) returns the empirical threshold that gives the chosen robust estimator an empirical size close to the 1 per cent nominal size.

Examples


RobRegrSize with all default options.

Find the threshold for MM estimator, Tukey biweight rho function with efficiency 0.87 (simultaneous size).

n=232;
p=10;
bdp='';
robest='MM';
eff=0.87;
rhofunc='TB';
sizesim=1;
thresh=RobRegrSize(n,p,robest,rhofunc,bdp,eff,sizesim);

Related Examples


Additional Example 1.

Find the threshold for MM estimator, take an average threshold for all rho functions, and use efficiency 0.85 (simultaneous size).

n=93;
p=5;
bdp='';
eff=0.85;
robest='MM';
rhofunc='ST';
sizesim=1;
thresh=RobRegrSize(n,p,robest,rhofunc,bdp,eff,sizesim);

Additional Example 2.

Find the threshold for LTS estimator, use Tallis correction to infer a threshold for bdp equal to 0.27 (simultaneous size).

n=72;
p=10;
bdp=0.27;
robest='LTS';
eff='';
rhofunc='';
sizesim=1;
Tallis=1;
thresh=RobRegrSize(n,p,robest,rhofunc,bdp,eff,sizesim,Tallis);

Additional Example 3.

Find the threshold for S estimator and hyperbolic rho function, use Tallis correction to infer a threshold for bdp equal to 0.3 (simultaneous size).

n=100;
p=5;
bdp=0.3;
robest='S';
eff='';
rhofunc='HY';
sizesim=1;
Tallis=1;
thresh=RobRegrSize(n,p,robest,rhofunc,bdp,eff,sizesim,Tallis);

Input Arguments


`n` — sample size. Scalar integer.

Number of units of the regression dataset.

REMARK - simulations have been done for n=50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, 500.

For other values of n, the threshold is found by interpolation between the two closest simulated values of n, one smaller and one greater than the value supplied.
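The interpolation rule for n can be sketched as follows. This is a minimal illustration and not the toolbox code: the values in `threshGrid` are made-up placeholders, and linear interpolation between the two bracketing grid points is an assumption about how the two closest values are combined.

```matlab
% Sample sizes for which simulations exist (from the remark above)
nGrid = [50 60 70 80 90 100 150 200 250 300 400 500];
% Hypothetical thresholds, one per simulated n (placeholders, not real values)
threshGrid = linspace(3.2, 2.8, numel(nGrid));

n = 232;                               % requested sample size, not on the grid
iLow = find(nGrid <= n, 1, 'last');    % closest simulated n from below
iUpp = find(nGrid >= n, 1, 'first');   % closest simulated n from above

if iLow == iUpp
    % n is one of the simulated sample sizes: no interpolation needed
    threshN = threshGrid(iLow);
else
    % Assumed rule: linear interpolation between the two bracketing points
    w = (n - nGrid(iLow)) / (nGrid(iUpp) - nGrid(iLow));
    threshN = (1 - w) * threshGrid(iLow) + w * threshGrid(iUpp);
end
```

For n=232 the two bracketing sample sizes are 200 and 250.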

`p` — number of variables. Scalar integer.

Number of explanatory variables.

REMARK - simulations have been done for p=2, 3, ..., 10. If the user supplies a value of p greater than 10 the correction factors are extrapolated by fitting a simple quadratic model in p.
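The extrapolation for p greater than 10 can be illustrated with a quadratic fit. The sketch below is only an assumption about the mechanics: the correction factors in `factGrid` are invented for the example, and `polyfit`/`polyval` stand in for whatever fitting routine the function actually uses.

```matlab
% Values of p for which simulations exist (from the remark above)
pGrid = 2:10;
% Hypothetical correction factors (placeholders, not real values)
factGrid = 1 + 0.05*pGrid + 0.002*pGrid.^2;

% Fit a quadratic model in p to the simulated grid
coef = polyfit(pGrid, factGrid, 2);

% Extrapolate the fitted model to a value of p outside the grid
p = 12;
factP = polyval(coef, p);
```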

`robest` — robust estimator. String.

String that identifies the robust estimator which is used. Possible values are:

'S' S estimators;

'MM' MM estimators;

'LTS' Least trimmed squares estimator;

'LTSr' Least trimmed squares estimator reweighted.

If robest is missing, MM estimator is used.

Data Types: char

`rhofunc` — Weight function. String.

String which identifies the weight function which has been used for S or MM.

Possible values are:

'TB', for Tukey biweight rho function;

'HA', for Hampel rho function;

'HY', for hyperbolic rho function;

'OP', for optimal rho function;

'ST', for soft trimming estimator (in this case, an average of the thresholds for TB, HY, HA and OP is used).

REMARK - this value is ignored if robest is LTS or LTSr.

If rhofunc is missing and robest is 'S' or 'MM', the default value of rhofunc is 'ST'.

Data Types: char

`bdp` — breakdown point. Scalar.

Scalar between 0 and 0.5. If robest is 'S', 'LTS' or 'LTSr' and bdp is missing, a value of 0.5 is used as default.

REMARK - simulations have been done for bdp=0.25 and 0.50.

If the user supplies a value of bdp smaller than 0.25, the threshold found for bdp=0.25 is used. In this case, a warning alerts the user that the test is likely to be conservative. If, on the other hand, bdp is a value in the interval (0.25, 0.5), the average of the thresholds for bdp=0.25 and bdp=0.5 is used (for a more refined correction, please see input option Tallis).
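The bdp rule described above can be written out as a short sketch. The two thresholds `thresh025` and `thresh050` are hypothetical placeholders, and the warning text is illustrative rather than the message the function prints.

```matlab
bdp = 0.27;          % breakdown point supplied by the user
% Hypothetical thresholds for the two simulated breakdown points
thresh025 = 2.9;
thresh050 = 3.1;

if bdp < 0.25
    % Below the simulated range: fall back on bdp=0.25 and warn
    warning('bdp is below 0.25: the threshold for bdp=0.25 is used and the test is likely to be conservative.');
    threshB = thresh025;
elseif bdp == 0.25
    threshB = thresh025;
elseif bdp < 0.5
    % Strictly inside (0.25, 0.5): simple average of the two thresholds
    threshB = (thresh025 + thresh050) / 2;
else
    % bdp equal to 0.5 (the documented upper limit)
    threshB = thresh050;
end
```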

Data Types: single | double

`eff` — nominal efficiency. Scalar.

Scalar between 0.5 and 1-epsilon (if robest is 'MM').

REMARK - simulations have been done for eff = 0.85, 0.90 and 0.95. If the user supplies a value of eff smaller than 0.85 (greater than 0.95), the threshold found for eff=0.85 (eff=0.95) is used. In all the other cases an average is taken using the two closest values of eff.
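A sketch of the eff rule follows, under two assumptions: the thresholds in `threshEff` are invented placeholders, and "an average is taken using the two closest values" is read as a simple unweighted mean.

```matlab
% Efficiencies for which simulations exist (from the remark above)
effGrid = [0.85 0.90 0.95];
% Hypothetical thresholds, one per simulated efficiency
threshEff = [2.95 3.00 3.05];

eff = 0.87;          % efficiency supplied by the user

% Values outside [0.85, 0.95] are replaced by the nearest simulated value
effUsed = min(max(eff, effGrid(1)), effGrid(end));

onGrid = abs(effGrid - effUsed) < 1e-10;
if any(onGrid)
    % eff coincides with a simulated value: use its threshold directly
    threshE = threshEff(onGrid);
else
    % Otherwise average the thresholds of the two closest simulated values
    [~, order] = sort(abs(effGrid - effUsed));
    threshE = mean(threshEff(order(1:2)));
end
```

With eff=0.87 the two closest simulated efficiencies are 0.85 and 0.90.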

Data Types: single | double

`sizesim` — simultaneous or individual size. Scalar.

Scalar which specifies whether simultaneous (sizesim=1) or individual size is used. If sizesim is missing or equal to 1, a simultaneous size is used.

Data Types: single | double

`Tallis` — need to interpolate. Scalar.

Scalar which has an effect only if bdp is not equal to 0.25 or 0.5. If Tallis=1, the program computes the ratio between the asymptotic consistency factor using the breakdown point supplied by the user and the closest consistency factor associated with the breakdown point for which simulations exist. Therefore, if, for example, the supplied breakdown point is smaller than 0.25, the program multiplies the empirical threshold using bdp=0.25 by a number smaller than 1.

Similarly, if bdp>0.375, the program multiplies the empirical threshold using bdp=0.5 by a number smaller than 1. If the supplied bdp is very close to 0.25 or 0.5, we suggest using this option; otherwise it is better to take a simple average of the thresholds associated with the two closest breakdown points for which simulations exist. The default value of Tallis is 0.

Data Types: single | double

Output Arguments


`thresh` — Empirical threshold. Scalar.

Empirical threshold which can be used in order to have a test with an empirical size close to the nominal size (1 per cent, individual or simultaneous).

More About


Additional Details

We assume that the two input MAT files Ind_ThreshSm.mat and Sim_ThreshSm.mat are in the same folder or in the MATLAB path.

Ind_ThreshSm.mat contains a 3D array with the thresholds in case an individual size is requested.

Sim_ThreshSm.mat contains a 3D array with the thresholds in case a simultaneous size is requested.

The two 3D arrays have dimension 12-by-9-by-24.

The 12 rows refer to the 12 sample sizes which have been considered, namely n=50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, 500.

The 9 columns refer to the number of variables which have been considered, namely p=2, 3, ..., 10.

The third dimension is associated with the 24 estimators that have been used. The order of the estimators is:

' 1' 'LTSbdp050' ;

' 2' 'LTSbdp025' ;

' 3' 'LTSrbdp050';

' 4' 'LTSrbdp025';

' 5' 'Sbdp025TB' ;

' 6' 'Sbdp050TB' ;

' 7' 'MMeff085TB';

' 8' 'MMeff090TB';

' 9' 'MMeff095TB';

'10' 'Sbdp025OP' ;

'11' 'Sbdp050OP' ;

'12' 'MMeff085OP';

'13' 'MMeff090OP';

'14' 'MMeff095OP';

'15' 'Sbdp025HY' ;

'16' 'Sbdp050HY' ;

'17' 'MMeff085HY';

'18' 'MMeff090HY';

'19' 'MMeff095HY';

'20' 'Sbdp025HA' ;

'21' 'Sbdp050HA' ;

'22' 'MMeff085HA';

'23' 'MMeff090HA';

'24' 'MMeff095HA'.
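The layout described above suggests a simple lookup by position. The sketch below only illustrates the indexing: `Thresh` is a placeholder array of zeros with the documented size, because the variable names stored inside the MAT files are not given here, and `estNames` is a helper list built from the order shown above.

```matlab
% Estimator labels in the documented order of the third dimension
estNames = {'LTSbdp050','LTSbdp025','LTSrbdp050','LTSrbdp025', ...
    'Sbdp025TB','Sbdp050TB','MMeff085TB','MMeff090TB','MMeff095TB', ...
    'Sbdp025OP','Sbdp050OP','MMeff085OP','MMeff090OP','MMeff095OP', ...
    'Sbdp025HY','Sbdp050HY','MMeff085HY','MMeff090HY','MMeff095HY', ...
    'Sbdp025HA','Sbdp050HA','MMeff085HA','MMeff090HA','MMeff095HA'};
nGrid = [50 60 70 80 90 100 150 200 250 300 400 500];   % rows
pGrid = 2:10;                                            % columns

% Placeholder array (hypothetical); the real thresholds are in the MAT files
Thresh = zeros(numel(nGrid), numel(pGrid), numel(estNames));

% Locate the cell for n=100, p=5, MM estimator, eff=0.85, Tukey biweight
i = find(nGrid == 100);
j = find(pGrid == 5);
k = find(strcmp(estNames, 'MMeff085TB'));
t = Thresh(i, j, k);
```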

References

Salini S., Cerioli A., Laurini F. and Riani M. (2014), Reliable Robust Regression Diagnostics, "International Statistical Review", Vol. 84, pp. 99-127.
