Mlocsca

Mlocsca computes M estimator of location and scale in univariate samples

expand all in page

Syntax

outIRWLS=Mlocsca(y,psifunc)example
outIRWLS=Mlocsca(y,psifunc,initialmu)example
outIRWLS=Mlocsca(y,psifunc,initialmu,initialscale)example
outIRWLS=Mlocsca(y,psifunc,initialmu,initialscale,tol)example
outIRWLS=Mlocsca(y,psifunc,initialmu,initialscale,tol,maxiter)example

Description

This routines compute simultaneous M estimates of location and scale in univariate samples.

example

outIRWLS =Mlocsca(y, psifunc) Example of use of Mlocsca with two input arguments.

example

outIRWLS =Mlocsca(y, psifunc, initialmu) Example of use of Mlocsca with three input arguments.

example

outIRWLS =Mlocsca(y, psifunc, initialmu, initialscale) Example of use of Mlocsca with four input arguments.

example

outIRWLS =Mlocsca(y, psifunc, initialmu, initialscale, tol) Example of use of Mlocsca with five input arguments.

example

outIRWLS =Mlocsca(y, psifunc, initialmu, initialscale, tol, maxiter) Example of use of Mlocsca with six input arguments.

Examples

expand all

Example of use of Mlocsca with two input arguments.

Example of M estimate of location and scale using TB function.

psifuncTB=struct;
psifuncTB.class='TB';
bdp=0.5;
c=TBbdp(bdp,1);
% kc = E(rho) = sup(rho)*bdp
kc=c^2/6*bdp;
psifuncTB.c1=c;
psifuncTB.kc1=kc;
% trueloc and truescale are the true values of the two parameters
trueloc=5;
truescale=2;
% fraction of units to contaminate
fraccontamination=0.3;
% shift in contamination
shift=100;
% n is sample size
n=2000;
rng(1000)
u=truescale*randn(n,1)+trueloc;
n1=round(n*fraccontamination);
% shift contamination
u(1:n1)=u(1:n1)+shift;
% Specify initial estimate of location.
initialmu=10;
out=Mlocsca(u,psifuncTB,initialmu);
disp('Robust estimate of location')
disp(out.location)
disp('Robust estimate of scale')
disp(out.scale)

Example of use of Mlocsca with three input arguments.

M estimate of location ans scale using Hampel rho function with a value of c associated to a breakdown point of 0.5

psifunc=struct;
psifunc.class='HA';
abc=[1.5 3.5 8];
bdp=0.5;
c=HAbdp(bdp,1,abc);
psifunc.c1=[c abc];
% kc = E(rho) = sup(rho)*bdp
kc=HArho(c*abc(3),[c, abc])*bdp;
psifunc.c1=[c, abc];
psifunc.kc1=kc;
% trueloc and truescale are the true values of the two parameters
trueloc=5;
truescale=2;
% n is sample size
n=20000;
u=truescale*randn(n,1)+trueloc;
% Just 2 outliers with point mass contamination
u(1:2)=10000000;
out=Mlocsca(u,psifunc);
disp('Robust estimate of location')
disp(out.location)
disp('Robust estimate of scale')
disp(out.scale)

Example of use of Mlocsca with four input arguments.

M estimate of location and scale using optimal rho function with a value of c associated to a breakdown point of 0.3

psifuncOPT=struct;
psifuncOPT.class='OPT';
bdp=0.3;
cOPT=OPTbdp(bdp,1);
rhoOPTsup=OPTrho(200000,1);
% rhoOPTsup=1;
psifuncOPT.c1=cOPT;
% kc = E(rho) = sup(rho)*bdp
psifuncOPT.kc1=rhoOPTsup*bdp;
% trueloc and truescale are the true values of the two parameters
trueloc=5;
truescale=2;
% n is sample size
n=200;
u=truescale*randn(n,1)+trueloc;
% Just 10 outliers with point mass contamination
u(1:10)=10000000;
% 10 and 8 are our initial estimates of location and scale
out=Mlocsca(u,psifuncOPT,10,8);
disp('Robust estimate of location')
disp(out.location)
disp('Robust estimate of scale')
disp(out.scale)

Example of use of Mlocsca with five input arguments.

M estimate of location ans scale using power divergence rho function with a value of c associated to a breakdown point of 0.2

psifuncPD=struct;
psifuncPD.class='PD';
bdp=0.2;
c1=PDbdp(bdp);
psifuncPD.c1=c1;
psifuncPD.kc1=bdp;
% trueloc and truescale are the true values of the two parameters
trueloc=5;
truescale=2;
% n is sample size
n=200;
u=truescale*randn(n,1)+trueloc;
% Just 10 outliers with point mass contamination
u(1:20)=10000000;
% 10 and 8 are our initial estimates of location and scale
% Set the tolerance
tol=1e-20;
outIRWLS=Mlocsca(u,psifuncPD,10,8,tol);
disp('Robust estimate of location')
disp(outIRWLS.location)
disp('Robust estimate of scale')
disp(outIRWLS.scale)

Robust estimate of location
          5.07

Robust estimate of scale
          3.16

Example of use of Mlocsca with six input arguments.

M estimate of location and scale using hyperbolic rho function with a value of c associated to a breakdown point of 0.11

psifuncHYP=struct;
psifuncHYP.class='HYP';
bdp=0.11;
[cHYP,A,B,d]=HYPbdp(bdp,1);
k=6;
rhoHYPsup=HYPrho(200000,[cHYP,k,A,B,d]);
% rhoHAsup=1;
psifuncHYP.c1=[cHYP,k,A,B,d];
% kc = E(rho) = sup(rho)*bdp
psifuncHYP.kc1=rhoHYPsup*bdp;
% trueloc and truescale are the true values of the two parameters
trueloc=5;
truescale=2;
% n is sample size
n=200;
u=truescale*randn(n,1)+trueloc;
% Just 10 outliers with point mass contamination
u(1:20)=10000000;
% 10 and 8 are our initial estimates of location and scale
% Set the tolerance
tol=1e-20;
% Set maximum number of iterations
maxiter=500;
out=Mlocsca(u,psifuncHYP,10,8,tol,maxiter);
disp('Robust estimate of location')
disp(out.location)
disp('Robust estimate of scale')
disp(out.scale)

Input Arguments

expand all

`y` — A vector with n elements which contains the response variable. It can be both a row or column vector.

Data Types: single| double

`psifunc` — a structure specifying the class of rho function to use, the consistency factor, and the value associated with the Expectation of rho in correspondence of the consistency factor. struct.

Psifunc is a struct which contains the following fields

Value Description

Value	Description
`class`	string identyfing the rho (psi) function to use. Admissible values for class are 'bisquare', 'optimal' 'hyperbolic' and 'hampel'
`c1`	consistency factor associated to required breakdown point More precisely, psifunc.c1(1) contains consistency factor associated to required breakdown point or nominal efficiency psifunc.c1(2:end) may contain other parameters associated with the rho (psi) function. For example, if psifunc.class is 'hampel', c1(2:4) must contain parameters (a, b and c) of Hampel rho function. If length(psifunc.c1)=1 psifunc.class is HA, then default parameters for Hampel rho function are used. Remark: if class is 'hyperbolic' it is also possible to specify parameters k (sup CVC), A, B and d
`kc1`	Expectation of rho associated with c1

class

string identyfing the rho (psi) function to use.

Admissible values for class are 'bisquare', 'optimal' 'hyperbolic' and 'hampel'

c1

consistency factor associated to required breakdown point More precisely, psifunc.c1(1) contains consistency factor associated to required breakdown point or nominal efficiency psifunc.c1(2:end) may contain other parameters associated with the rho (psi) function.

For example, if psifunc.class is 'hampel', c1(2:4) must contain parameters (a, b and c) of Hampel rho function.

If length(psifunc.c1)=1 psifunc.class is HA, then default parameters for Hampel rho function are used.

Remark: if class is 'hyperbolic' it is also possible to specify parameters k (sup CVC), A, B and d

kc1

Expectation of rho associated with c1

Data Types: struct

Optional Arguments

`initialmu` — starting value of the location estimate. Scalar.

The initial estimate of location to use in the first iteration.

If not defined, initialmu is set equal to median(y).

Example: 0.34

Data Types: double

`initialscale` — starting value of the dispersion estimate. Scalar.

The initial estimate of the scale to use in the first iteration.

If not defined, initialscale is set equal to 1.4826*mad(y,1).

Example: 1.32

Data Types: double

`tol` — scalar. The tolerance for controlling convergence.

If not defined, tol is fixed to 1e-7.

Example: 1e-10

Data Types: double

`maxiter` — scalar. Maximum number of iterations to find the location estimate.

If not defined, maxiter is fixed to 200.

Example: 100

Data Types: double

Output Arguments

expand all

`outIRWLS` — description Structure

A structure containing the following fields:

Value	Description
`location`	Location estimate. Estimate of location after refsteps refining steps
`scale`	scale estimate. Estimate of scale after refsteps refining step
`weights`	n-by-1 vector. Weights assigned to each observation

More About

expand all

Additional Details

In the IRWLS (iterative reweighted least square) procedure the value of location and the value of the scale are updated in each step. In order to find simultaneous estimates of location and dispersion we need to solve the system of two equations $\sum_{i=1}^n \psi_\text{loc}\left( \frac{y_i - \hat{\mu}}{\hat{\sigma}}\right) = 0$ $\sum_{i=1}^n \rho_\text{scale}\left( \frac{y_i - \hat{\mu}}{\hat{\sigma}}\right) = K.$

In the two equations above we distinguish between $\rho_\text{scale}$ used for scale estimation and $\psi_\text{loc}$ and its derivatives used for location. The two need not be different and are, indeed, often the same. In this routine we assume that they are the same and are specified in input parameter psifunc.

$K$ corresponds to input parameter psifunc.kc1.

Given starting values $\hat{\mu}_0$ and $\hat{\sigma}_0$ the pair of reweighting equations moves forward from stage $k$ using the calculations

[1] Find the location weights

$w_{i,k} = w\{(y_i - \hat{\mu}_k)/\hat{\sigma}_k\}$ , Note that in order to find the weights we need psifunc.c1, the tuning constant associated to a nominal value of breakdown point or efficiency.

[2] Calculate the new location estimate as

$\hat{\mu}_{k+1} = \sum_{i=1}^n w_{i,k}y_i / \sum_{i=1}^n w_{i,k}.$ The new (squared) scale estimate is

[3]

$\hat{\sigma}^2_{k+1} = \hat{\sigma}^2_k\{{1}/(n K)\}\sum_{i=1}^n \rho_\text{scale}\{(y_i - \hat{\mu}_{k+1})/\hat{\sigma}_k\}.$

Note that in order to compute $\rho_\text{scale}$ we need psifunc.c1, the tuning constant associated to a nominal value of breakdown point or efficiency. $K$ corresponds to input parameter psifunc.kc1.

[4] Return to Step 1 until the change in the estimates defined as

$|\mu_{k+1}-\mu_{k}|/|\mu_{k}| + |\sigma_{k+1}-\sigma_{k}|/\sigma_{k}$

is less than a prespecified tolerance (optional input argument tol) or $k$ is equal to optional input parameter maxiter (maximum number of iterations).

This alternating algorithm converges to a point with zero derivatives, which may be a minimum, a maximum or a saddle point.

References

Maronna, R.A., Martin D. and Yohai V.J. (2006), "Robust Statistics, Theory and Methods", Wiley, New York.

Documentation

Mlocsca

Syntax

Description

Examples

Example of use of Mlocsca with two input arguments.

Example of use of Mlocsca with three input arguments.

Example of use of Mlocsca with four input arguments.

Example of use of Mlocsca with five input arguments.

Example of use of Mlocsca with six input arguments.

Input Arguments

`y` — A vector with n elements which contains the response variable. It can be both a row or column vector.

`psifunc` — a structure specifying the class of rho function to use, the consistency factor, and the value associated with the Expectation of rho in correspondence of the consistency factor. struct.

Optional Arguments

`initialmu` — starting value of the location estimate. Scalar.

`initialscale` — starting value of the dispersion estimate. Scalar.

`tol` — scalar. The tolerance for controlling convergence.

`maxiter` — scalar. Maximum number of iterations to find the location estimate.

Output Arguments

`outIRWLS` — description Structure

More About

Additional Details

References

See Also

Documentation

Mlocsca

Syntax

Description

Examples

Example of use of Mlocsca with two input arguments.

Example of use of Mlocsca with three input arguments.

Example of use of Mlocsca with four input arguments.

Example of use of Mlocsca with five input arguments.

Example of use of Mlocsca with six input arguments.

Input Arguments

y — A vector with n elements which contains the response variable. It can be both a row or column vector.

psifunc — a structure specifying the class of rho function to use, the consistency factor, and the value associated with the Expectation of rho in correspondence of the consistency factor. struct.

Optional Arguments

initialmu — starting value of the location estimate. Scalar.

initialscale — starting value of the dispersion estimate. Scalar.

tol — scalar. The tolerance for controlling convergence.

maxiter — scalar. Maximum number of iterations to find the location estimate.

Output Arguments

outIRWLS — description Structure

More About

Additional Details

References

See Also

`y` — A vector with n elements which contains the response variable. It can be both a row or column vector.

`psifunc` — a structure specifying the class of rho function to use, the consistency factor, and the value associated with the Expectation of rho in correspondence of the consistency factor. struct.

`initialmu` — starting value of the location estimate. Scalar.

`initialscale` — starting value of the dispersion estimate. Scalar.

`tol` — scalar. The tolerance for controlling convergence.

`maxiter` — scalar. Maximum number of iterations to find the location estimate.

`outIRWLS` — description Structure