# regressHart

regressHart fits a multiple linear regression model with ART heteroskedasticity.

## Syntax

• out=regressHart(y,X,Z)
• out=regressHart(y,X,Z,Name,Value)

## Description

out = regressHart(y, X, Z) fits the model with all default options.

out = regressHart(y, X, Z, Name, Value) fits the model with additional options specified by one or more Name,Value pair arguments.

## Examples


### regressHart with all default options.

The data in Appendix Table F6.1 were used in a study of efficiency in production of airline services in Greene (2007a). See p. 557 of Greene (7th edition).

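% Common part to all examples: load TableF61_Greene dataset.
load('TableF61_Greene');
Y=TableF61_Greene{:,:};
Q=log(Y(:,4));
Pfuel=log(Y(:,5));
% Load factor: assumed to be column 6 of the table (its definition is missing in this listing).
Loadfactor=Y(:,6);
n=size(Y,1);
X=[Q Q.^2 Pfuel];
y=log(Y(:,3));
% OLS fit for comparison (regstats requires the Statistics and Machine Learning Toolbox).
whichstats={'beta', 'r','tstat'};
OLS=regstats(y,X,'linear',whichstats);
disp('Ordinary Least Squares Estimates')
LSest=[OLS.tstat.beta OLS.tstat.se OLS.tstat.t OLS.tstat.pval];
disp(LSest)
% Fit the ART heteroskedastic model with Loadfactor in the scedastic equation.
out=regressHart(y,X,Loadfactor);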
Ordinary Least Squares Estimates
9.1382    0.2451   37.2887    0.0000
0.9261    0.0323   28.6681    0.0000
0.0291    0.0123    2.3688    0.0201
0.4101    0.0188   21.8037    0.0000



### regressHart with optional arguments.

Estimate a multiplicative heteroscedastic model and print the estimates of the regression and scedastic parameters together with the LM, LR and Wald tests.

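load('TableF61_Greene');
Y=TableF61_Greene{:,:};
Q=log(Y(:,4));
Pfuel=log(Y(:,5));
% Load factor: assumed to be column 6 of the table (its definition is missing in this listing).
Loadfactor=Y(:,6);
n=size(Y,1);
X=[Q Q.^2 Pfuel];
y=log(Y(:,3));
% msgiter=1 prints the estimates; test=1 requests the LM, LR and Wald tests.
out=regressHart(y,X,Loadfactor,'msgiter',1,'test',1);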

## Input Arguments

### y — Response variable. Vector.

A vector with n elements that contains the response variable.

It can be either a row or column vector.

Data Types: single| double

### X — Predictor variables in the regression equation. Matrix.

Data matrix of explanatory variables (also called 'regressors') of dimension (n x p-1). Rows of X represent observations, and columns represent variables.

By default, there is a constant term in the model, unless you explicitly remove it using option intercept, so do not include a column of 1s in X.

Data Types: single| double

### Z — Predictor variables in the skedastic equation. Matrix.

n x r matrix or vector of length r.

If Z is an n x r matrix, it contains the r variables which form the scedastic function as follows:

$\omega_i = 1 + \exp(\gamma_0 + \gamma_1 Z(i,1) + \cdots + \gamma_{r} Z(i,r))$.

If Z is a vector of length r, it contains the indexes of the columns of matrix X which form the scedastic function as follows:

$\omega_i = 1 + \exp(\gamma_0 + \gamma_1 X(i,Z(1)) + \cdots + \gamma_{r} X(i,Z(r)))$.

Therefore, if for example the explanatory variables responsible for heteroscedasticity are columns 3 and 5 of matrix X, it is possible to use either the syntax regressHart(y,X,X(:,[3 5])) or the syntax regressHart(y,X,[3 5]).
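The equivalence of the two ways of passing Z can be checked on the scedastic function itself. The sketch below does not call regressHart: the data and the coefficient vector `gam` are made up for illustration, and only the formula for $\omega_i$ given above is assumed.

```matlab
% Illustrative data: n = 6 observations, 5 explanatory variables (made up).
n = 6;
X = reshape(1:n*5, n, 5) / 10;
gam = [0.2; -0.1; 0.05];          % hypothetical [gamma_0; gamma_1; gamma_2]

% Case 1: Z is an n-by-r matrix of scedastic variables.
Zmat = X(:, [3 5]);
omegaFromMatrix = 1 + exp([ones(n,1) Zmat] * gam);

% Case 2: Z is a vector with the indexes of the columns of X.
Zidx = [3 5];
omegaFromIndex = 1 + exp([ones(n,1) X(:, Zidx)] * gam);

% Both specifications describe the same scedastic function.
sameOmega = isequal(omegaFromMatrix, omegaFromIndex);
disp(sameOmega)
```

Passing the index vector avoids duplicating columns of X, while passing a matrix is needed when the scedastic variables are not among the regressors.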

Remark: Missing values (NaN's) and infinite values (Inf's) are allowed, since observations (rows) with missing or infinite values will automatically be excluded from the computations.
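As a rough illustration of what row-wise exclusion means, the following sketch drops every observation that has a NaN or Inf in y, X or Z. It is an assumption about the general idea, not the routine's internal code, and all data and variable names are made up.

```matlab
% Made-up data with one NaN in y and one Inf in X.
y = [1.2; NaN; 0.7; 2.1; 1.5];
X = [0.5 1.0; 0.3 2.0; Inf 1.5; 0.9 0.4; 1.1 0.8];
Z = X(:, 2);

% Keep only the rows in which every entry of y, X and Z is finite.
ok = all(isfinite([y X Z]), 2);
yClean = y(ok);
XClean = X(ok, :);
ZClean = Z(ok, :);

fprintf('%d of %d observations retained\n', sum(ok), numel(y));
```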

Data Types: single| double

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as  Name1,Value1,...,NameN,ValueN.

Example:  'intercept',false , 'initialbeta',[3.6 8.1] , 'initialgamma',[0.6 2.8] , 'maxiter',8 , 'tol',0.0001 , 'msgiter',0 , 'nocheck',true 

### intercept — Indicator for constant term. true (default) | false.

Indicator for the constant term (intercept) in the fit, specified as the comma-separated pair consisting of 'intercept' and either true to include or false to remove the constant term from the model.

Example:  'intercept',false 

Data Types: boolean

### initialbeta — Initial estimate of beta. Vector.

p x 1 vector. If initialbeta is not supplied (default), standard least squares is used to find the initial estimate of beta.

Example:  'initialbeta',[3.6 8.1] 

Data Types: double

### initialgamma — Initial estimate of gamma. Vector.

Vector of length (r+1). If initialgamma is not supplied (default), the initial estimate of gamma is the OLS estimate from a regression in which the response is the vector of squared residuals and the regressors are those specified in input Z (this regression also contains a constant term).
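The default starting value described above can be sketched in two least-squares steps. This is a minimal illustration under the stated description only; the data are made up, the variable names are hypothetical, and any further processing the routine may apply is not shown.

```matlab
% Made-up data: one regressor that is also the scedastic variable (r = 1).
n = 50;
x = linspace(1, 5, n)';
y = 2 + 0.5*x + 0.1*x .* sin((1:n)');   % deterministic stand-in for noisy data
Z = x;

% Step 1: OLS of y on [1 x] and the squared residuals.
Xfull = [ones(n,1) x];
beta0 = Xfull \ y;
res2 = (y - Xfull*beta0).^2;

% Step 2: OLS of the squared residuals on [1 Z]; the result has length r+1.
Zfull = [ones(n,1) Z];
gamma0 = Zfull \ res2;
disp(gamma0')
```

A vector obtained this way has the right length to be passed as 'initialgamma'.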

Example:  'initialgamma',[0.6 2.8] 

Data Types: double

### maxiter — Maximum number of iterations to find model parameters. Scalar.

If not defined, maxiter is set to 200. Remark: to obtain the FGLS estimator (two-step estimator), it is enough to set maxiter=1.

Example:  'maxiter',8 

Data Types: double

### tol — The tolerance for controlling convergence. Scalar.

If not defined, tol is set to 1e-8. Convergence is reached if $||d_{old}-d_{new}||/||d_{new}||<\mathrm{tol}$, where d is the vector of length p+r+1 which contains the regression and scedastic coefficients, $d=(\beta' \gamma')'$, and $d_{old}$ and $d_{new}$ are the values of d in iterations t and t+1, t=1,2,...,maxiter.
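The stopping rule is a relative change in the stacked coefficient vector. The short sketch below evaluates it for two made-up iterates; the numbers and variable names are purely illustrative.

```matlab
tol = 1e-8;                         % default tolerance quoted above
dOld = [1.00; 2.00; 0.50];          % made-up (beta' gamma')' at iteration t
dNew = [1.00; 2.00; 0.50 + 1e-10];  % made-up values at iteration t+1

% Relative change between two successive iterates.
relChange = norm(dOld - dNew) / norm(dNew);
converged = relChange < tol;
disp(converged)
```

Because the criterion is relative, it does not depend on the overall scale of the coefficients.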

Example:  'tol',0.0001 

Data Types: double

### msgiter — Level of output to display. Scalar.

If msgiter=1, the estimates of the regression and scedastic parameters are displayed together with their standard errors, the values of the Wald, LM and likelihood ratio tests, and the value of the maximized log likelihood. If msgiter>1, the estimates of the coefficients in each step of the iteration are also displayed. If msgiter<1, nothing is displayed on the screen.

Example:  'msgiter',0 

Data Types: double

### nocheck — Check input arguments. Boolean.

If nocheck is equal to true, no check is performed on vector y and matrix X. Notice that y and X are left unchanged: the additional column of ones for the intercept is not added. By default, nocheck=false.

Example:  'nocheck',true 

Data Types: boolean

## Output Arguments

### out — Results of the fit. Structure.

out is a structure containing the following fields:

| Value | Description |
|---|---|
| Beta | p-by-3 matrix containing: 1st col = estimates of regression coefficients; 2nd col = standard errors of the estimates of regression coefficients; 3rd col = t-tests of the estimates of regression coefficients. |
| Gamma | (r+1)-by-3 matrix containing: 1st col = estimates of scedastic coefficients; 2nd col = standard errors of the estimates of scedastic coefficients; 3rd col = t-tests of the estimates of scedastic coefficients. |
| sigma2 | Scalar. Estimate of $\sigma^2$ (sum of squares of residuals divided by n in the transformed scale). |
| WA | Scalar. Wald test. |
| LR | Scalar. Likelihood ratio test. |
| LM | Scalar. Lagrange multiplier test. |
| LogL | Scalar. Complete maximized log likelihood. |

## Details

This routine implements ART heteroscedasticity.

The model is:

$y=X\beta+ \epsilon, \quad \epsilon \sim N(0, \Sigma) = N(0, \sigma^2\Omega)$;

$\Omega=\mathrm{diag}(\omega_1, ..., \omega_n)$;

$\omega_i=1+\exp(z_i^T\gamma)$;

$\Sigma=\mathrm{diag}(\sigma_1^2, ..., \sigma_n^2)=\mathrm{diag}(\sigma^2\omega_1, ..., \sigma^2\omega_n)$;

$\mathrm{var}(\epsilon_i)=\sigma^2_i = \sigma^2 \omega_i \;\;\; i=1, ..., n$.

$\beta$ = vector which contains regression parameters;

$\gamma$= vector which contains skedastic parameters.
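Under this model, once $\gamma$ is given, the GLS estimate of $\beta$ is ordinary least squares on data rescaled by $1/\sqrt{\omega_i}$. The sketch below illustrates that single step with made-up data and a hypothetical, fixed `gam`; it is not the routine's iterative algorithm, which also has to estimate $\gamma$.

```matlab
% Made-up data and a hypothetical, known gamma (not estimated here).
n = 40;
x = linspace(0.5, 4, n)';
X = [ones(n,1) x];                 % design matrix including the intercept
Z = [ones(n,1) x];                 % scedastic variables with a column of ones
gam = [-1; 0.8];

% Scedastic function omega_i = 1 + exp(z_i' * gamma).
omega = 1 + exp(Z*gam);
y = 1 + 2*x + cos((1:n)') .* sqrt(omega);   % deterministic stand-in for errors

% GLS as OLS on the transformed scale (rows divided by sqrt(omega_i)).
w = 1 ./ sqrt(omega);
Xt = bsxfun(@times, X, w);
yt = y .* w;
betaGLS = Xt \ yt;

% Same estimate from (X' Omega^{-1} X)^{-1} X' Omega^{-1} y.
OmegaInv = diag(1 ./ omega);
betaClosed = (X' * OmegaInv * X) \ (X' * OmegaInv * y);

% sigma^2: residual sum of squares in the transformed scale divided by n.
sigma2 = sum((yt - Xt*betaGLS).^2) / n;
disp([betaGLS betaClosed])
```

Working on the transformed scale avoids forming the n-by-n matrix $\Omega^{-1}$, which is built here only to check the result.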

REMARK 1: if $Z=\log(X)$ then $1+\exp(z_i^T\gamma) = 1+\exp(\gamma_0) \prod_{j=1}^{p-1} x_{ij}^{\gamma_j}$, where $\gamma_0$ is the coefficient of the column of ones.

REMARK 2: if there is just one explanatory variable (say x) which is responsible for heteroskedasticity and the model is $\sigma_i^2=\sigma^2(1+ \theta x_i^\alpha)$, then it is necessary to supply Z as $Z=\log(x)$. In this case, given that the program automatically adds a column of ones to Z:

$\exp(\gamma_0 + \gamma_1 \log x_i)= \exp(\gamma_0) x_i^{\gamma_1}$, therefore $\exp(\gamma_0)$ is the estimate of $\theta$ while $\gamma_1$ is the estimate of $\alpha$.
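The identity behind this remark can be verified numerically. In the sketch below the values of `theta`, `alpha` and `x` are made up; the coefficient of the column of ones is set to log(theta) and the coefficient of log(x) to alpha.

```matlab
x = [0.5; 1; 2; 4];       % made-up positive explanatory variable
theta = 0.3;              % hypothetical theta
alpha = 1.5;              % hypothetical alpha

% Z = log(x); the column of ones is added explicitly for this check.
Zfull = [ones(numel(x),1) log(x)];
gam = [log(theta); alpha];   % [coefficient of ones; coefficient of log(x)]

omegaFromGamma = 1 + exp(Zfull * gam);
omegaFromTheta = 1 + theta * x.^alpha;

% The two parameterizations agree up to rounding error.
maxDiff = max(abs(omegaFromGamma - omegaFromTheta));
disp(maxDiff)
```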

## References

Greene, W.H. (1987), "Econometric Analysis", Prentice Hall. [5th edition, section 11.7.1 pp. 232-235, 7th edition, section 9.7.1 pp. 280-282]

Atkinson, A.C., Riani, M. and Torti, F. (2016), Robust methods for heteroskedastic regression, "Computational Statistics and Data Analysis", Vol. 104, pp. 209-222, http://dx.doi.org/10.1016/j.csda.2016.07.002 [ART]