regressts computes estimates of regression parameters for a time series models
Load airline data.
% 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 y = [112 115 145 171 196 204 242 284 315 340 360 417 % Jan 118 126 150 180 196 188 233 277 301 318 342 391 % Feb 132 141 178 193 236 235 267 317 356 362 406 419 % Mar 129 135 163 181 235 227 269 313 348 348 396 461 % Apr 121 125 172 183 229 234 270 318 355 363 420 472 % May 135 149 178 218 243 264 315 374 422 435 472 535 % Jun 148 170 199 230 264 302 364 413 465 491 548 622 % Jul 148 170 199 242 272 293 347 405 467 505 559 606 % Aug 136 158 184 209 237 259 312 355 404 404 463 508 % Sep 119 133 162 191 211 229 274 306 347 359 407 461 % Oct 104 114 146 172 180 203 237 271 305 310 362 390 % Nov 118 140 166 194 201 229 278 306 336 337 405 432 ]; % Dec y=(y(:)); % linear trend + just one harmonic for seasonal out=regressts(y);
Model with linear trend, two harmonics for seasonal component and varying amplitude using a linear trend.
% Load airline data. % 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 y = [112 115 145 171 196 204 242 284 315 340 360 417 % Jan 118 126 150 180 196 188 233 277 301 318 342 391 % Feb 132 141 178 193 236 235 267 317 356 362 406 419 % Mar 129 135 163 181 235 227 269 313 348 348 396 461 % Apr 121 125 172 183 229 234 270 318 355 363 420 472 % May 135 149 178 218 243 264 315 374 422 435 472 535 % Jun 148 170 199 230 264 302 364 413 465 491 548 622 % Jul 148 170 199 242 272 293 347 405 467 505 559 606 % Aug 136 158 184 209 237 259 312 355 404 404 463 508 % Sep 119 133 162 191 211 229 274 306 347 359 407 461 % Oct 104 114 146 172 180 203 237 271 305 310 362 390 % Nov 118 140 166 194 201 229 278 306 336 337 405 432 ]; % Dec y=(y(:)); model=struct; model.trend=1; % linear trend model.s=12; % monthly time series model.seasonal=102; % two harmonics with time varying seasonality out=regressts(y,'model',model,'plots',1);
Model with linear trend, two harmonics for seasonal component and varying amplitude using a linear trend.
% Load airline data. % 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 y = [112 115 145 171 196 204 242 284 315 340 360 417 % Jan 118 126 150 180 196 188 233 277 301 318 342 391 % Feb 132 141 178 193 236 235 267 317 356 362 406 419 % Mar 129 135 163 181 235 227 269 313 348 348 396 461 % Apr 121 125 172 183 229 234 270 318 355 363 420 472 % May 135 149 178 218 243 264 315 374 422 435 472 535 % Jun 148 170 199 230 264 302 364 413 465 491 548 622 % Jul 148 170 199 242 272 293 347 405 467 505 559 606 % Aug 136 158 184 209 237 259 312 355 404 404 463 508 % Sep 119 133 162 191 211 229 274 306 347 359 407 461 % Oct 104 114 146 172 180 203 237 271 305 310 362 390 % Nov 118 140 166 194 201 229 278 306 336 337 405 432 ]; % Dec y=(y(:)); model=struct; model.trend=1; % linear trend model.s=12; % monthly time series model.seasonal=102; % two harmonics with time varying seasonality % Fit is based just on the first 40 observations out=regressts(y,'model',model,'bsb',[1:40],'dispresults', true, 'plots',1);
Load airline data.
% 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 y = [112 115 145 171 196 204 242 284 315 340 360 417 % Jan 118 126 150 180 196 188 233 277 301 318 342 391 % Feb 132 141 178 193 236 235 267 317 356 362 406 419 % Mar 129 135 163 181 235 227 269 313 348 348 396 461 % Apr 121 125 172 183 229 234 270 318 355 363 420 472 % May 135 149 178 218 243 264 315 374 422 435 472 535 % Jun 148 170 199 230 264 302 364 413 465 491 548 622 % Jul 148 170 199 242 272 293 347 405 467 505 559 606 % Aug 136 158 184 209 237 259 312 355 404 404 463 508 % Sep 119 133 162 191 211 229 274 306 347 359 407 461 % Oct 104 114 146 172 180 203 237 271 305 310 362 390 % Nov 118 140 166 194 201 229 278 306 336 337 405 432 ]; % Dec y=(y(:)); StartDate=[1949,1] % Imposed level shift in position t=60 and 4 harmonics model=struct; model.trend=1; % linear trend model.s=12; % monthly time series model.seasonal=104; % four harmonics with time varying seasonality model.posLS=60; % level shift in position t=60 out=regressts(y,'model',model,'plots',1,'StartDate',StartDate);
Example of the use of input option model and plots.
% Model with linear trend, two harmonics for seasonal component and % varying amplitude using a linear trend. % Load airline data. % 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 y = [112 115 145 171 196 204 242 284 315 340 360 417 % Jan 118 126 150 180 196 188 233 277 301 318 342 391 % Feb 132 141 178 193 236 235 267 317 356 362 406 419 % Mar 129 135 163 181 235 227 269 313 348 348 396 461 % Apr 121 125 172 183 229 234 270 318 355 363 420 472 % May 135 149 178 218 243 264 315 374 422 435 472 535 % Jun 148 170 199 230 264 302 364 413 465 491 548 622 % Jul 148 170 199 242 272 293 347 405 467 505 559 606 % Aug 136 158 184 209 237 259 312 355 404 404 463 508 % Sep 119 133 162 191 211 229 274 306 347 359 407 461 % Oct 104 114 146 172 180 203 237 271 305 310 362 390 % Nov 118 140 166 194 201 229 278 306 336 337 405 432 ]; % Dec y=(y(:)); model=struct; model.trend=1; % linear trend model.s=12; % monthly time series model.seasonal=104; % two harmonics with time varying seasonality bsini=1:72; out=regressts(y,'model',model,'bsb',bsini); outCR=regressts(y,'model',model,'bsb',bsini,'smallsamplecor',true,'asymptcor',true); h1=subplot(2,1,1); title('Estimate of the scale without correction factors') resindexplot(out.residuals,'h',h1) h2=subplot(2,1,2); resindexplot(outCR.residuals,'h',h2) title('Estimate of the scale with correction factors')
y
— Response variable.
Vector.Response variable, specified as a vector of length n, where n is the number of observations. Each entry in y is the response for the corresponding row of X.
Data Types: single| double
Specify optional comma-separated pairs of Name,Value
arguments.
Name
is the argument name and Value
is the corresponding value. Name
must appear
inside single quotes (' '
).
You can specify several name and value pair arguments in any order as
Name1,Value1,...,NameN,ValueN
.
'model', model
, 'bsb',[3 5 20:30]
, 'smallsamplecor',false
, 'asymptcor',false
, 'plots',1
, 'nocheck',false
, 'dispresults',true
, 'StartDate',[2016,3]
model
—model type.structure.A structure which specifies the model which will be used. The model structure contains the following fields:
Value | Description |
---|---|
s |
scalar (length of seasonal period). For monthly data s=12 (default), for quartely data s=4, ... |
trend |
scalar (order of the trend component). trend = 1 implies linear trend with intercept (default), trend = 2 implies quadratic trend ... Admissible values for trend are, 0, 1, 2 and 3. |
seasonal |
scalar (integer specifying number of frequencies, i.e. harmonics, in the seasonal component. Possible values for seasonal are $1, 2, ..., [s/2]$, where $[s/2]=floor(s/2)$. For example: if seasonal =1 (default) we have: $\beta_1 \cos( 2 \pi t/s) + \beta_2 sin ( 2 \pi t/s)$; if seasonal =2 we have: $\beta_1 \cos( 2 \pi t/s) + \beta_2 \sin ( 2 \pi t/s) + \beta_3 \cos(4 \pi t/s) + \beta_4 \sin (4 \pi t/s)$. Note that when $s$ is even the sine term disappears for $j=s/2$ and so the maximum number of trigonometric parameters is $s-1$. If seasonal is a number greater than 100 then it is possible to specify how the seasonal component grows over time. For example, seasonal =101 implies a seasonal component which just uses one frequency which grows linearly over time as follows: $(1+\beta_3 t)\times ( \beta_1 cos( 2 \pi t/s) + \beta_2 \sin ( 2 \pi t/s))$. For example, seasonal =201 implies a seasonal component which just uses one frequency which grows in a quadratic way over time as follows: $(1+\beta_3 t + \beta_4 t^2)\times( \beta_1 \cos( 2 \pi t/s) + \beta_2 \sin ( 2 \pi t/s))$. seasonal =0 implies a non seasonal model. |
X |
matrix of size T-by-nexpl containing the values of nexpl extra covariates which are likely to affect y. |
posLS |
positive integer which specifies to position to include the level shift component. For example if model.posLS =13 then the explanatory variable $I(t \geq 13})$ is created. If this field is not present or if it is empty, the level shift component is not included. |
B |
column vector or matrix containing the initial values of parameter estimates which have to be used in the maximization procedure. If model.B is a matrix, then initial estimates are extracted from the first colum of this matrix. If this field is empty or if this field is not present, the initial values to be used in the maximization procedure are referred to the OLS parameter estimates of the linear part of the model. The parameters associated to time varying amplitude are initially set to 0. Remark: the default model is for monthly data with a linear trend (2 parameters) + seasonal component with just one harmonic (2 parameters), no additional explanatory variables and no level shift that is model=struct; model.s=12; model.trend=1; model.seasonal=1; model.X=''; model.posLS=''; |
Example: 'model', model
Data Types: struct
bsb
—units forming subset.vector.m x 1 vector.
The default value of bsb is 1:length(y), that is all units are used to compute parameter estimates.
Example: 'bsb',[3 5 20:30]
Data Types: double
smallsamplecor
—small sample correction factor.boolean.Boolean which defines whether to use or not small sample correction factor to inflate the scale estimate just in case option bsb is used and length(bsb)<length(y). If it is true the small sample correction factor is used. The default value of smallsamplecor is 1, that is the correction is used. The default value of smallsamplecor is false.
Example: 'smallsamplecor',false
Data Types: logical
asymptcor
—asymptotic consistency factor.boolean.Boolean which defines whether to use or not consistency correction factor to inflate the scale estimate just in case option bsb is used and length(bsb)<length(y). If it is true the asmptotic consistency is used. The default value of asymptcor is false, that is the asymptotic consistency factor is not used.
Example: 'asymptcor',false
Data Types: logical
plots
—Plot on the screen.scalar.If equal to one a two panel plot appears on the scree. The top panel contains real and fitted value. The bottom panel contains scaled residuals with a 99.9 per cent confidence band, else (default) no plot is shown.
Remark: the plot which is produced is very simple. In order to control a series of options in this plot and in order to connect it dynamically to the other forward plots it is necessary to use function mdrplot.
Example: 'plots',1
Data Types: double
nocheck
—Check input arguments.boolean.If nocheck is equal to true no check is performed on supplied structure model
Example: 'nocheck',false
Data Types: logical
dispresults
—Display results of final fit.boolean.If dispresults is true, labels of coefficients, estimated coefficients, standard errors, tstat and p-values are shown on the screen in a fully formatted way. The default value of dispresults is false.
Example: 'dispresults',true
Data Types: logical
StartDate
—The time of the first observation.numeric vector of length 2.Vector with two integers, which specify a natural time unit and a (1-based) number of samples into the time unit. For example, if model.s=12 (that is the data are monthly) and the first observation starts in March 2016, then StartDate=[2016,3]; Similarly, if models.s=4 (that is the data are quarterly) and the first observation starts in the second quarter or year 2014, then StartData=[2014,2]. The information in option StartDate will be used to create in the output the dates inside the time series object.
Example: 'StartDate',[2016,3]
Data Types: double
out
— description
StructureA structure containing the following fields
Value | Description |
---|---|
B |
Matrix containing estimated beta coefficients, (including the intercept when options.intercept=true) standard errors, t-stat and p-values The content of matrix B is as follows: 1st col = beta coefficients The order of the beta coefficients is as follows: 1) trend elements (if present). If the trend is of order two there are r+1 coefficients if the intercept is present otherwise there are just r components; 2) linear part of seasonal component 2, 4, 6, ..., s-2, s-1 coefficients (if present); 3) coefficients associated with the matrix of explanatory variables which have a potential effect on the time series under study (X); 4) non linear part of seasonal component, that is varying amplitude. If varying amplitude is of order k there are k coefficients (if present); 5) level shift component (if present). In this case there are two coefficients, the second (which is also the last element of vector beta) is an integer which specifies the time in which level shift takes place and the first (which is also the penultime element of vector beta) is a real number which identifies the magnitude of the upward (downward) level shift; 2nd col = standard errors; 3rd col = t-statistics; 4th col = p values. |
yhat |
vector of fitted values after final (NLS=non linear least squares) step. $ (\hat \eta_1, \hat \eta_2, \ldots, \hat \eta_T)'$ |
e |
Vector T-by-1 containing the raw residuals. |
residuals |
Vector T-by-1 containing the scaled residuals. |
scale |
scale estimate of the residuals \[ \hat \sigma = cor \times \sum_{i \in S_m} [y_i- \eta(x_i,\hat \beta)]^2/(m-p) \] where $S_m$ is a set of cardinality $m$ which contains the units belonging to bsb and $p$ is the total number of estimated parameters and $cor$ is a correction factor to make the estimator consistent (see input options smallsamplecor and asymptcor). |
invXX |
$cov(\beta)/\hat \sigma^2$. p-by-p, square matrix. If the model is linear out.invXX is equal to $(X'X)^{-1}$, else out.invXX is equal to $(A'A)^{-1}$ where $A$ is the matrix of partial derivatives. More precisely: \[ a_{i,j}=\frac{\partial \eta_i(x_i, \hat \beta)}{\partial \hat \beta_j} \] where \begin{eqnarray} y_i & = & \eta(x_i,\beta)+ \epsilon_i \\ & = & \eta_i +\epsilon_i \\ & = & \eta(x_i,\hat \beta)+ e_i \\ & = & \hat \eta_i + e_i \end{eqnarray} |
covB |
$cov(\beta)$. p-by-p, square matrix containing variance covariance matrix of estimated coefficients. |
y |
response vector y. |
X |
data matrix X containing trend, seasonal, expl and lshift, if the model is linear or linearized version of $\eta(x_i, \beta)$ if the model is non linear containing in the columns partial derivatives evaluated in correspondence of out.B(:,1) with respect to each parameter. In other words, the $i,j$-th element of out.X is \[ \frac{\partial \eta_i(x_i, \hat \beta)}{\partial \hat \beta_j} \] $j=1, 2, \ldots, p$, $i \in S_m$. The size of this matrix is: n-length(out.outliers)-by-p. |
Exflag |
Reason nlinfit stopped. Integer. If the model is non linear out.Exflag is equal to 1 if the maximization procedure did not produce warnings or the warning was different from "ILL Conditiioned Jacobian". For any other warning which is produced (for example, "Overparameterized", "IterationLimitExceeded", 'MATLAB:rankDeficientMatrix") out.Exflag is equal to -1; |
ExflagALS |
Reason ALS routine stopped. Integer. If in the iterations missing or NaN are found out.ExflagALS=-1. Note that ALS routine is used just in case there was no convergence inside routine nlinfit in order to provide a better set of initial parameter estimates before retrying nlinfit. If there was immediate convergence in nlinfit out.ExflagALS is empty. |
iterALS |
Number of iterations in the ALS routine. Intger between 1 and 10000 (maximum number of iterations). Note that ALS routine is used just in case there was no convergence inside routine nlinfit in order to provide a better set of initial parameter estimates before retrying nlinfit. If there was immediate convergence in nlinfit out.iterALS is empty. |
Rousseeuw, P.J., Perrotta D., Riani M. and Hubert, M. (2018), Robust Monitoring of Many Time Series with Application to Fraud Detection, "Econometrics and Statistics". [RPRH]