# regressB

regressB computes Bayesian estimates of regression parameters

## Syntax

• out=regressB(y, X, beta0, R, tau0, n0)
• out=regressB(y, X, beta0, R, tau0, n0, Name, Value)

## Description

out = regressB(y, X, beta0, R, tau0, n0) computes Bayesian estimates of the regression parameters using all default options.

out = regressB(y, X, beta0, R, tau0, n0, Name, Value) computes Bayesian estimates of the regression parameters with additional options specified by one or more Name, Value pair arguments.

## Examples


### regressB with all default options.

Common part to all examples: definition of the prior information.

```matlab
rng('default')
rng(100) % set seed for replicability
p=3;
n0=30;
X0=[ones(n0,1) randn(n0,p-1)];
R=X0'*X0;
beta0=zeros(p,1);
tau0=1;
% SAMPLE INFORMATION
n1=100;
X=randn(n1,p-1);
y=randn(n1,1);
% Run regressB using all n1 data and use the default value of c=1
out=regressB(y, X, beta0, R, tau0, n0);
disp(out)
```

```
       beta1: [3×1 double]
          a1: 63.5000
          b1: 68.1135
        tau1: 0.9323
    covbeta1: [3×3 double]
     sigma21: 1.0898
         res: [100×2 double]
```



### regressB with optional arguments.

Run regressB and compute new estimate of beta using just the first 20 observations and use a value of c equal to 1.2.

```matlab
rng('default')
rng(100) % set seed for replicability
p=3;
n0=30;
X0=[ones(n0,1) randn(n0,p-1)];
R=X0'*X0;
beta0=zeros(p,1);
tau0=1;
% SAMPLE INFORMATION
n1=100;
X=randn(n1,p-1);
y=randn(n1,1);
bsb=1:20;
c=1.2;
out=regressB(y, X, beta0, R, tau0, n0,'bsb',bsb,'c',c);
```

## Related Examples


### Example of the use of input option stats.

```matlab
rng('default')
rng(100) % set seed for replicability
p=3;
n0=30;
X0=[ones(n0,1) randn(n0,p-1)];
R=X0'*X0;
beta0=zeros(p,1);
tau0=1;
% SAMPLE INFORMATION
n1=100;
X=randn(n1,p-1);
y=randn(n1,1);
bsb=1:20;
stats=true;
out=regressB(y, X, beta0, R, tau0, n0,'bsb',bsb,'stats',stats);
```

### Example of Houses Price.

Compare the output with Table 3.3 of Koop (2003), p. 52.

```matlab
load hprice.txt;
% setup parameters
n=size(hprice,1);
y=hprice(:,1);
X=hprice(:,2:5);
n0=5;
% set \beta components
beta0=0*ones(5,1);
beta0(2,1)=10;
beta0(3,1)=5000;
beta0(4,1)=10000;
beta0(5,1)=10000;
% \tau
s02=1/4.0e-8;
tau0=1/s02;
% R prior settings
R=2.4*eye(5);
R(2,2)=6e-7;
R(3,3)=.15;
R(4,4)=.6;
R(5,5)=.6;
R=inv(R);
out=regressB(y, X, beta0, R, tau0, n0,'stats',1);
disp(out)
```

```
       beta1: [5×1 double]
          a1: 273
          b1: 9.0326e+10
        tau1: 3.0224e-09
    covbeta1: [5×5 double]
     sigma21: 3.3208e+08
         res: [546×2 double]
    beta1HPD: [5×4 double]
     tau1HPD: [2.6745e-09 3.3913e-09 2.5720e-09 3.5143e-09]
  sigma21HPD: [2.9487e+08 3.7391e+08 2.8455e+08 3.8880e+08]
       Bpval: [5×1 double]
    postodds: [5×1 double]
   modelprob: [5×1 double]
```



## Input Arguments

### y — Response variable. Vector.

Response variable, specified as a vector of length n1, where n1 is the number of observations. Each entry in y is the response for the corresponding row of X.

Missing values (NaNs) and infinite values (Infs) are allowed: observations (rows) containing missing or infinite values are automatically excluded from the computations.

Data Types: single | double

### X — Predictor variables. Matrix.

Matrix of explanatory variables (also called 'regressors') of dimension n1 x (p-1) where p denotes the number of explanatory variables including the intercept.

Rows of X represent observations and columns represent variables. By default the model includes a constant term, unless you explicitly remove it using the input option intercept; therefore, do not include a column of ones in X. Missing values (NaNs) and infinite values (Infs) are allowed: rows containing them are automatically excluded from the computations.

Remark: here we use the symbol n1 instead of the traditional n to better separate the sample information, coming from the n1 observations, from the prior information, coming from n0 previous experiments.

PRIOR INFORMATION.

$\beta$ is assumed to have a normal distribution with mean $\beta_0$ and (conditional on $\tau_0$) covariance $(1/\tau_0) (X_0'X_0)^{-1}$.

$\beta \sim N( \beta_0, (1/\tau_0) (X_0'X_0)^{-1} )$
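As an illustration, the prior covariance implied by R and tau0 can be computed directly. This is a minimal sketch using the same quantities defined in the examples above; X0 is a hypothetical design matrix standing in for the n0 previous experiments.

```matlab
% Sketch: prior covariance of beta implied by R = X0'*X0 and tau0.
n0 = 30; p = 3;
X0 = [ones(n0,1) randn(n0,p-1)];   % hypothetical prior design matrix
R  = X0'*X0;                        % matrix associated with the prior covariance
tau0 = 1;                           % prior estimate of tau = 1/sigma^2
priorCov = (1/tau0) * inv(R);       % covariance of the normal prior on beta
disp(priorCov)                      % symmetric p x p matrix
```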

Data Types: single | double

### beta0 — Prior mean of $\beta$. p-times-1 vector.

Data Types: single | double

### R — Matrix associated with covariance matrix of $\beta$. p-times-p positive definite matrix.

It can be interpreted as X0'X0, where X0 is an n0 x p matrix coming from previous experiments (assuming that the intercept is included in the model). The prior distribution of $\tau_0$ is a gamma distribution with parameters $a_0$ and $b_0$, that is:

$p(\tau_0) \propto \tau_0^{a_0-1} \exp (-b_0 \tau_0) \qquad E(\tau_0)= a_0/b_0$

Data Types: single | double

### tau0 — Prior estimate of tau. Scalar.

Prior estimate of $\tau=1/ \sigma^2 =a_0/b_0$.

Data Types: single | double

### n0 — Number of previous experiments. Scalar.

Sometimes it helps to think of the prior information as coming from n0 previous experiments. Therefore we assume that matrix X0 (which defines R), was made up of n0 observations.

Data Types: single | double

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as  Name1,Value1,...,NameN,ValueN.

Example:  'intercept',false , 'bsb',[3 5] , 'c',1.2 , 'stats',1 , 'conflev',[0.99 0.999] , 'nocheck',true 

### intercept — Indicator for constant term. true (default) | false.

Indicator for the constant term (intercept) in the fit, specified as the comma-separated pair consisting of 'intercept' and either true to include or false to remove the constant term from the model.

Example:  'intercept',false 

Data Types: boolean

### bsb — Units forming subset. Vector.

m x 1 vector.

The default value of bsb is 1:n1, that is, all n1 units are used to compute beta1. Remark: if bsb='' (empty value), only the prior information is used.

Example:  'bsb',[3 5] 

Data Types: double

### c — Parameter which controls the prior information about beta. Scalar.

Scalar between 0 (excluded) and 1 (included).

The covariance matrix of the prior distribution of $\beta$ is

$(1/\tau_0)(c X_0'X_0)^{-1} = (1/\tau_0)(cR)^{-1}$

therefore multiplication of $R$ by $c$ (with $c<1$) increases the variance of $\beta$.

The default value of $c$ is 1. Note also that if we adjust the information for $\beta$ multiplying its covariance matrix by $c$ we need to make a similar adjustment to the variance of $\tau$. Simple mathematical calculations show that the new variance of $\tau$ is equal to the old variance of $\tau$ divided by $c$. So, once again, given that $c < 1$, the new variance of $\tau$ increases.

The value of $c$ will usually be the default, i.e., 1.

It was included to clarify the difference from Zellner's procedure and has the effect of changing $n_0$: in fact, we can write $n_0' = c n_0$ and use $n_0'$ in the calculations. However, it may be useful for scaling $X_0$ if the prior information comes from a design that is not of the appropriate size to represent the amount of prior knowledge.
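The scaling can be checked numerically. In this sketch R and tau0 are arbitrary illustrative values; multiplying R by c < 1 inflates every entry of the prior covariance by the factor 1/c.

```matlab
% Sketch: effect of c on the prior covariance (1/tau0)*(c*R)^{-1}.
tau0 = 2; R = [2 0.5; 0.5 1];   % arbitrary illustrative prior settings
c = 0.5;
V1 = (1/tau0) * inv(R);          % prior covariance with c = 1
Vc = (1/tau0) * inv(c*R);        % prior covariance with c = 0.5
% With c = 0.5 every entry of the covariance matrix doubles:
disp(Vc ./ V1)                   % all entries equal 1/c = 2
```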

Example:  'c',1.2 

Data Types: double

### stats — Additional statistics. Scalar.

If stats=1 the following additional statistics are computed:

1) Bayesian p-values.

2) highest posterior density intervals (HPDI) for each value of input option conflev.

3) posterior odds for $\beta_j=0$.

4) posterior model probability of the model which excludes variable $j$.

Example:  'stats',1 

Data Types: double

### conflev — Confidence levels to be used to compute HPDI. Vector.

This input option is used only if stats=1. The default value of conflev is [0.95 0.99], that is, 95 per cent and 99 per cent HPDIs are computed.

Example:  'conflev',[0.99 0.999] 

Data Types: double

### nocheck — Check input arguments. Boolean.

If nocheck is equal to true, no check is performed on y and X, and y and X are left unchanged; in other words, the additional column of ones for the intercept is not added. The default is nocheck=false.

Example:  'nocheck',true 

Data Types: boolean

## Output Arguments

### out — Output structure. Structure.

Structure which contains the following fields:
beta1

p x 1 vector containing the posterior mean (conditional on $\tau_0$) of $\beta$ (regression coefficients): $\beta_1 = (c R + X'X)^{-1} (c R \beta_0 + X'y)$.

a1

shape parameter of the posterior gamma distribution of $\tau$ ($\sigma^2$). Scalar.

$a_1 = \frac{1}{2} (c\, n_0 + n_1 - p)$ (for example, in the first example above, $a_1 = (30+100-3)/2 = 63.5$). The posterior distribution of $\tau$ is a gamma distribution with parameters $a_1$ and $b_1$.

The posterior distribution of $\sigma^2$ is an inverse-gamma distribution with parameters $a_1$ and $b_1$.

b1

scale parameter of the posterior gamma distribution of $\tau$ ($\sigma^2$). Scalar.

$b_1 = \frac{1}{2} \left( \frac{n_0}{\tau_0} + (y-X\beta_1)'y + (\beta_0-\beta_1)' c R \beta_0 \right)$

The posterior distribution of $\tau$ is a gamma distribution with parameters $a_1$ and $b_1$.

The posterior distribution of $\sigma^2$ is an inverse-gamma distribution with parameters $a_1$ and $b_1$.

Remark: if bsb is supplied, X'X, X'y and n1 are replaced by Xm'Xm, Xm'ym and m, where Xm=X(bsb,:), ym=y(bsb) and m=length(bsb); therefore all the previous quantities are estimated using just the units forming the subset.

tau1

posterior estimate of $\tau$. Scalar.

$\tau_1$ is obtained as $a_1/b_1$.

covbeta1

p x p matrix containing the covariance matrix (conditional on $\tau_1$) of $\beta_1$: $\mathrm{cov}(\beta_1) = (1/\tau_1)(cR + X'X)^{-1}$, where $\tau_1 = a_1/b_1$ (that is, obtained through the gamma parameters of the posterior distribution of $\tau$).

sigma21

posterior estimate of $\sigma^2$. Scalar. $\sigma^2_1$ is obtained as $b_1/(a_1-1)$ (that is through the inverse gamma parameters of the posterior distribution of $\sigma^2$).

Remember that if $X\sim IG(a,b)$, then $E(X)=b/(a-1)$.
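These two relations can be checked directly against the values printed in the first example on this page (a1 = 63.5, b1 = 68.1135):

```matlab
% Check tau1 = a1/b1 and sigma21 = b1/(a1-1) using the values
% shown in the first example of this page.
a1 = 63.5; b1 = 68.1135;
tau1    = a1/b1;        % posterior estimate of tau
sigma21 = b1/(a1-1);    % posterior estimate of sigma^2
fprintf('tau1 = %.4f, sigma21 = %.4f\n', tau1, sigma21)
% tau1 = 0.9323, sigma21 = 1.0898 (as in the displayed out structure)
```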

res

n1-times-2 matrix.

1st column = raw residuals: res(i,1) is computed as y(i) - X(i,:)*beta1, where beta1 is computed using the units forming the subset.

In the Bayesian approach they are the posterior means of the $\epsilon_i$ and can be interpreted as point estimates of the $\epsilon_i$.

2nd column = deletion residuals (computed only for the units which do not form the subset).

res(i,2), with i not in the subset, is computed as (y(i)-X(i,:)*beta1) * sqrt((a1/b1)/(1+hii)), where hii = X(i,:)*(c*R + Xm'*Xm)^{-1}*X(i,:)'.

The additional output which follows is produced only if the input option stats is equal to 1.

beta1HPD

p-by-2*length(conflev) matrix containing the HPDIs of $\hat \beta$ (HPDI stands for Highest Posterior Density Interval).

1st column = lower bound of the HPDI associated with conflev(1).

2nd column = upper bound of the HPDI associated with conflev(1).

...

(2*length(conflev)-1)-th column = lower bound of the HPDI associated with conflev(end).

2*length(conflev)-th column (last column) = upper bound of the HPDI associated with conflev(end).
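The column layout can be indexed as follows. This sketch uses a dummy matrix M in place of out.beta1HPD, so it runs without the FSDA toolbox; with a real out structure, replace M by out.beta1HPD.

```matlab
% Sketch: extracting lower/upper HPDI bounds for each confidence level.
conflev = [0.95 0.99];
p = 3;
M = reshape(1:p*2*numel(conflev), p, []);  % dummy stand-in for out.beta1HPD
lower95 = M(:,1);  upper95 = M(:,2);       % bounds for conflev(1) = 0.95
lower99 = M(:,3);  upper99 = M(:,4);       % bounds for conflev(2) = 0.99
% In general: lower bounds sit in the odd columns, upper bounds in the even ones.
lowerAll = M(:, 1:2:end);
upperAll = M(:, 2:2:end);
```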

tau1HPD

1-by-2*length(conflev) matrix containing the HPDIs of $\hat \tau_1$.

1st element = lower bound of the HPDI associated with conflev(1).

2nd element = upper bound of the HPDI associated with conflev(1).

...

(2*length(conflev)-1)-th element = lower bound of the HPDI associated with conflev(end).

2*length(conflev)-th element (last element) = upper bound of the HPDI associated with conflev(end).

Remark: confidence levels are based on the gamma distribution.

sigma21HPD

1-by-2*length(conflev) matrix containing the HPDIs of $\hat \sigma^2_1$.

1st element = lower bound of the HPDI associated with conflev(1).

2nd element = upper bound of the HPDI associated with conflev(1).

...

(2*length(conflev)-1)-th element = lower bound of the HPDI associated with conflev(end).

2*length(conflev)-th element (last element) = upper bound of the HPDI associated with conflev(end).

Remark: confidence levels are based on the inverse-gamma distribution.

Bpval

p-by-1 vector containing Bayesian p-values.

p-value $= P(|t| > |\hat \beta_j / \mathrm{se}(\hat \beta_j)|)$ = probability that $\beta_j$ is different from 0.

postodds

p-by-1 vector which contains the posterior odds for $\beta_j=0$.

For example, the posterior odds of $\beta_j=0$ is p(y | model which contains all explanatory variables except the one associated with $\beta_j$) divided by p(y | model which contains all explanatory variables).

Warning: postodds can be computed only if n1 >= p.

modelprob

p-by-1 vector which contains the posterior model probability of the model which excludes variable j.

For example, if modelprob(j) = 0.28, that is, if the probability of the model which does not contain variable j is 0.28, then there is a 28 per cent chance that $\beta_j=0$ and a 72 per cent chance that it is not.

Warning: modelprob can be computed only if n1 >= p.

## More About

The density of the gamma distribution (which we denote by $G(a,b)$) with shape parameter $a$ and rate parameter $b$ is $f_G(x; a,b)= \frac{b^a}{\Gamma(a)} x^{a-1} \exp(-b x)$.

With this parametrization $G(a,b)$ has mean $a/b$ and variance $a/b^2$.
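This parametrization can be verified numerically with base MATLAB's integral; a and b below are arbitrary illustrative values.

```matlab
% Sketch: check that G(a,b), parametrized with rate b as above, has mean a/b.
a = 3; b = 2;                               % arbitrary shape and rate
fG = @(x) b^a/gamma(a) .* x.^(a-1) .* exp(-b*x);
total = integral(fG, 0, Inf);               % density integrates to 1
m     = integral(@(x) x.*fG(x), 0, Inf);    % mean equals a/b = 1.5
fprintf('integral = %.4f, mean = %.4f\n', total, m)
```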

The density of the inverse gamma distribution defined over the support $x>0$, with shape parameter $a$ and scale parameter $b$, which we denote by $IG(a, b)$, is

$f_{IG}(x, a, b) = \frac{b^a}{\Gamma(a)} (1/x)^{a +1} \exp (-b/x)$

The mean (for $a>1$) is $b/(a-1)$ and the variance (for $a>2$) is $\frac{b^2}{(a-1)^2(a-2)}$.

With the above parameterizations if $X\sim G(a,b)$, then $Y=1/X$ has an $IG(a, b)$.

The posterior distribution of $\beta$ conditional on $\tau$ is $N(\hat \beta_1, (1/\tau) (R+ X^TX)^{-1})$, where

\begin{eqnarray*} \hat \beta_1 &=& (R+ X^TX)^{-1} (R \beta_0+X^T y) \\ &=& (R+ X^TX)^{-1} (R \beta_0 + X^TX \hat \beta) \\ &=& (I-A) \beta_0 + A \hat \beta, \end{eqnarray*}

with $A= (R+ X^TX)^{-1} X^TX$. This last expression shows that the posterior estimate $\hat \beta_1$ is a matrix-weighted average of the prior mean $\beta_0$ and the classical OLS estimate $\hat \beta$, with weights $I-A$ and $A$. If the prior information is strong, the elements of $R$ will be large and $A$ will be small, so that the posterior mean gives most weight to the prior mean. In the classical approach these weights are fixed, while with the forward search technique, as the subset size grows, the weight assigned to $A$ becomes stronger and stronger, so we can dynamically see how the estimate changes as the effect of the prior becomes less important.
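The matrix-weighted-average identity can be verified on simulated data. This is a sketch with arbitrary R and beta0; it only checks that the two algebraic forms of the posterior mean coincide.

```matlab
% Sketch: beta1 = (I-A)*beta0 + A*betahat equals the direct formula.
rng(1)
p = 3; n = 50;
X = randn(n, p); y = randn(n, 1);
R = eye(p); beta0 = ones(p, 1);          % arbitrary prior settings
betahat = X \ y;                         % classical OLS estimate
A = (R + X'*X) \ (X'*X);                 % weight given to the data
beta1a = (eye(p) - A)*beta0 + A*betahat; % weighted-average form
beta1b = (R + X'*X) \ (R*beta0 + X'*y);  % direct posterior-mean formula
disp(norm(beta1a - beta1b))              % ~0 up to rounding error
```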

The posterior distribution of $\tau$ is $G(a_1, b_1)$, where $$a_1=a_0+\frac{n_1}{2}= \frac{1}{2} (n_0 + n_1 -p)$$ and $$b_1 = \frac{1}{2} \left( \frac{n_0-p}{\tau_0} + (y-X \beta_1)'y +(\beta_0-\beta_1)'R\beta_0 \right)$$ and, analogously, the posterior distribution of $\sigma^2$ is $IG(a_1, b_1)$.

The posterior estimates of $\tau$ and $\sigma^2$ are respectively:

$\hat \tau_1 = a_1/b_1, \qquad \hat \sigma^2_1 = b_1 /(a_1-1)$

The posterior marginal distribution of $\beta$ is a multivariate $T$ with parameters $\hat \beta_1$, $(1/\tau_1) \{R + X^T X \}^{-1}$, $n_0 + n_1 - p$.

## References

Chaloner, K. and Brant, R. (1988), A Bayesian Approach to Outlier Detection and Residual Analysis, "Biometrika", Vol. 75, pp. 651-659.

Koop G. (2003), "Bayesian Econometrics", Wiley, New York. [Chapt. 3]