rlsmo

rlsmo computes a running-lines smoother with global cross-validation.

Syntax

smo = rlsmo(x, y)
smo = rlsmo(x, y, w)
smo = rlsmo(x, y, w, span)
[smo, span] = rlsmo(___)

Description

This function is called in each step of the avas function, but it can also be called directly whenever a set of values needs to be smoothed using local regressions. Note that the x values must be non-decreasing.

smo = rlsmo(x, y) runs rlsmo with all the default arguments.

smo = rlsmo(x, y, w) runs rlsmo with observation weights w.

smo = rlsmo(x, y, w, span) runs rlsmo with the span value supplied as input.

[smo, span] = rlsmo(___) also returns the span that was used for the local regressions.

Examples

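  • rlsmo with all the default arguments.
    % Simulated data: quadratic trend plus Gaussian noise.
    n=200;
    x=sort(randn(n,1));            % x must be non-decreasing
    y=2*x.^2-3*x+2*randn(n,1);
    ysmo=rlsmo(x,y);               % span chosen by cross validation
    plot(x,[y ysmo])               % data and smoothed values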

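  • rlsmo with weights.
    % Simulated data: quadratic trend plus Gaussian noise.
    n=200;
    x=sort(randn(n,1));            % x must be non-decreasing
    y=2*x.^2-3*x+2*randn(n,1);
    w=(1:n)';                      % weights increasing with x, as a column vector
    [ysmo,span]=rlsmo(x,y,w);
    plot(x,[y ysmo])
    title(['span chosen by cross validation= ' num2str(span)])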

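  • rlsmo with span value supplied as input.
    % Simulated data: cubic trend plus Gaussian noise.
    n=200;
    x=sort(randn(n,1))*10;         % x must be non-decreasing
    y=3*x.^3-2*x.^2-4*x+10000*randn(n,1);
    [ysmo,span]=rlsmo(x,y,[],0.5); % w left empty, span fixed at 0.5
    plot(x,[y ysmo])
    title(['Fixed value of span = ' num2str(span)])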

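  • rlsmo called with two outputs.
    % Simulated data: quadratic trend plus Gaussian noise.
    n=200;
    x=sort(randn(n,1));            % x must be non-decreasing
    y=2*x.^2-3*x+2*randn(n,1);
    [ysmo,span]=rlsmo(x,y);        % second output is the span that was used
    plot(x,[y ysmo])
    title(['span chosen by cross validation= ' num2str(span)])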

    Input Arguments

    x — Sorted predictor variable. Vector.

    Ordered abscissa values, specified as a vector of length n, where n is the number of observations.

    Note that the x values are assumed to be non-decreasing.

    Data Types: single | double

    y — Response variable. Vector.

    Response variable which has to be smoothed, specified as a vector of length n, where n is the number of observations.

    Data Types: single | double

    Optional Arguments

    w — Weights for the observations. Vector.

    Row or column vector of length n containing the weight associated with each observation. If w is not specified, we assume $w_i=1$ for $i=1, 2, \ldots, n$.

    Example: 1:n

    Data Types: double

    span — Length of the local regressions. Scalar.

    Scalar in the interval [0, 1] which specifies the length of the local regressions as a fraction of the n observations. If span is 0 (default value), the span is chosen by cross validation: the candidate lengths of the local regressions are roughly $cvspan = n \times [0.3, 0.4, 0.5, 0.6, 0.7, 1.0]$ observations.

    The element of $cvspan$ associated with the smallest cross validation residual sum of squares is chosen. The smoothing procedure is then called again using this best value of $cvspan$, and the final smoothed values are found without cross validation.

    If span is not 0 but is a value in the interval (0, 1], the local regressions have length n*span and the smoothed values are found without cross validation.
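
The selection of the span by cross validation can be sketched as follows. This is only an illustration under stated assumptions, not the code used inside rlsmo: the data are simulated, the variable names (cvfrac, cvrss, spanChosen) are hypothetical, each leave-one-out fit is assumed to be a weighted least-squares line, and the window half-width uses the formula given for smth under Additional Details.

```matlab
% Sketch of span selection by leave-one-out cross validation.
% All names and the data are assumptions made for illustration.
rng(1);                                  % reproducible simulated data
n = 100;
x = sort(randn(n,1));                    % x must be non-decreasing
y = 2*x.^2 - 3*x + 2*randn(n,1);
w = ones(n,1);                           % unit weights
cvfrac = [0.3 0.4 0.5 0.6 0.7 1.0];      % candidate fractions of n
cvrss  = zeros(size(cvfrac));            % cross validation residual sums of squares
for k = 1:numel(cvfrac)
    m = max(floor(n*cvfrac(k)/2), 1);    % half-width of the window
    for i = 1:n
        idx = (max(1,i-m):min(n,i+m))';  % window truncated at the boundaries
        idx(idx == i) = [];              % leave unit i out of its own fit
        wi = w(idx); xi = x(idx); yi = y(idx);
        xbar = sum(wi.*xi)/sum(wi);      % weighted means in the window
        ybar = sum(wi.*yi)/sum(wi);
        sxx  = sum(wi.*(xi - xbar).^2);
        slope = 0;                       % fall back to the weighted mean if x is constant
        if sxx > 0
            slope = sum(wi.*(xi - xbar).*(yi - ybar))/sxx;
        end
        res = y(i) - (ybar + slope*(x(i) - xbar));
        cvrss(k) = cvrss(k) + w(i)*res^2;
    end
end
[~, best]  = min(cvrss);                 % smallest cross validation RSS
spanChosen = cvfrac(best);
```

In this sketch spanChosen plays the role of the second output of rlsmo; the final smoothed values would then be computed once more with that span and with unit i included in its own window.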

    Example: 0.4

    Data Types: double

    Output Arguments

    smo — Smoothed values. Vector.

    A vector with the same dimensions as y containing the smoothed values, that is, the y values on the fitted curve. The smoothed values come from local linear regressions whose length is specified by input parameter span.
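
As a reading aid, the fitted value of a weighted local linear regression can be written out explicitly. The formula below is the standard weighted least-squares line evaluated at $x_i$. The index set $J_i$ (the units in the window around unit $i$) and the symbols $\bar{x}_i$, $\bar{y}_i$ and $\hat{\beta}_i$ are notation introduced here, and it is an assumption that the internal computation of rlsmo matches this form exactly.

```latex
\bar{x}_i = \frac{\sum_{j \in J_i} w_j x_j}{\sum_{j \in J_i} w_j}, \qquad
\bar{y}_i = \frac{\sum_{j \in J_i} w_j y_j}{\sum_{j \in J_i} w_j}, \qquad
\hat{\beta}_i = \frac{\sum_{j \in J_i} w_j (x_j - \bar{x}_i)(y_j - \bar{y}_i)}
                     {\sum_{j \in J_i} w_j (x_j - \bar{x}_i)^2}, \qquad
smo_i = \bar{y}_i + \hat{\beta}_i \, (x_i - \bar{x}_i)
```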

    span — Length of the local regressions. Scalar.

    Scalar in the interval [0, 1] which specifies the length of the local regressions that were used. For example, if span=0.3, approximately 30 per cent of consecutive observations are used to compute each local regression.

    More About

    Additional Details

    This function makes use of subroutine smth.

    The syntax of $smth$ is $[smo] = smth(x,y,w,span,cross)$. $x$, $y$ and $w$ are three vectors of length $n$ containing, respectively, the $x$ coordinates, the $y$ coordinates and the weights. Input parameter $span$ is a scalar in the interval (0, 1] which defines the number of units used in each local regression.

    More precisely, if $span$ is in (0, 1), each local regression uses $2m+1$ units, where $m = \max([(n \times span)/2], 1)$, so that the minimum length of a local regression is 3. The symbol $[ \cdot ]$ denotes the integer part.
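
The window length formula can be checked with a short computation. The values of n and span below are hypothetical and chosen only for illustration; the variable names are not taken from the smth source code.

```matlab
% Worked example of the window length formula (illustrative values).
n    = 200;                        % number of observations
span = 0.3;                        % fraction of observations per local regression
m    = max(floor(n*span/2), 1);    % integer part of n*span/2, at least 1
len  = 2*m + 1;                    % units in an interior local regression
fprintf('m = %d, window length = %d\n', m, len);

% A very small span still gives the minimum window of 3 units.
mTiny   = max(floor(n*0.001/2), 1);
lenTiny = 2*mTiny + 1;
fprintf('m = %d, window length = %d\n', mTiny, lenTiny);
```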

    Parameter $cross$ is a Boolean scalar. If it is set to true, unit $i$ is deleted when computing the local regression centered on unit $i$. In the examples below, the window length is $2m+1$, so a length of 3 corresponds to $m=1$ and a length of 5 to $m=2$.

    [1] If the window length is 3 and $cross$ is true, the smoothed value for observation $i$ uses a local regression with $x$ coordinates $(x(i-1), x(i+1))$, $y$ coordinates $(y(i-1), y(i+1))$ and $w$ coordinates $(w(i-1), w(i+1))$, $i=2, \ldots, n-1$. The smoothed value for observation 1 is $y(2)$ and the smoothed value for observation $n$ is $y(n-1)$.

    [2] If the window length is 3 and $cross$ is false, the smoothed value for observation $i$ is based on a local regression with $x$ coordinates $(x(i-1), x(i), x(i+1))$, $y$ coordinates $(y(i-1), y(i), y(i+1))$ and $w$ coordinates $(w(i-1), w(i), w(i+1))$, $i=2, \ldots, n-1$. The smoothed value for observation 1 uses a local regression based on $(x(1), x(2))$, $(y(1), y(2))$ and $(w(1), w(2))$, while the smoothed value for observation $n$ uses a local regression based on $(x(n-1), x(n))$, $(y(n-1), y(n))$ and $(w(n-1), w(n))$.

    [3] If the window length is 5 and $cross$ is true, the smoothed value for observation $i$ uses a local regression based on observations $(i-2), (i-1), (i+1), (i+2)$, for $i=3, \ldots, n-2$. The smoothed value for observation 1 uses observations 2 and 3, the smoothed value for observation 2 uses observations 1, 3 and 4 ...

    [4] If the window length is 5 and $cross$ is false, the smoothed value for observation $i$ uses a local regression based on observations $(i-2), (i-1), i, (i+1), (i+2)$, for $i=3, \ldots, n-2$. The smoothed value for observation 1 uses observations 1, 2 and 3, the smoothed value for observation 2 uses observations 1, 2, 3 and 4 ...
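
The windowing rules described for smth can be illustrated with a naive loop. This is a sketch under stated assumptions, not the smth routine itself: the data are made up, each window is fitted by a weighted least-squares line, and a window that contains a single unit (or constant x) falls back to the weighted mean of y.

```matlab
% Naive running-lines sketch with an optional leave-one-out window.
% The data and variable names are assumptions made for illustration.
x = (1:7)';                       % non-decreasing abscissa values
y = [2 4 5 4 6 9 8]';
w = ones(7,1);                    % unit weights
n = numel(x);
m = 1;                            % half-width: interior windows hold 2*m+1 = 3 units
cross = true;                     % true: unit i is deleted from its own window
smo = zeros(n,1);
for i = 1:n
    idx = (max(1,i-m):min(n,i+m))';   % window truncated at the boundaries
    if cross
        idx(idx == i) = [];           % delete the central unit
    end
    wi = w(idx); xi = x(idx); yi = y(idx);
    xbar = sum(wi.*xi)/sum(wi);       % weighted means in the window
    ybar = sum(wi.*yi)/sum(wi);
    sxx  = sum(wi.*(xi - xbar).^2);   % weighted sum of squares of x
    slope = 0;                        % single unit or tied x: use the weighted mean
    if sxx > 0
        slope = sum(wi.*(xi - xbar).*(yi - ybar))/sxx;
    end
    smo(i) = ybar + slope*(x(i) - xbar);   % fitted line evaluated at x(i)
end
```

With cross set to true and a window of length 3, the first and last smoothed values reduce to y(2) and y(n-1), which agrees with the boundary behaviour stated in the text. Setting cross to false keeps unit i in its own window.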


    References

    Tibshirani, R. (1987), Estimating optimal transformations for regression, "Journal of the American Statistical Association", Vol. 83, pp. 394-405.

    Hastie, T., and Tibshirani, R. (1986), Generalized Additive Models (with discussion), "Statistical Science", Vol. 1, pp. 297-318.
