qsreg                 package:fields                 R Documentation

_Q_u_a_n_t_i_l_e _o_r _R_o_b_u_s_t _s_p_l_i_n_e _r_e_g_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Uses a penalized likelihood approach to estimate the conditional
     quantile function for regression data. This method is only
     implemented for univariate data. For the pairs (X,Y) the
     conditional quantile, f(x), satisfies P( Y < f(x) | X = x ) = alpha.
     This estimate is useful for determining the envelope of a
     scatterplot or assessing departures from a constant variance with
     respect to the independent variable.
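
     As a minimal sketch of the quantile criterion behind this
     definition (not code from the fields package), the check loss
     below is minimized by the alpha quantile when f is a constant.
     The names check_loss, obj, c_hat and q_hat are hypothetical and
     the data are simulated for illustration only.

```r
# Check (pinball) loss: alpha * r for r >= 0 and (alpha - 1) * r for r < 0.
# Hypothetical helper for illustration; not part of fields.
check_loss <- function(r, alpha) {
  ifelse(r >= 0, alpha * r, (alpha - 1) * r)
}

set.seed(1)
y <- rnorm(501)      # simulated responses, no covariate
alpha <- 0.1

# Total check loss of a constant fit c, minimized over the data range.
obj <- function(c) sum(check_loss(y - c, alpha))
c_hat <- optimize(obj, interval = range(y))$minimum

# The minimizer should sit close to the sample alpha quantile.
q_hat <- unname(quantile(y, alpha))
print(c(c_hat = c_hat, q_hat = q_hat))
```

     A quantile spline replaces the constant by a smooth function of x
     and adds a roughness penalty weighted by lambda.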

_U_s_a_g_e:

     qsreg(x, y, lam = NA, maxit = 50, maxit.cv = 10, tol =
                      1e-07, offset = 0, sc = sqrt(var(y)) * 1e-05, alpha =
                      0.5, wt = rep(1, length(x)), cost = 1, nstep.cv = 80,
                      hmin = NA, hmax = NA, trmin = 2 * 1.05, trmax = 0.95
                      * length(unique(x)))

_A_r_g_u_m_e_n_t_s:

       x: Vector of the independent variable in  y = f(x) + e

       y: Vector of the dependent variable

     lam: Values of the smoothing parameter. If omitted, lam is found
          by GCV based on the quantile criterion.

   maxit: Maximum number of iterations used to estimate each quantile
          spline.  

maxit.cv: Maximum number of iterations to find GCV minimum.  

     tol: Tolerance for convergence when computing quantile spline.  

    cost: Cost value used in the GCV criterion. cost = 1 is the usual
          GCV denominator.

  offset: Constant added to the effective degrees of freedom in the GCV
          function.   

      sc: Scale factor for rounding out the absolute value function at
          zero to a quadratic. The default is a small scale, 1e-5
          times the standard deviation of the Y's, to produce
          something more like quantiles. Scales on the order of the
          residuals will result in a robust regression fit using the
          Huber weight function. The larger this value, the better
          behaved the problem is numerically and the fewer iterations
          are required for convergence at each new value of lambda.

   alpha: Quantile to be estimated. The default is 0.5, the median.

      wt: Weight vector. The default is constant weights. Passing
          nonconstant weights is a pretty strange thing to do.

nstep.cv: Number of points used in CV grid search  

    hmin: Minimum value of log( lambda) used for GCV grid search.  

    hmax: Maximum value of log( lambda) used for GCV grid search.  

   trmin: Minimum value of effective degrees of freedom in model used 
          for specifying the range of lambda in the GCV grid search.  

   trmax: Maximum value of effective degrees of freedom in model used 
          for specifying the range of lambda in the GCV grid search.  
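
     To illustrate what the sc argument controls, here is a minimal
     sketch of an asymmetric absolute value function that is rounded
     to a quadratic within sc of zero. The function rho_rounded and
     its exact scaling are assumptions made for this illustration and
     are not taken from the fields source.

```r
# Hypothetical rounded check loss: quadratic for |r| <= sc, linear beyond.
# The two pieces agree at |r| = sc, so the function is continuous.
rho_rounded <- function(r, alpha = 0.5, sc = 1) {
  a <- ifelse(r < 0, 1 - alpha, alpha)      # asymmetric slope
  ifelse(abs(r) <= sc,
         a * r^2 / (2 * sc),                # quadratic near zero
         a * (abs(r) - sc / 2))             # linear in the tails
}

r <- seq(-3, 3, length.out = 7)

# With a tiny sc the loss is essentially alpha * |r| or (1 - alpha) * |r|,
# which targets a quantile.
print(rho_rounded(r, alpha = 0.1, sc = 1e-8))

# With sc on the order of the residuals the loss is Huber-like,
# which gives a robust fit rather than a strict quantile.
print(rho_rounded(r, alpha = 0.5, sc = 1))
```

     Larger sc values make the loss smoother near zero, which is
     consistent with the remark above that the problem is then better
     behaved numerically.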

_D_e_t_a_i_l_s:

     This is an experimental function to find the smoothing parameter
     for a quantile or robust spline using a more appropriate
     criterion than mean squared prediction error. The quantile
     spline is found by an iterative algorithm using weighted least
     squares cubic splines. At convergence the estimate will also be a
     weighted natural cubic spline, but the weights will depend on the
     estimate. Of course these weights are crafted so that the
     resulting spline is an estimate of the alpha quantile instead of
     the mean. Alternatively, at convergence the estimate will be a
     least squares spline applied to the empirical pseudo data. The
     user is referred to the paper by Oh and Nychka (2002) for the
     details and properties of the robust cross-validation using
     empirical pseudo data. CV as a function of lambda can be strange,
     so it should be plotted.
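
     The iteration described above can be sketched with base R as
     follows. This is an illustration only: it substitutes
     stats::smooth.spline for the package's own spline code, uses a
     fixed lambda rather than a CV search, and the function
     irls_quantile_spline with its default values is hypothetical.
     Note that smooth.spline rescales weights and uses its own lambda
     scale, so values are not comparable with the lam argument of
     qsreg.

```r
# Iteratively reweighted least squares for a quantile smoothing spline.
# Hypothetical sketch; not the algorithm as implemented in fields.
irls_quantile_spline <- function(x, y, alpha = 0.5, lambda = 1e-4,
                                 sc = 0.1 * sd(y), maxit = 50, tol = 1e-6) {
  w <- rep(1, length(y))            # start from an ordinary spline fit
  f_old <- rep(Inf, length(y))
  for (it in seq_len(maxit)) {
    fit <- smooth.spline(x, y, w = w, lambda = lambda)
    f <- predict(fit, x)$y
    if (max(abs(f - f_old)) < tol) break
    r <- y - f
    # Weights depend on the current estimate: asymmetric in the sign of
    # the residual and inversely proportional to its size, capped via sc.
    w <- ifelse(r < 0, 1 - alpha, alpha) / pmax(abs(r), sc)
    f_old <- f
  }
  list(fit = fit, fitted = f, iterations = it)
}

set.seed(2)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)   # simulated data

out <- irls_quantile_spline(x, y, alpha = 0.5)
below <- mean(y < out$fitted)   # fraction of points below the fitted curve
print(c(iterations = out$iterations, below = below))
```

     For alpha = 0.5 roughly half of the observations should fall
     below the fitted curve; other values of alpha shift the curve
     toward the corresponding quantile.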

_V_a_l_u_e:

     Object of class qsreg with many components similar to a sreg
     object. One difference is that cv.grid has five columns, the last
     being the number of iterations for convergence at each value of
     lambda.

     The arguments trmin and trmax define the minimum and maximum
     values for the CV grid search in terms of the effective number of
     parameters (see hmin, hmax).

_S_e_e _A_l_s_o:

     'sreg'

_E_x_a_m_p_l_e_s:

          # fit a CV quantile spline to the control group of rats
          # (default alpha is .5 so this is an estimate of the conditional median)
          fit50<- qsreg(rat.diet$t,rat.diet$con)
          plot( fit50)
          # predicted values at data points
          predict( fit50)
          # predicted values on a grid of 50 equally spaced points
          xg<- seq(0, 110, length.out=50)
          plot( fit50$x, fit50$y)
          lines( xg, predict( fit50, xg))

          # A robust fit to rat diet data
          # 
          SC<- .5* median(abs((rat.diet$con- median(rat.diet$con))))
          fit.robust<- qsreg(rat.diet$t,rat.diet$con, sc= SC)
          plot( fit.robust)

          # The global GCV function suggests little smoothing so
          # try the local minimum with the largest lambda instead of
          # this default value.
          # One should consider redoing the three quantile fits in this
          # example after looking at the cv functions and choosing a good
          # value for lambda,
          # for example
          lam<- fit50$cv.grid[,1]
          tr<- fit50$cv.grid[,2]
          # lambda close to df=6
          lambda.good<- max(lam[tr>=6])
          fit50.subjective<-qsreg(rat.diet$t,rat.diet$con, lam= lambda.good)
          fit10<-qsreg(rat.diet$t,rat.diet$con, alpha=.1, nstep.cv=200)
          fit90<-qsreg(rat.diet$t,rat.diet$con, alpha=.9, nstep.cv=200)
          # spline fits at 50 equally spaced points
          sm<- cbind(
          predict( fit10, xg),
          predict( fit50.subjective, xg),predict( fit50, xg),
          predict( fit90, xg))
      
          # and now zee data ...
          plot( fit50$x, fit50$y)
          # and now zee quantile splines at 10%, 50% (subjective lambda
          # and GCV lambda) and 90%
          matlines( xg, sm, col=c( 3,3,2,3), lty=1) # the splines
       

