cphidden              package:repeated              R Documentation

_C_h_a_n_g_e_p_o_i_n_t _L_o_c_a_t_i_o_n _u_s_i_n_g _a _C_o_n_t_i_n_u_o_u_s-_t_i_m_e _T_w_o-_s_t_a_t_e _H_i_d_d_e_n _M_a_r_k_o_v _C_h_a_i_n

_D_e_s_c_r_i_p_t_i_o_n:

     'cphidden' fits a two-state hidden Markov chain model with a
     variety of distributions in continuous time in order to locate a
     changepoint in the chosen distribution. All series on different
     individuals are assumed to start at the same time point.

     For quantitative responses, specifying 'par' allows an 'observed'
     autoregression to be fitted as well as the hidden Markov chain.

     All functions and formulae for the location parameter are on the
     (generalized) logit scale for the Bernoulli, binomial, and
     multinomial distributions.
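
     As a minimal sketch of what the logit scale means for initial
     estimates, guessed state probabilities can be converted with base
     R's 'qlogis' and checked with 'plogis'. The probabilities 0.1 and
     0.9 are illustrative assumptions; they map to roughly -2.2 and 2.2.

```r
# Hypothetical guesses at the success probability in each hidden state
p_state <- c(0.1, 0.9)

# qlogis() is the logit, log(p / (1 - p)); plogis() is its inverse
init <- qlogis(p_state)
print(round(init, 2))

# Going the other way: logit-scale values back to probabilities
print(plogis(c(-2, 2)))
```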

     If 'cmu' and 'tvmu' are used, these two mean functions are
     additive so that interactions between time-constant and
     time-varying variables are not possible.

     The algorithm will run more quickly if the most frequently
     occurring time step is scaled to be equal to unity.
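
     One possible way to apply this rescaling, sketched in base R with
     made-up observation times: find the most frequent step and divide
     all times by it. The variable names are hypothetical, and any
     intensity estimated afterwards refers to the rescaled time unit.

```r
# Hypothetical observation times whose most frequent step is 0.5
times <- c(0, 0.5, 1, 1.5, 2.5, 3, 3.5)
steps <- diff(times)

# Tabulate the (rounded) steps and take the most frequent one
tab <- table(round(steps, 8))
modal_step <- as.numeric(names(tab)[which.max(tab)])

# Rescale so that the most frequent step equals one
scaled_times <- times / modal_step
print(scaled_times)
```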

     The object returned can be plotted to give the probabilities of
     being in each hidden state at each time point. See 'hidden' for
     details. For distributions other than the multinomial,
     proportional odds, and continuation ratio, the (recursive)
     predicted values can be plotted using 'mprofile' and 'iprofile'.

_U_s_a_g_e:

     cphidden(response=NULL, totals=NULL, times=NULL, distribution="Bernoulli",
             mu=NULL, cmu=NULL, tvmu=NULL, pgamma, pmu=NULL, pcmu=NULL, ptvmu=NULL,
             pshape=NULL, pfamily=NULL, par=NULL, pintercept=NULL, delta=NULL,
             envir=parent.frame(), print.level=0, ndigit=10,
             gradtol=0.00001, steptol=0.00001, fscale=1, iterlim=100, typsiz=abs(p),
             stepmax=10*sqrt(p%*%p))

_A_r_g_u_m_e_n_t_s:

response: One of: a list of two- or three-column matrices, one per
          individual, holding counts or category indicators, times,
          and possibly totals (if the distribution is binomial); a
          single matrix or data frame of counts; or an object of class
          'response' (created by 'restovec') or 'repeated' (created by
          'rmna' or 'lvna'). If the 'repeated' data object contains
          more than one response variable, give that object in 'envir'
          and give the name of the response variable to be used here.
          If there is only one series, a vector of responses may be
          supplied instead.

          Multinomial and ordinal categories must be integers numbered
          from 0.

  totals: If 'response' is a matrix and the distribution is binomial,
          a corresponding matrix of totals. Ignored if 'response' has
          class 'response' or 'repeated'.

   times: If 'response' is a matrix, a vector of corresponding times,
          when they are the same for all individuals. Ignored if
          'response' has class 'response' or 'repeated'.

distribution: Bernoulli, Poisson, multinomial, proportional odds,
          continuation ratio, binomial, exponential, beta binomial,
          negative binomial, normal, inverse Gauss, logistic, gamma,
          Weibull, Cauchy, Laplace, Levy, Pareto, gen(eralized) gamma,
          gen(eralized) logistic, Hjorth, Burr, gen(eralized) Weibull,
          gen(eralized) extreme value, gen(eralized) inverse Gauss,
          power exponential, skew Laplace, or Student t. (For
          definitions of distributions, see the corresponding
          [dpqr]distribution help.)

      mu: A general location function with two possibilities: (1) a
          list of formulae (with parameters having different names) or
          functions (with one parameter vector numbering for all of
          them) each returning one value per observation; or (2) a
          single formula or function which will be used for all states
          (and all categories if multinomial) but with different
          parameter values in each, so that 'pmu' must be a vector
          whose length is the number of unknowns in the function or
          formula times the number of states (times the number of
          categories minus one if multinomial).
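
          The length rule in possibility (2) can be checked with a
          little arithmetic. The helper below is purely illustrative
          (the name 'expected_length' and its arguments are not part
          of the package); it assumes the two states of this model.

```r
# Hypothetical helper: number of initial estimates needed when a single
# formula or function is shared by all states.
#   n_unknowns   - unknown parameters in the formula or function
#   n_categories - number of categories (multinomial only), else NULL
expected_length <- function(n_unknowns, n_categories = NULL) {
  n_states <- 2  # cphidden fits a two-state chain
  len <- n_unknowns * n_states
  if (!is.null(n_categories)) len <- len * (n_categories - 1)
  len
}

# Intercept-only formula (~1), Bernoulli: two values, e.g. c(-2, 2)
print(expected_length(1))

# Intercept-only formula, multinomial with three categories
print(expected_length(1, n_categories = 3))
```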

     cmu: A time-constant location function with three possibilities:
          (1) a list of formulae (with parameters having different
          names) or functions (with one parameter vector numbering for
          all of them) each returning one value per individual; (2) a
          single formula or function which will be used for all states
          (and all categories if multinomial) but with different
          parameter values in each, so that 'pcmu' must be a vector
          whose length is the number of unknowns in the function or
          formula times the number of states (times the number of
          categories minus one if multinomial); or (3) a function
          returning an array with one row for each individual, one
          column for each state of the hidden Markov chain, and, if
          multinomial, one layer for each category but the last. If
          used, this function or formula should contain the intercept.
          Ignored if 'mu' is supplied.
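
          A sketch of possibility (3) for a non-multinomial response,
          assuming three individuals and one intercept per state. The
          names 'n_ind' and 'cmu_fn' are hypothetical; only the shape
          of the returned array (individuals by states) follows the
          description above.

```r
n_ind <- 3  # hypothetical number of individuals

# p[1] and p[2] are the intercepts for hidden states 1 and 2;
# every individual gets the same value within a state
cmu_fn <- function(p) array(rep(p[1:2], each = n_ind), c(n_ind, 2))

m <- cmu_fn(c(-2, 2))
print(m)  # one row per individual, one column per state
```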

    tvmu: A time-varying location function with three possibilities:
          (1) a list of formulae (with parameters having different
          names) or functions (with one parameter vector numbering for
          all of them) each returning one value per time point; (2) a
          single formula or function which will be used for all states
          (and all categories if multinomial) but with different
          parameter values in each, so that 'ptvmu' must be a vector
          whose length is the number of unknowns in the function or
          formula times the number of states (times the number of
          categories minus one if multinomial); or (3) a function
          returning an array with one row for each time point, one
          column for each state of the hidden Markov chain, and, if
          multinomial, one layer for each category but the last. This
          function or formula is usually a function of time; it is the
          same for all individuals. It contains the intercept only if
          'cmu' does not. Ignored if 'mu' is supplied.

  pgamma: An initial estimate of the transition intensity between the
          two states in the continuous-time hidden Markov chain.
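
          One rough way to choose a starting value, sketched under an
          assumption this page does not state explicitly: that the
          change is a one-way move from the first state to the second
          at constant intensity. The waiting time until the change is
          then exponential with mean 1/gamma, so a guessed change time
          suggests an intensity. All names below are hypothetical.

```r
# Hypothetical guess at when the change occurs (in the units of 'times')
guess_changepoint <- 10

# Under the one-way, constant-intensity assumption the mean waiting
# time is 1/gamma, so invert the guess
pgamma_init <- 1 / guess_changepoint
print(pgamma_init)

# Probability of having changed state by time t under that assumption
p_changed <- function(t, gamma) 1 - exp(-gamma * t)
print(p_changed(c(5, 10, 20), pgamma_init))
```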

     pmu: Initial estimates of the unknown parameters in 'mu'.

    pcmu: Initial estimates of the unknown parameters in 'cmu'.

   ptvmu: Initial estimates of the unknown parameters in 'tvmu'.

  pshape: Initial estimate(s) of the dispersion parameter, for those
          distributions having one. This can be one value or a vector
          with a different value for each state.

 pfamily: Initial estimate of the family parameter, for those
          distributions having one.

     par: Initial estimate of the autoregression parameter.

pintercept: For multinomial, proportional odds, and continuation ratio
          models, 'p-2' initial estimates for intercept contrasts from
          the first intercept, where 'p' is the number of categories.

   delta: Scalar or vector giving the unit of measurement (always one
          for discrete data) for each response value, set to unity by
          default. For example, if a response is measured to two
          decimals, 'delta=0.01'. If the response is transformed, this
          must be multiplied by the Jacobian. For example, with a log
          transformation, 'delta=1/response'. Ignored if 'response'
          has class 'response' or 'repeated'.
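
          The two examples above can be combined. The sketch below,
          with made-up response values, computes 'delta' for a
          response recorded to two decimals and then modelled on the
          log scale; the variable names are illustrative only.

```r
# Hypothetical responses recorded to two decimals
y <- c(1.25, 2.50, 4.00)

# Unit of measurement on the original scale
delta_raw <- rep(0.01, length(y))

# Modelling log(y): multiply by the Jacobian of the log transformation, 1/y
delta_log <- delta_raw * (1 / y)
print(delta_log)
```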

   envir: Environment in which model formulae are to be interpreted,
          or a data object of class 'repeated', 'tccov', or 'tvcov';
          in the latter case, the name of the response variable should
          be given in 'response'. If 'response' has class 'repeated',
          it is used as the environment.

  others: Arguments controlling 'nlm' ('print.level', 'ndigit',
          'gradtol', 'steptol', 'fscale', 'iterlim', 'typsiz', and
          'stepmax').

_V_a_l_u_e:

     A list of classes 'hidden' and 'recursive' (unless multinomial,
     proportional odds, or continuation ratio) is returned that
     contains all of the relevant information calculated, including
     error codes.

_A_u_t_h_o_r(_s):

     J.K. Lindsey

_S_e_e _A_l_s_o:

     'chidden', 'gar', 'gnlmm', 'hidden', 'iprofile', 'kalcount',
     'mexp', 'mprofile', 'nbkal', 'read.list', 'restovec', 'rmna'.

_E_x_a_m_p_l_e_s:

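     # model for one randomly-generated binary series:
     # low success probability for the first 10 points, high for the last 10
     y <- c(rbinom(10, 1, 0.1), rbinom(10, 1, 0.9))
     # cmu as a function returning a 1 x 2 array (one individual, two states)
     mu <- function(p) array(p, c(1, 2))
     print(z <- cphidden(y, times=1:20, dist="Bernoulli",
             pgamma=0.1, cmu=mu, pcmu=c(-2,2)))
     # or equivalently, as a formula (with a different starting value for pgamma)
     print(z <- cphidden(y, times=1:20, dist="Bernoulli",
             pgamma=0.2, cmu=~1, pcmu=c(-2,2)))
     # or, using mu with one value per observation
     print(z <- cphidden(y, times=1:20, dist="Bernoulli",
             pgamma=0.2, mu=~rep(a,20), pmu=c(-2,2)))
     # matrix exponential of the fitted z$gamma
     mexp(z$gamma)
     par(mfcol=c(2,2))
     # probabilities of being in each hidden state at each time point
     plot(z)
     # recursive predicted values
     plot(iprofile(z), lty=2)
     # same series with the time step doubled
     print(z <- cphidden(y, times=(1:20)*2, dist="Bernoulli",
             pgamma=0.1, cmu=~1, pcmu=c(-2,2)))
     # each step now spans two time units, hence the product
     mexp(z$gamma) %*% mexp(z$gamma)
     plot(z)
     plot(iprofile(z), lty=2)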

