bgModel                package:cosmo                R Documentation

_E_s_t_i_m_a_t_i_n_g _t_h_e _b_a_c_k_g_r_o_u_n_d _M_a_r_k_o_v _m_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     bgModel() estimates the Markov model that cosmo() uses to model
     the distribution of nucleotides that are not part of the motif.
     If no order is specified, bgModel() selects the order of this
     model data-adaptively by likelihood-based cross-validation. In a
     k-th order Markov model, the probability of encountering each of
     the four nucleotides at a given position depends on the k
     preceding nucleotides.
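
     To make the notion of a k-th order model concrete, the following
     base R sketch counts how often each nucleotide follows each
     context of length k and normalizes the counts row by row. The
     helper estimateTransitions() is hypothetical and is not part of
     cosmo; it assumes upper-case sequences containing only A, C, G
     and T, and it is not meant to reproduce how bgModel() computes
     its estimates.

```r
## Hypothetical helper, not part of cosmo: maximum-likelihood estimate of a
## k-th order Markov transition matrix from a character vector of sequences.
## Assumes every sequence contains only the letters A, C, G and T.
estimateTransitions <- function(seqs, k) {
  nucs <- c("A", "C", "G", "T")
  ## enumerate all 4^k contexts; a 0th order model has one dummy context "-"
  if (k == 0) {
    contexts <- "-"
  } else {
    grid <- expand.grid(rep(list(nucs), k), stringsAsFactors = FALSE)
    contexts <- sort(do.call(paste0, grid))
  }
  counts <- matrix(0, nrow = length(contexts), ncol = 4,
                   dimnames = list(contexts, nucs))
  for (s in seqs) {
    n <- nchar(s)
    if (n <= k) next                      # too short to contribute
    for (i in (k + 1):n) {
      ctx <- if (k == 0) "-" else substr(s, i - k, i - 1)
      nxt <- substr(s, i, i)
      counts[ctx, nxt] <- counts[ctx, nxt] + 1
    }
  }
  ## each row becomes a probability distribution over the next nucleotide;
  ## contexts that never occur yield NaN rows (no smoothing is applied here)
  counts / rowSums(counts)
}

## one row per preceding nucleotide, one column per next nucleotide
tm1 <- estimateTransitions("ACGTACGT", k = 1)
```

     The number of rows grows as 4^k, which is one reason to bound the
     candidate orders when the order is chosen from the data.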

_U_s_a_g_e:

       bgModel(seqs, order = NULL, fold = 5, maxOrder = 6)

_A_r_g_u_m_e_n_t_s:

    seqs: This argument specifies the sequences that are to be used to
          estimate the background Markov model. If seqs == "browse", a
          browser appears that allows the user to select a file that
          contains the sequences in FASTA format. If seqs is another
          character string, it is assumed to give the path to a FASTA
          file containing the sequences of interest. Lastly, seqs may
          be a list with each element representing a sequence in the
          form of a single string such as "ACGTAGCTAG" ("seq" entry)
          and a description ("desc" entry).

   order: numeric. The order of the background Markov model. If this
          argument is NULL (the default), the order is selected
          data-adaptively by likelihood-based cross-validation.
          Otherwise, a Markov model of the specified order is
          estimated.

    fold: numeric. Number of cross-validation folds used when
          selecting the order of the background Markov model.

maxOrder: numeric. Maximum order to consider for the background
          Markov model.
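
     For the list form of seqs, the sketch below builds a small input
     by hand. The exact layout (one inner list per sequence, with
     entries named "seq" and "desc") is an assumption based on the
     description of the seqs argument, and the sequences themselves
     are made up for illustration.

```r
## Assumed layout, inferred from the description of the 'seqs' argument:
## one list element per sequence, each holding a "seq" and a "desc" entry.
seqList <- list(
  list(seq = "ACGTAGCTAG", desc = "example sequence 1"),
  list(seq = "TTGACCGTAA", desc = "example sequence 2")
)

## basic sanity checks: every entry holds a single string of A, C, G and T
isSingleString <- vapply(seqList,
                         function(x) is.character(x$seq) && length(x$seq) == 1,
                         logical(1))
isNucleotide <- vapply(seqList,
                       function(x) grepl("^[ACGT]+$", x$seq),
                       logical(1))
stopifnot(all(isSingleString), all(isNucleotide))
```

     If this layout matches what bgModel() expects, seqList could be
     passed as the seqs argument in place of a file path.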

_V_a_l_u_e:

     A list with the following elements:

transMat: The estimated transition matrices for the background Markov
          model. This is a list of matrices, with the first matrix
          giving the transition probabilities for the 0th order Markov
          model, the second matrix giving the transition probabilities
          for a 1st order Markov model, and so on.

   order: The selected order of the background Markov model.

  klDivs: The Kullback-Leibler divergences for the different candidate
          orders for the background Markov model. Likelihood-based
          cross-validation selects the order with the minimum
          Kullback-Leibler divergence.
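
     The selection rule can be illustrated with a base R sketch of
     V-fold cross-validation. It is not cosmo's implementation: the
     helpers allContexts(), transitions() and cvOrder() are
     hypothetical, add-one smoothing is an assumption made to avoid
     taking the logarithm of zero, and the held-out negative
     log-likelihood per nucleotide stands in for the Kullback-Leibler
     divergence. The two criteria differ by a term that does not
     depend on the candidate order, so they should favor the same
     order.

```r
## Hypothetical sketch, not cosmo's implementation: pick a Markov order by
## V-fold cross-validation on the held-out negative log-likelihood.
nucs <- c("A", "C", "G", "T")

## all 4^k contexts of length k (a single empty context when k = 0)
allContexts <- function(k) {
  if (k == 0) return("")
  do.call(paste0, expand.grid(rep(list(nucs), k), stringsAsFactors = FALSE))
}

## data frame of (context, next nucleotide) pairs found in 'seqs'
transitions <- function(seqs, k) {
  do.call(rbind, lapply(seqs, function(s) {
    n <- nchar(s)
    if (n <= k) return(NULL)
    pos <- (k + 1):n
    data.frame(ctx = substring(s, pos - k, pos - 1),
               nxt = substring(s, pos, pos),
               stringsAsFactors = FALSE)
  }))
}

cvOrder <- function(seqs, maxOrder = 2, fold = 5) {
  foldId <- rep_len(seq_len(fold), length(seqs))  # deterministic fold labels
  risk <- sapply(0:maxOrder, function(k) {
    ctxs <- allContexts(k)
    nll <- 0
    nObs <- 0
    for (v in seq_len(fold)) {
      train <- transitions(seqs[foldId != v], k)
      test <- transitions(seqs[foldId == v], k)
      if (is.null(test)) next
      counts <- unclass(table(factor(train$ctx, levels = ctxs),
                              factor(train$nxt, levels = nucs)))
      probs <- (counts + 1) / rowSums(counts + 1)  # add-one smoothing
      ## look up the fitted probability of every held-out transition
      idx <- cbind(match(test$ctx, ctxs), match(test$nxt, nucs))
      nll <- nll - sum(log(probs[idx]))
      nObs <- nObs + nrow(test)
    }
    nll / nObs  # average held-out negative log-likelihood per nucleotide
  })
  names(risk) <- 0:maxOrder
  list(order = as.integer(names(which.min(risk))), risk = risk)
}

## toy data with strong first-order structure: "ACGT" repeated
seqs <- rep(strrep("ACGT", 10), 10)
res <- cvOrder(seqs, maxOrder = 2, fold = 5)
```

     In this sketch res$risk plays the role of klDivs and res$order
     the role of order. Folds are assigned to whole sequences, so no
     sequence contributes to both the training and the validation
     counts within a fold.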

_A_u_t_h_o_r(_s):

     Oliver Bembom, bembom@berkeley.edu

_S_e_e _A_l_s_o:

     'cosmo'

_E_x_a_m_p_l_e_s:

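     ## path to example sequence file in FASTA format
     seqFile <- system.file("Exfiles", "seq.fasta", package = "cosmo")

     ## estimate the background model for a fixed order of 2
     tmat1 <- bgModel(seqFile, order = 2)

     ## select the order data-adaptively by cross-validation
     tmat2 <- bgModel(seqFile)

     ## the result is a list: inspect the selected order and the
     ## Kullback-Leibler divergences of the candidate orders
     tmat2$order
     tmat2$klDivs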

