rseq                  package:cosmo                  R Documentation

_R_a_n_d_o_m _g_e_n_e_r_a_t_i_o_n _o_f _D_N_A _s_e_q_u_e_n_c_e _a_c_c_o_r_d_i_n_g _t_o _Z_O_O_P_S _o_r _T_C_M _m_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     This function randomly generates a number of DNA sequences that
     contain a given motif according to the ZOOPS or TCM model. In the
     ZOOPS model, each sequence contains one or zero occurrences of the
     motif. In the TCM model, each sequence may contain an arbitrary
     number of motif occurrences.
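
     The difference between the two models can be illustrated with a
     small base-R sketch that does not call cosmo. It only draws the
     number of motif occurrences per sequence. The ZOOPS draw follows
     the description above. The TCM draw assumes, purely for
     illustration, that every admissible start position independently
     receives a motif with probability lambda; rseq() may use a
     different scheme internally. The names nSeqs, seqLen, motifWidth
     and lambda are hypothetical.

```r
set.seed(1)
nSeqs      <- 20     # hypothetical number of sequences
seqLen     <- 250    # hypothetical common sequence length
motifWidth <- 8      # hypothetical motif width
rate       <- 0.5    # ZOOPS: proportion of sequences with a motif
lambda     <- 0.01   # TCM: assumed per-position insertion probability

# ZOOPS: every sequence carries zero or one occurrence
zoopsCounts <- rbinom(nSeqs, size = 1, prob = rate)

# TCM (assumed simplification): any number of occurrences per sequence;
# overlaps between occurrences are ignored in this sketch
nStarts   <- seqLen - motifWidth + 1
tcmCounts <- rbinom(nSeqs, size = nStarts, prob = lambda)

table(zoopsCounts)
table(tcmCounts)
```

     Under ZOOPS the counts can only be 0 or 1, whereas under the
     assumed TCM scheme a sequence can carry several occurrences.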

_U_s_a_g_e:

     rseq(numSeqs, seqLength, rate, pwm, transMats,
          model="ZOOPS", posOnly=FALSE)

_A_r_g_u_m_e_n_t_s:

 numSeqs: 'numeric' The number of sequences to be generated.

seqLength: 'numeric' The length of each sequence. This may be either a
          single number, in which case that number is taken to be the
          common length of all sequences, or a vector of sequence
          lengths.

    rate: 'numeric' In the ZOOPS model, this is the proportion of
          sequences containing a motif occurrence. In the TCM model, this
          is the rate parameter lambda with which motifs are inserted
          into the sequences.

     pwm: 'numeric' Position-weight matrix of the motif to be inserted.

transMats: The transition matrices to use for the background Markov
          model. This is a list of matrices, with the first matrix
          giving the transition probabilities for the 0th order Markov
          model, the second matrix giving the transition probabilities
          for the 1st order Markov model, and so on.

   model: 'character' Either "ZOOPS" (the default) or "TCM".

 posOnly: 'logical' If TRUE, motifs are inserted only in the forward
          orientation. Otherwise (the default), motifs are inserted in
          either of the two possible orientations with equal probability.
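
     The roles of 'transMats' and 'posOnly' can be sketched in base R
     as follows. This is not the cosmo implementation. The layout of
     the toy transition matrices (a 1 x 4 matrix for order 0, a 4 x 4
     matrix with rows indexed by the previous nucleotide for order 1)
     is an assumption, as is the choice to overwrite background
     positions with the motif. The helpers simBackground, revComp and
     insertMotif are hypothetical names.

```r
set.seed(2)
nuc <- c("A", "C", "G", "T")

# Assumed layout: element k of the list holds the order-(k-1) transitions
toyTransMats <- list(
  matrix(rep(0.25, 4), nrow = 1, dimnames = list(NULL, nuc)),
  matrix(c(0.4, 0.2, 0.2, 0.2,
           0.2, 0.4, 0.2, 0.2,
           0.2, 0.2, 0.4, 0.2,
           0.2, 0.2, 0.2, 0.4),
         nrow = 4, byrow = TRUE, dimnames = list(nuc, nuc))
)

# First-order Markov background of length n (n >= 1):
# the first letter uses the order-0 probabilities,
# each later letter depends on the one before it
simBackground <- function(n, mats) {
  s <- character(n)
  s[1] <- sample(nuc, 1, prob = mats[[1]][1, ])
  for (i in seq_len(n - 1)) {
    s[i + 1] <- sample(nuc, 1, prob = mats[[2]][s[i], ])
  }
  s
}

# Reverse complement of a vector of single nucleotides
revComp <- function(x) rev(chartr("ACGT", "TGCA", x))

# Overwrite positions pos, pos + 1, ... with the motif; unless posOnly
# is TRUE, the reverse complement is used with probability 1/2
insertMotif <- function(s, motif, pos, posOnly = FALSE) {
  if (!posOnly && runif(1) < 0.5) motif <- revComp(motif)
  s[pos:(pos + length(motif) - 1)] <- motif
  s
}

bg  <- simBackground(30, toyTransMats)
out <- insertMotif(bg, c("T", "A", "T", "A", "A", "A"), pos = 10,
                   posOnly = TRUE)
paste(out, collapse = "")
```

     With posOnly = TRUE the motif always appears as given; with
     posOnly = FALSE roughly half of the insertions would show its
     reverse complement instead.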

_V_a_l_u_e:

    seqs: A list with one element for each generated sequence. Each
          element has two parts: a description and a character string
          holding the biological sequence.

  motifs: An "align" object summarizing the positions of the inserted
          motif occurrences.

  empPWM: An object of class 'pwm' representing the position weight
          matrix obtained by aligning the inserted motifs.
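
     How an empirical position weight matrix arises from aligned motif
     occurrences can be sketched in base R. The occurrences below are
     made up for illustration, and the orientation of the result (rows
     for nucleotides, columns for motif positions) is an assumption
     that may not match the layout of the 'pwm' class in cosmo.

```r
nuc <- c("A", "C", "G", "T")

# Hypothetical aligned motif occurrences, all in forward orientation
occ <- c("TATAAA", "TATATA", "TACAAA", "TATAAA")

# One row per occurrence, one column per motif position
chars <- do.call(rbind, strsplit(occ, ""))

# Relative nucleotide frequencies at each motif position
toyPWM <- vapply(
  seq_len(ncol(chars)),
  function(j) as.numeric(table(factor(chars[, j], levels = nuc))) / nrow(chars),
  numeric(length(nuc))
)
rownames(toyPWM) <- nuc

toyPWM
```

     Each column of toyPWM sums to one, since it is the distribution
     of nucleotides observed at that motif position.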

_A_u_t_h_o_r(_s):

     Oliver Bembom, bembom@berkeley.edu

_E_x_a_m_p_l_e_s:

     ## generate 20 sequences according to ZOOPS model
     ## with an expected number of 10 sequences containing a
     ## motif

     data(motifPWM)
     data(transMats)
     res <- rseq(20, 250, 0.5, motifPWM, transMats, "ZOOPS")

