BSgenomeForge            package:BSgenome            R Documentation

_T_h_e _B_S_g_e_n_o_m_e_F_o_r_g_e _f_u_n_c_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     A set of functions for making a BSgenome data package.

_U_s_a_g_e:

       ## Top-level BSgenomeForge function:

       forgeBSgenomeDataPkg(x, seqs_srcdir=".", masks_srcdir=".", destdir=".", verbose=TRUE)

       ## Low-level BSgenomeForge functions:

       forgeSeqlengthsFile(seqnames, prefix="", suffix=".fa",
                           seqs_srcdir=".", seqs_destdir=".", verbose=TRUE)

       forgeSeqFiles(seqnames, mseqnames=NULL, prefix="", suffix=".fa",
                     seqs_srcdir=".", seqs_destdir=".", verbose=TRUE)

       forgeMasksFiles(seqnames, nmask_per_seq,
                       seqs_destdir=".", masks_srcdir=".", masks_destdir=".",
                       AGAPSfiles_type="gap", AGAPSfiles_name=NA,
                       AGAPSfiles_prefix="", AGAPSfiles_suffix="_gap.txt",
                       RMfiles_name=NA, RMfiles_prefix="", RMfiles_suffix=".fa.out",
                       TRFfiles_name=NA, TRFfiles_prefix="", TRFfiles_suffix=".bed",
                       verbose=TRUE)

_A_r_g_u_m_e_n_t_s:

       x: A BSgenomeDataPkgSeed object or the name of a BSgenome data
          package seed file. See the BSgenomeForge vignette in this
          package for more information. 

seqs_srcdir, masks_srcdir: Single strings giving the paths to the
          source directories, i.e. the directories containing the
          source data files. Only read access to these directories is
          needed. See the BSgenomeForge vignette in this package for
          more information. 

 destdir: A single string indicating the path to the directory where
          the source tree of the target package should be created. This
          directory must already exist. See the BSgenomeForge vignette
          in this package for more information. 

 verbose: 'TRUE' or 'FALSE'. Whether to print progress messages. 

seqnames, mseqnames: Character vectors containing the names of the
          single sequences (for 'seqnames') and of the multiple
          sequences (for 'mseqnames') to forge. See the BSgenomeForge
          vignette in this package for more information. 

prefix, suffix: See the BSgenomeForge vignette in this package for more
          information, in particular the description of the
          'seqfiles_prefix' and 'seqfiles_suffix' fields of a BSgenome
          data package seed file. 

seqs_destdir, masks_destdir: During the forging process the source data
          files are converted into serialized Biostrings objects.
          'seqs_destdir' and 'masks_destdir' must be single strings
          giving the paths to the directories where these serialized
          objects should be saved. These directories must already
          exist.

          'forgeSeqlengthsFile' will produce a single .rda file. Both
          'forgeSeqFiles' and 'forgeMasksFiles' will produce one .rda
          file per sequence. 

nmask_per_seq: A single integer indicating the desired number of masks
          per sequence. See the BSgenomeForge vignette in this package
          for more information. 

AGAPSfiles_type, AGAPSfiles_name, AGAPSfiles_prefix, AGAPSfiles_suffix,
RMfiles_name, RMfiles_prefix, RMfiles_suffix,
TRFfiles_name, TRFfiles_prefix, TRFfiles_suffix: 
          These arguments are named after the corresponding fields of a
          BSgenome data package seed file. See the BSgenomeForge
          vignette in this package for more information. 

_D_e_t_a_i_l_s:

     These functions are intended for Bioconductor users who want to
     make a new BSgenome data package, not for regular users of these
     packages. See the BSgenomeForge vignette in this package
     ('vignette("BSgenomeForge")') for extensive coverage of this
     topic.

_A_u_t_h_o_r(_s):

     H. Pages

_E_x_a_m_p_l_e_s:

       forgeSeqFiles("chrM", prefix="ce2", suffix=".fa",
                     seqs_srcdir=system.file("extdata", package="BSgenome"),
                     seqs_destdir=tempdir())
       load(file.path(tempdir(), "chrM.rda"))
       chrM
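
       ## A minimal sketch, not part of the original example. It assumes
       ## (based on the description of the 'seqfiles_prefix' and
       ## 'seqfiles_suffix' seed file fields) that the source file name
       ## is formed as prefix + sequence name + suffix. 'src_file' is an
       ## illustrative variable name, not something the package defines.
       src_file <- file.path(system.file("extdata", package="BSgenome"),
                             paste0("ce2", "chrM", ".fa"))
       file.exists(src_file)

       ## List the serialized objects (.rda files) currently present in
       ## the destination directory used above.
       list.files(tempdir(), pattern="\\.rda$")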

