Mmusculus {BSgenome.Mmusculus.UCSC.mm8}    R Documentation

Mus musculus full genome (UCSC version mm8)

Description

Mus musculus full genome as provided by UCSC (mm8, Feb. 2006) and stored in Biostrings objects.

Note

This BSgenome data package was made from the following source data files:

sequences: chromFa.tar.gz, upstream1000.fa.gz, upstream2000.fa.gz, upstream5000.fa.gz from http://hgdownload.cse.ucsc.edu/goldenPath/mm8/bigZips/
AGAPS masks (assembly gaps): all the chr*_gap.txt.gz files from ftp://hgdownload.cse.ucsc.edu/goldenPath/mm8/database/
RM masks (RepeatMasker): http://hgdownload.cse.ucsc.edu/goldenPath/mm8/bigZips/chromOut.tar.gz
TRF masks (Tandem Repeats Finder): http://hgdownload.cse.ucsc.edu/goldenPath/mm8/bigZips/chromTrf.tar.gz

See ?BSgenomeForge and the BSgenomeForge vignette (vignette("BSgenomeForge")) in the BSgenome software package for instructions on making a BSgenome data package.

Author(s)

H. Pages

See Also

BSgenome-class, DNAString-class, available.genomes, BSgenomeForge

Examples

Mmusculus
seqlengths(Mmusculus)
Mmusculus$chr1  # same as Mmusculus[["chr1"]]

if ("AGAPS" %in% masknames(Mmusculus)) {

  ## Check that the assembly gaps contain only Ns:
  checkOnlyNsInGaps <- function(seq)
  {
    ## Replace all masks by the inverted AGAPS mask
    masks(seq) <- gaps(masks(seq)["AGAPS"])
    af <- alphabetFrequency(seq)
    found_letters <- names(af)[af != 0]
    if (any(found_letters != "N"))
        stop("assembly gaps contain more than just Ns")
  }

  ## A message will be printed each time a sequence is removed
  ## from the cache:
  options(verbose=TRUE)

  for (seqname in seqnames(Mmusculus)) {
    cat("Checking sequence", seqname, "... ")
    seq <- Mmusculus[[seqname]]
    checkOnlyNsInGaps(seq)
    cat("OK\n")
  }
}

## See the GenomeSearching vignette in the BSgenome software
## package for some examples of genome-wide motif searching using
## Biostrings and the BSgenome data packages:
if (interactive())
    vignette("GenomeSearching", package="BSgenome")
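
The logic of the assembly-gap check above can also be tried without loading the genome. The following is a minimal base-R sketch, not part of the package: the toy sequence, the gap coordinates, and the helper name gaps_contain_only_ns are all hypothetical and stand in for a chromosome, its AGAPS mask, and checkOnlyNsInGaps respectively.

```r
## Toy stand-in for a chromosome (hypothetical data, not from mm8).
## Positions 5-9 and 16-18 are runs of Ns playing the role of assembly gaps.
toy_seq <- "ACGTNNNNNACGTACNNNGT"

## Hypothetical gap coordinates (1-based, inclusive), standing in for AGAPS.
toy_gaps <- data.frame(start = c(5, 16), end = c(9, 18))

## Return TRUE if every gap region consists solely of the letter N.
gaps_contain_only_ns <- function(seq, gaps) {
  ## Extract the text of each gap region.
  regions <- substring(seq, gaps$start, gaps$end)
  ## Collect the distinct letters found across all gap regions.
  letters_found <- unique(unlist(strsplit(regions, "")))
  all(letters_found == "N")
}

## Correct coordinates: only Ns are found inside the gaps.
gaps_contain_only_ns(toy_seq, toy_gaps)

## Gap coordinates shifted by one base pick up a non-N letter.
shifted_gaps <- data.frame(start = c(5, 15), end = c(9, 17))
gaps_contain_only_ns(toy_seq, shifted_gaps)
```

Unlike checkOnlyNsInGaps, which calls stop() on failure, this sketch returns a logical value so the result can be inspected or combined with other checks.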

[Package BSgenome.Mmusculus.UCSC.mm8 version 1.3.11 Index]