Hsapiens {BSgenome.Hsapiens.UCSC.hg17}    R Documentation

Homo sapiens full genome (UCSC version hg17)

Description

Homo sapiens full genome as provided by UCSC (hg17, May 2004) and stored in Biostrings objects.

Note

This BSgenome data package was made from the following source data files:

sequences: chromFa.zip, upstream1000.zip, upstream2000.zip, upstream5000.zip
  from http://hgdownload.cse.ucsc.edu/goldenPath/hg17/bigZips/
masks: all the chr*_gap.txt.gz files
  from ftp://hgdownload.cse.ucsc.edu/goldenPath/hg17/database/,
  plus chromOut.zip and chromTrf.zip
  from http://hgdownload.cse.ucsc.edu/goldenPath/hg17/bigZips/

See ?BSgenomeForge and the BSgenomeForge vignette (vignette("BSgenomeForge")) in the BSgenome software package for how to make a BSgenome data package.

Author(s)

H. Pages

See Also

BSgenome-class, DNAString-class, available.genomes, BSgenomeForge

Examples

Hsapiens
seqlengths(Hsapiens)
Hsapiens$chr1  # same as Hsapiens[["chr1"]]

if ("AGAPS" %in% masknames(Hsapiens)) {

  ## Check that the assembly gaps contain only Ns:
  checkOnlyNsInGaps <- function(seq)
  {
    ## Replace all masks by the inverted AGAPS mask
    masks(seq) <- gaps(masks(seq)["AGAPS"])
    af <- alphabetFrequency(seq)
    found_letters <- names(af)[af != 0]
    if (any(found_letters != "N"))
        stop("assembly gaps contain more than just Ns")
  }

  ## A message will be printed each time a sequence is removed
  ## from the cache. Save the previous option so it can be restored:
  old_opts <- options(verbose=TRUE)

  for (seqname in seqnames(Hsapiens)) {
    cat("Checking sequence", seqname, "... ")
    seq <- Hsapiens[[seqname]]
    checkOnlyNsInGaps(seq)
    cat("OK\n")
  }

  ## Restore the previous verbosity setting:
  options(old_opts)
}
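
## A minimal sketch, not part of the original examples: extract a
## region of chr1 and estimate its GC content. The coordinates and
## the variable names (chr1, region, af, gc_content) are arbitrary
## choices made for illustration only. If the chosen region happens
## to contain only Ns, the result will be NaN.
chr1 <- unmasked(Hsapiens$chr1)  # drop the masks, keep the raw DNAString
region <- subseq(chr1, start=1000001, end=1001000)
af <- alphabetFrequency(region, baseOnly=TRUE)  # counts for A, C, G, T, other
gc_content <- sum(af[c("C", "G")]) / sum(af[c("A", "C", "G", "T")])
gc_content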

## See the GenomeSearching vignette in the BSgenome software
## package for some examples of genome-wide motif searching using
## Biostrings and the BSgenome data packages:
if (interactive())
    vignette("GenomeSearching", package="BSgenome")

[Package BSgenome.Hsapiens.UCSC.hg17 version 1.3.11 Index]