MaskCollection-class         package:IRanges         R Documentation

_M_a_s_k_C_o_l_l_e_c_t_i_o_n _o_b_j_e_c_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     The MaskCollection class is a container for storing a collection
     of masks that can be used to mask regions in a sequence.

_D_e_t_a_i_l_s:

     In the context of the Biostrings package, a mask is a set of
     regions in a sequence that need to be excluded from some
     computation. For example, when calling 'alphabetFrequency' or
     'matchPattern' on a chromosome sequence, you might want to exclude
     some regions like the centromere or the repeat regions. This can
     be achieved by putting one or several masks on the sequence before
     calling 'alphabetFrequency' or 'matchPattern' on it.

     A MaskCollection object is a vector-like object that represents
     such a set of masks. Like standard R vectors, it has a "length"
     which is the number of masks contained in it. But unlike standard
     R vectors, it also has a "width" which determines the length of
     the sequences it can be "put on". For example, a MaskCollection
     object of width 20000 can only be put on an XString object of
     20000 letters.

     Each mask in a MaskCollection object 'x' is just a finite set of
     integers that are >= 1 and <= 'width(x)'. When "put on" a
     sequence, these integers indicate the positions of the letters to
     mask. Internally, each mask is represented by a NormalIRanges
     object.
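
     As a minimal sketch of the "length" versus "width" distinction
     described above (the object names 'm' and 'nir' and the masked
     positions are illustrative only, and the IRanges package is
     assumed to be installed):

```r
library(IRanges)

## One mask of width 20 that masks positions 3-5 and 10-12.
m <- Mask(mask.width=20, start=c(3, 10), width=c(3, 3))

length(m)      # number of masks in the collection
width(m)       # length of the sequences 'm' can be put on
nir <- m[[1]]  # the mask itself, as a NormalIRanges object
start(nir)     # first masked position of each range
end(nir)       # last masked position of each range
```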

_B_a_s_i_c _a_c_c_e_s_s_o_r _m_e_t_h_o_d_s:

     In the code snippets below, 'x' is a MaskCollection object.


      'length(x)': The number of masks in 'x'.

      'width(x)': The common width of all the masks in 'x'. This
          determines the length of the sequences that 'x' can be "put
          on".

      'active(x)': A logical vector of the same length as 'x' where
          each element indicates whether the corresponding mask is
          active or not.

      'names(x)': 'NULL' or a character vector of the same length as
          'x'.

      'desc(x)': 'NULL' or a character vector of the same length as
          'x'.

      'nir_list(x)': A list of the same length as 'x', where each
          element is a NormalIRanges object representing a mask in 'x'.


_C_o_n_s_t_r_u_c_t_o_r:


      'Mask(mask.width, start=NULL, end=NULL, width=NULL)': Return a
          single mask (i.e. a MaskCollection object of length 1) of
          width 'mask.width' (a single integer >= 1) and masking the
          ranges of positions specified by 'start', 'end' and 'width'.
          See the 'IRanges' constructor ('?IRanges') for how 'start',
          'end' and 'width' can be specified. Note that the returned
          mask is active and unnamed.


_O_t_h_e_r _m_e_t_h_o_d_s:

     In the code snippets below, 'x' is a MaskCollection object.


      'isEmpty(x)': Return a logical vector of the same length as 'x',
          indicating, for each mask in 'x', whether it's empty or not.

      'max(x)': The greatest (or last, or rightmost) masked position
          for each mask. This is a numeric vector of the same length as
          'x'.

      'min(x)': The smallest (or first, or leftmost) masked position
          for each mask. This is a numeric vector of the same length as
          'x'.

      'maskedwidth(x)': The number of masked positions for each mask.
          This is an integer vector of the same length as 'x' where all
          values are >= 0 and <= 'width(x)'.

      'maskedratio(x)': 'maskedwidth(x) / width(x)'
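
     The per-mask summaries listed above can be sketched as follows.
     The three masks reuse the ranges from the Examples section; the
     names 'x', 'mw' and 'mr' are illustrative only, and the IRanges
     package is assumed to be installed:

```r
library(IRanges)

## Three masks of common width 29, combined into one collection.
mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
x <- append(append(mask1, mask2), mask3)

isEmpty(x)            # one logical per mask
min(x)                # leftmost masked position of each mask
max(x)                # rightmost masked position of each mask
mw <- maskedwidth(x)  # number of masked positions in each mask
mr <- maskedratio(x)  # fraction of width(x) that each mask covers
```

     Every summary is a vector with one element per mask, so the
     results line up with 'names(x)' and 'active(x)'.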


_S_u_b_s_e_t_t_i_n_g _a_n_d _a_p_p_e_n_d_i_n_g:

     In the code snippets below, 'x' and 'values' are MaskCollection
     objects.


      'x[i]': Return a new MaskCollection object made of the selected
          masks. Subscript 'i' can be a numeric, logical or character
          vector.

      'x[[i, exact=TRUE]]': Extract the mask selected by 'i' as a
          NormalIRanges object. Subscript 'i' can be a single integer
          or a character string.

      'append(x, values, after=length(x))': Add masks in 'values' to
          'x'.


_O_t_h_e_r _m_e_t_h_o_d_s:

     In the code snippets below, 'x' is a MaskCollection object.


      'reduce(x)': Return a MaskCollection object of length 1 made of
          the union (or merging, or collapsing) of all the active masks
          in 'x'.

      'gaps(x)': Invert the masks in 'x'.

      'subseq(x, start=NA, end=NA, width=NA)': If 'y' is a sequence
          that 'x' has been put on top of, then 'subseq' will return
          the set of submasks that go on top of the subsequence
          obtained by calling 'subseq' on 'y' ('subseq' must be called
          on 'x' with the same arguments that were used in the call on
          'y').
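
     A short sketch of how these three methods behave on a small
     collection. The masks reuse the ranges from the Examples section;
     the names 'x', 'r', 'g', 's' and 'r2' are illustrative only, and
     the IRanges package is assumed to be installed:

```r
library(IRanges)

## Three masks of common width 29, combined into one collection.
mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
x <- append(append(mask1, mask2), mask3)

r <- reduce(x)           # union of all the active masks, as a single mask
g <- gaps(x)             # each mask inverted within positions 1..width(x)
s <- subseq(x, start=8)  # submasks matching subseq(y, start=8) on a sequence y

length(r)                        # a collection holding one mask
maskedwidth(g) + maskedwidth(x)  # masked plus unmasked positions, per mask
width(s)                         # width of the shortened collection

## Only active masks contribute to the union.
active(x)[2] <- FALSE
r2 <- reduce(x)
maskedwidth(r2)
```

     Since 'gaps' masks exactly the positions that were unmasked, the
     masked widths of 'g' and 'x' add up to 'width(x)' for every mask.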


_A_u_t_h_o_r(_s):

     H. Pages

_S_e_e _A_l_s_o:

     NormalIRanges-class, read.Mask, MaskedXString-class,
     'alphabetFrequency', 'reverse', 'matchPattern'

_E_x_a_m_p_l_e_s:

       ## Making a MaskCollection object:
       mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
       mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
       mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
       mymasks <- append(append(mask1, mask2), mask3)
       mymasks
       length(mymasks)
       width(mymasks)
       reduce(mymasks)
       gaps(mymasks)

       ## Names and descriptions:
       names(mymasks) <- c("A", "B", "C")  # names should be short and unique...
       mymasks
       mymasks[c("C", "A")]  # ...to make subsetting by names easier
       desc(mymasks) <- c("you can be", "more verbose", "here")
       mymasks[-2]

       ## Activate/deactivate masks:
       active(mymasks)["B"] <- FALSE
       mymasks
       reduce(mymasks)
       active(mymasks) <- FALSE  # deactivate all masks
       mymasks
       active(mymasks)[-1] <- TRUE  # reactivate all masks except mask 1
       active(mymasks) <- !active(mymasks)  # toggle all masks

       ## Other advanced operations:
       mymasks[[2]]
       length(mymasks[[2]])
       mymasks[[2]][-3]
       append(mymasks[-2], gaps(mymasks[2]))
       mymasks2 <- subseq(mymasks, start=8)
       mymasks2
       mymasks2[[2]]

