GeneSet-class            package:GSEABase            R Documentation

_C_l_a_s_s "_G_e_n_e_S_e_t"

_D_e_s_c_r_i_p_t_i_o_n:

     A 'GeneSet' contains a set of gene identifiers. Each gene set has
     a 'geneIdType', indicating how the gene identifiers should be
     interpreted (e.g., as Entrez identifiers), and a 'collectionType',
     indicating the origin of the gene set (perhaps including
     additional information about the set, as in the 'BroadCollection'
     type).

     Conversion between identifiers, subsetting, and logical (set)
     operations can be performed. Relationships between genes and
     phenotype in a 'GeneSet' can be summarized using 'coloring' to
     create a 'GeneColorSet'. A 'GeneSet' can be exported to XML with
     'toBroadXML'.

_O_b_j_e_c_t_s _f_r_o_m _t_h_e _C_l_a_s_s:

     Construct a 'GeneSet' with a 'GeneSet' method (e.g., from a
     character vector of gene names, or an 'ExpressionSet'), or from
     gene sets stored as XML (locally or on the internet; see
     'getBroadSets').
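
     As a minimal sketch of construction from a character vector: the
     identifiers and the set name below are hypothetical placeholders,
     and the 'geneIds=' form follows the usage shown in the Examples
     section of this page.

```r
library(GSEABase)

## Hypothetical identifiers, used only for illustration.
ids <- c("GENE1", "GENE2", "GENE3")

## Build a gene set directly from a character vector of identifiers.
gs <- GeneSet(geneIds = ids, setName = "demoSet")

geneIds(gs)      # the identifiers supplied above
geneIdType(gs)   # no type was given, so this is a NullIdentifier
```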

_S_l_o_t_s:

     '_s_e_t_N_a_m_e': Object of class '"ScalarCharacter"' containing a short
          name (single word is best) to identify the set.

     '_s_e_t_I_d_e_n_t_i_f_i_e_r': Object of class '"ScalarCharacter"' containing a
          (unique) identifier for the set.

     '_g_e_n_e_I_d_T_y_p_e': Object of class '"GeneIdentifierType"' containing
          information about how the gene identifiers are encoded. See
          'GeneIdentifierType' and related classes.

     '_g_e_n_e_I_d_s': Object of class '"character"' containing the gene
          symbols.

     '_c_o_l_l_e_c_t_i_o_n_T_y_p_e': Object of class '"CollectionType"' containing
          information about how the geneIds were collected, including
          perhaps additional information unique to the collection
          methodology. See 'CollectionType' and related classes.

     '_s_h_o_r_t_D_e_s_c_r_i_p_t_i_o_n': Object of class '"ScalarCharacter"'
          representing short description (1 line) of the gene set.

     '_l_o_n_g_D_e_s_c_r_i_p_t_i_o_n': Object of class '"ScalarCharacter"' providing a
          longer description (e.g., like an abstract) of the gene set.

     '_o_r_g_a_n_i_s_m': Object of class '"ScalarCharacter"' represents the
          organism the gene set is derived from.

     '_p_u_b_M_e_d_I_d_s': Object of class '"character"' containing PubMed ids
          related to the gene set.

     '_u_r_l_s': Object of class '"character"' containing urls used to
          construct or manipulate the gene set.

     '_c_o_n_t_r_i_b_u_t_o_r': Object of class '"character"' identifying who
          created the gene set.

     '_v_e_r_s_i_o_n': Object of class '"Versions"' a version number, manually
          curated (i.e., by the 'contributor') to provide a consistent
          way of tracking a gene set.

     '_c_r_e_a_t_i_o_n_D_a_t_e': Object of class '"character"' containing the
          character string representation of the date on which the gene
          set was created.

_M_e_t_h_o_d_s:

     Gene set construction:

     _G_e_n_e_S_e_t See 'GeneSet' methods and 'getBroadSets' for convenient
          construction.

     Slot access (e.g., 'setName') and replacement (e.g., 'setName<-'):

     _c_o_l_l_e_c_t_i_o_n_T_y_p_e<- 'signature(object = "GeneSet", value =
          "CollectionType")'

     _c_o_l_l_e_c_t_i_o_n_T_y_p_e 'signature(object = "GeneSet")'

     _c_o_n_t_r_i_b_u_t_o_r<- 'signature(object = "GeneSet", value = "character")'

     _c_o_n_t_r_i_b_u_t_o_r 'signature(object = "GeneSet")'

     _c_r_e_a_t_i_o_n_D_a_t_e<- 'signature(object = "GeneSet", value =
          "character")'

     _c_r_e_a_t_i_o_n_D_a_t_e 'signature(object = "GeneSet")'

     _d_e_s_c_r_i_p_t_i_o_n<- 'signature(object = "GeneSet", value = "character")'

     _d_e_s_c_r_i_p_t_i_o_n 'signature(object = "GeneSet")'

     _g_e_n_e_I_d_s<- 'signature(object = "GeneSet", value = "character")'

     _g_e_n_e_I_d_s 'signature(object = "GeneSet")'

     _l_o_n_g_D_e_s_c_r_i_p_t_i_o_n<- 'signature(object = "GeneSet", value =
          "character")'

     _l_o_n_g_D_e_s_c_r_i_p_t_i_o_n 'signature(object = "GeneSet")'

     _o_r_g_a_n_i_s_m<- 'signature(object = "GeneSet", value = "character")'

     _o_r_g_a_n_i_s_m 'signature(object = "GeneSet")'

     _p_u_b_M_e_d_I_d_s<- 'signature(object = "GeneSet", value = "character")'

     _p_u_b_M_e_d_I_d_s 'signature(object = "GeneSet")'

     _s_e_t_d_i_f_f 'signature(x = "GeneSet", y = "GeneSet")'

     _s_e_t_I_d_e_n_t_i_f_i_e_r<- 'signature(object = "GeneSet", value =
          "character")'

     _s_e_t_I_d_e_n_t_i_f_i_e_r 'signature(object = "GeneSet")'

     _s_e_t_N_a_m_e<- 'signature(object = "GeneSet", value = "character")'

     _s_e_t_N_a_m_e 'signature(object = "GeneSet")'

     _g_e_n_e_I_d_T_y_p_e<- 'signature(object = "GeneSet", verbose=FALSE, value =
          "character")', 'signature(object = "GeneSet", verbose=FALSE,
          value = "GeneIdentifierType")': These method attempt to
          coerce geneIds from the current type to the type named by
          'value'. Successful coercion requires an appropriate method
          for 'mapIdentifiers'.

     _g_e_n_e_I_d_T_y_p_e 'signature(object = "GeneSet")'

     _s_e_t_V_e_r_s_i_o_n<- 'signature(object = "GeneSet", value = "Versions")'

     _s_e_t_V_e_r_s_i_o_n 'signature(object = "GeneSet")'

     _u_r_l_s<- 'signature(object = "GeneSet", value = "character")'

     _u_r_l_s 'signature(object = "GeneSet")'

     Logical and subsetting operations:

     _u_n_i_o_n 'signature(x = "GeneSet", y = "GeneSet")': ... 

     | 'signature(e1 = "GeneSet", e2 = "GeneSet")': calculate the
          logical `or' (union) of two gene sets. The sets must contain
          elements of the same 'geneIdType'.

     | 'signature(e1 = "GeneSet", e2 = "character")', 'signature(e1 =
          "character", e2 = "GeneSet")': calculate the logical `or'
          (union) of a gene set and a character vector, i.e., add the
          geneIds named in the character vector to the gene set.

     _i_n_t_e_r_s_e_c_t 'signature(x = "GeneSet", y = "GeneSet")':

     & 'signature(e1 = "GeneSet", e2 = "GeneSet")': calculate the
          logical `and' (intersection) of two gene sets.

     & 'signature(e1 = "GeneSet", e2 = "character")', 'signature(e1 =
          "character", e2 = "GeneSet")': calculate the logical `and'
          (intersection) of a gene set and a character vector, creating
          a new gene set containing only those genes named in the
          character vector.

     _s_e_t_d_i_f_f 'signature(x = "GeneSet", y = "GeneSet")', 'signature(x =
          "GeneSet", y = "character")', 'signature(x = "character", y =
          "GeneSet")': calculate the logical set difference betwen two
          gene sets, or betwen a gene set and a character vector.

     [ 'signature(x = "GeneSet", i="character")', 'signature(x =
          "GeneSet", i="numeric")': subset the gene set by index
          ('i="numeric"') or value ('i="character"'). Genes are
          re-ordered as required.

     [ 'signature(x = "ExpressionSet", i = "GeneSet")': subset the
          expression set, using genes in the gene set to select
          features. Genes in the gene set are coerced to appropriate
          annotation type if necessary (by consulting the 'annotation'
          slot of the expression set, and using 'geneIdType<-').

     [[ 'signature(x = "GeneSet")': select a single gene from the gene
          set.

     $ 'signature(x = "GeneSet")': select a single gene from the gene
          set, allowing partial matching.
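
     A short sketch of the character-vector forms of the set
     operations, and of subsetting with '[', may help here. All
     identifiers are hypothetical, and the comments describe the
     membership expected from the descriptions above rather than
     verified output.

```r
library(GSEABase)

gs <- GeneSet(geneIds = c("GENE1", "GENE2", "GENE3", "GENE4"),
              setName = "demoSet")

## Set operations between a gene set and a character vector.
gsOr   <- gs | c("GENE4", "GENE5")           # union: adds GENE5
gsAnd  <- gs & c("GENE2", "GENE3", "GENE9")  # keeps GENE2 and GENE3
gsDiff <- setdiff(gs, c("GENE1", "GENE2"))   # leaves GENE3 and GENE4

## Subsetting by position and by identifier.
gsByIndex <- gs[1:2]
gsByName  <- gs[c("GENE3", "GENE1")]

geneIds(gsOr)
geneIds(gsByName)
```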

     Useful additional methods include:

     _G_e_n_e_C_o_l_o_r_S_e_t 'signature(type = "GeneSet")': create a 'color' gene
          set from a 'GeneSet', containing information about phenotype.
          This method has a required argument 'phenotype', a character
          string describing the phenotype for which color is available.
          See 'GeneColorSet'.

     _m_a_p_I_d_e_n_t_i_f_i_e_r_s Use the code in the examples to list available
          methods. These convert genes from one 'GeneIdentifierType' to
          another. See 'mapIdentifiers' and specific methods in
          'GeneIdentifierType' for additional detail.

     _i_n_c_i_d_e_n_c_e Summarize shared membership in genes across gene sets.
          See 'incidence-methods'.

     _s_h_o_w 'signature(object = "GeneSet")': display a short summary of
          the gene set.

     _d_e_t_a_i_l_s 'signature(object = "GeneSet")': display additional
          information about the gene set. See 'details'.

     _i_n_i_t_i_a_l_i_z_e 'signature(.Object = "GeneSet")': Used internally
          during gene set construction.
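
     As a rough illustration of 'incidence', 'show' and 'details',
     the sketch below builds two small overlapping gene sets. The
     identifiers and set names are hypothetical; see
     'incidence-methods' for the authoritative description of the
     returned matrix.

```r
library(GSEABase)

gsA <- GeneSet(geneIds = c("GENE1", "GENE2", "GENE3"), setName = "setA")
gsB <- GeneSet(geneIds = c("GENE2", "GENE3", "GENE4"), setName = "setB")

## Incidence matrix relating the gene sets to the genes they contain.
m <- incidence(gsA, gsB)
m

## Printing the object gives the short summary from show();
## details() prints additional information about the set.
gsA
details(gsA)
```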

_A_u_t_h_o_r(_s):

     Martin Morgan <mtmorgan@fhcrc.org>

_S_e_e _A_l_s_o:

     'GeneColorSet' 'CollectionType' 'GeneIdentifierType'

_E_x_a_m_p_l_e_s:

     ## Empty gene set
     GeneSet()
     ## Gene set from ExpressionSet
     data(sample.ExpressionSet)
     gs1 <- GeneSet(sample.ExpressionSet[100:109])
     ## GeneSet from Broad XML; 'fl' could be a url
     fl <- system.file("extdata", "Broad.xml", package="GSEABase")
     gs2 <- getBroadSets(fl)[[1]] # actually, a list of two gene sets
     ## GeneSet from list of geneIds
     geneIds <- geneIds(gs2) # any character vector would do
     gs3 <- GeneSet(geneIds=geneIds)
     ## unspecified set type, so...
     is(geneIdType(gs3), "NullIdentifier") == TRUE
     ## update set type to match encoding of identifiers
     geneIdType(gs2)
     geneIdType(gs3) <- SymbolIdentifier()

     ## Convert between set types; this consults the 'annotation'
     ## information encoded in the 'AnnotationIdentifier' set type and the
     ## corresponding annotation package.
     ## Not run: 
     gs4 <- gs1
     geneIdType(gs4) <- EntrezIdentifier()
     ## End(Not run)

     ## logical (set) operations
     gs5 <- GeneSet(sample.ExpressionSet[100:109], setName="subset1")
     gs6 <- GeneSet(sample.ExpressionSet[105:114], setName="subset2")
     ## intersection: 5 'genes'; note the set name '(subset1 & subset2)'
     gs5 & gs6
     ## union: 15 'genes'; note the set name
     gs5 | gs6
     ## an identity
     gs7 <- gs5 | gs6
     gs8 <- setdiff(gs5, gs6) | (gs5 & gs6) | setdiff(gs6, gs5)
     identical(geneIds(gs7), geneIds(gs8))
     identical(gs7, gs8) == FALSE # gs7 and gs8 setNames differ

     ## output
     tmp <- tempfile()
     toBroadXML(gs2, tmp)
     noquote(readLines(tmp))
     ## must be BroadCollection() collectionType 
     try(toBroadXML(gs1))
     gs9 <- gs1
     collectionType(gs9) <- BroadCollection()
     toBroadXML(gs9, tmp)
     unlink(tmp)
     toBroadXML(gs9) # no connection --> character vector
     ## list of gene sets --> vector of Broad GENESET XML
     gs10 <- getBroadSets(fl) # two sets
     entries <- sapply(gs10, function(x) toBroadXML(x)[[2]])

     ## list mapIdentifiers available for GeneSet
     showMethods("mapIdentifiers", classes="GeneSet", inherit=FALSE)

