import/export            package:GSEABase            R Documentation

_R_e_a_d _a_n_d _w_r_i_t_e _g_e_n_e _s_e_t_s _f_r_o_m _B_r_o_a_d _o_r _G_M_T _f_o_r_m_a_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     'getBroadSets' parses one or more XML files for gene sets. Each
     file can reside locally or at a URL. The format followed is that
     defined by the Broad (see References). 'toBroadXML' creates Broad
     XML from 'BroadCollection' gene sets.

     'toGmt' converts 'GeneSetCollection' objects to a character vector
     representing the gene set collection in GMT format. 'getGmt'
     reads a file or other character vector into a 'GeneSetCollection'.

_U_s_a_g_e:

     getBroadSets(uri, ...)
     toBroadXML(geneSet, con, ...)
     asBroadUri(name,
                base="http://www.broad.mit.edu/gsea/msigdb/cards")
     getGmt(con, geneIdType=NullIdentifier(),
            collectionType=NullCollection(), sep="\t", ...)
     toGmt(x, con, ...)

_A_r_g_u_m_e_n_t_s:

     uri: A file name or URL containing gene sets encoded following the
          Broad specification. For Broad sets, the uri can point to
          MSigDB (the Broad Molecular Signatures Database).

 geneSet: A 'GeneSet' with 'collectionType' 'BroadCollection' (to
          ensure that required information is available).

       x: A 'GeneSetCollection' or other object for which a 'toGmt'
          method is defined.

     con: A file name or connection to receive output (optional in the
          case of 'toXxx'). For 'getGmt', the file name or connection
          to read from.

    name: A character vector of Broad gene set names, e.g.,
          'c('chr16q', 'GNF2_TNFSF10')'.

    base: Base uri for finding Broad gene sets.

geneIdType: A constructor for the type of identifier the members of the
          gene sets represent. See 'GeneIdentifierType' for more
          information.

collectionType: A constructor for the type of collection for the gene
          sets. See 'CollectionType' for more information.

     sep: The character string separating members of each gene set in
          the GMT file.

     ...: Further arguments passed to the underlying XML parser,
          particularly 'file' used to specify an output 'connection'
          for 'toBroadXML'.

_V_a_l_u_e:

     'getBroadSets' returns a 'GeneSetCollection' of gene sets.

     'toBroadXML' returns a character vector containing the XML for a
     single 'GeneSet' or, if 'con' is provided, writes the XML to a
     file.

     'asBroadUri' can be used to create URI names of Broad files (to be
     used by 'getBroadSets').

     'getGmt' returns a 'GeneSetCollection' of gene sets.

     'toGmt' returns a character vector in which each element
     represents a gene set. If 'con' is provided, the result is written
     to the specified connection.

_N_o_t_e:

     Actual Broad XML files differ from the DTD (e.g., an implied ','
     separator between genes in a set); we parse to and from the files
     as they actually exist.

_A_u_t_h_o_r(_s):

     Martin Morgan <mtmorgan@fhcrc.org>

_R_e_f_e_r_e_n_c_e_s:

     <URL: http://www.broad.mit.edu/gsea/>

_S_e_e _A_l_s_o:

     'GeneSetCollection' 'GeneSet'

_E_x_a_m_p_l_e_s:

     ## 'fl' could also be a URI
     fl <- system.file("extdata", "Broad.xml", package="GSEABase")
     gss <- getBroadSets(fl) # GeneSetCollection of 2 sets
     names(gss)
     gss[[1]]
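
     ## A short sketch of inspecting one parsed set with GSEABase
     ## accessors. It assumes 'gss' was created by 'getBroadSets' as
     ## above and that GSEABase is attached; the output depends on the
     ## contents of the bundled Broad.xml file.
     gs <- gss[[1]]
     setName(gs)           # name of the gene set
     head(geneIds(gs))     # first few member identifiers
     collectionType(gs)    # expected to be a 'BroadCollection'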

     ## Not run: 
     ## Download from the Broad
     getBroadSets(asBroadUri(c('chr16q', 'GNF2_ZAP70')))
     ## End(Not run)

     fl <- tempfile()
     toBroadXML(gss[[1]], con=fl)
     noquote(readLines(fl))
     unlink(fl)

     ## Not run: 
     toBroadXML(gss[[1]]) # character vector
     ## End(Not run)

     fl <- tempfile()
     toGmt(gss, fl)
     getGmt(fl)
     unlink(fl)
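
     ## A minimal base-R sketch (not part of GSEABase) of the GMT
     ## layout that 'toGmt' writes and 'getGmt' reads: one gene set per
     ## line, with fields separated by 'sep' (tab by default) in the
     ## order set name, description, member genes. The two sets below
     ## are made-up illustrations, not Broad data.
     gmtLines <- c("setA\tna\tGENE1\tGENE2\tGENE3",
                   "setB\tna\tGENE2\tGENE4")
     fields <- strsplit(gmtLines, "\t", fixed=TRUE)
     ## drop the name and description fields, keeping the members
     sets <- lapply(fields, function(f) f[-(1:2)])
     names(sets) <- vapply(fields, `[[`, character(1), 1)
     sets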

