readGEOAnn             package:annotate             R Documentation

_F_u_n_c_t_i_o_n _t_o _e_x_t_r_a_c_t _d_a_t_a _f_r_o_m _t_h_e _G_E_O _w_e_b _s_i_t_e

_D_e_s_c_r_i_p_t_i_o_n:

     Data files available at the GEO web site are identified by GEO
     accession numbers. Given the URL of the CGI script at GEO and a
     GEO accession number, these functions extract data from the web
     site and return a matrix containing the data.

_U_s_a_g_e:

     readGEOAnn(GEOAccNum, url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?")
     readIDNAcc(GEOAccNum, url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?")
     getGPLNames(url = "http://www.ncbi.nlm.nih.gov/geo/query/browse.cgi?")
     getSAGEFileInfo(url =
                            "http://www.ncbi.nlm.nih.gov/geo/query/browse.cgi?view=platforms&prtype=SAGE&dtype=SAGE")
     getSAGEGPL(organism = "Homo sapiens", enzyme = c("NlaIII", "Sau3A"))
     readUrl(url)

_A_r_g_u_m_e_n_t_s:

     url: the URL of the CGI script at GEO

GEOAccNum: a character string giving the GEO accession number of the
          desired file (e.g., GPL97)

organism: a character string giving the name of the organism of
          interest

  enzyme: a character string, either "NlaIII" or "Sau3A", naming the
          enzyme used to create the SAGE tags

_D_e_t_a_i_l_s:

     'url' is the address of the CGI script that processes the user's
     request. 'readGEOAnn' invokes the CGI script by passing it a GEO
     accession number and then processes the data file obtained.
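
     As a rough sketch of that request-and-parse cycle (not the
     package's actual implementation), the code below builds a query
     URL and parses a text reply. The query parameters ('acc', 'targ',
     'form', 'view'), the '!platform_table_begin' and
     '!platform_table_end' markers, the helper names 'buildGEOUrl' and
     'parseGEOText', and the sample reply are all assumptions made for
     illustration; no network access is performed.

```r
# Hypothetical helper: append an accession number to the CGI url.
# The parameter names are assumptions, not taken from readGEOAnn.
buildGEOUrl <- function(GEOAccNum,
                        url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?") {
    paste0(url, "acc=", GEOAccNum, "&targ=self&form=text&view=data")
}

# Hypothetical helper: turn the tab-delimited table in a text reply
# into a character matrix, using the first table row as column names.
# Assumes every row has a value in every column.
parseGEOText <- function(lines) {
    begin <- grep("^!platform_table_begin", lines)
    end <- grep("^!platform_table_end", lines)
    # Need exactly one pair of markers with a header and at least one row.
    if (length(begin) != 1L || length(end) != 1L || end - begin < 3L) {
        stop("no data table found in the reply")
    }
    rows <- strsplit(lines[(begin + 1L):(end - 1L)], "\t", fixed = TRUE)
    header <- rows[[1L]]
    body <- rows[-1L]
    mat <- matrix(unlist(body), ncol = length(header), byrow = TRUE)
    colnames(mat) <- header
    mat
}

# Made-up reply standing in for what the CGI might send back.
reply <- c("^PLATFORM = GPLxxx",
           "!platform_table_begin",
           "ID\tGB_ACC",
           "probe_1\tACC0001",
           "probe_2\tACC0002",
           "!platform_table_end")

buildGEOUrl("GPL96")
geoMatrix <- parseGEOText(reply)
geoMatrix
```

     Keeping the URL construction separate from the parsing lets the
     parser be exercised on saved text without contacting the server.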

     'readIDNAcc' calls 'readGEOAnn' to read the data and then extracts
     the columns for probe ids and accession numbers. 'GEOAccNum' must
     be the id of an Affymetrix chip.
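
     The column extraction step might look like the following minimal
     sketch. The matrix 'ann' is made-up data standing in for the
     result of 'readGEOAnn', and the column names 'ID' and 'GB_ACC' are
     assumptions about the downloaded file rather than something this
     page specifies.

```r
# Made-up annotation matrix standing in for the result of readGEOAnn().
# The column names are assumptions about the downloaded file.
ann <- matrix(c("probe_1", "ACC0001", "gene A",
                "probe_2", "ACC0002", "gene B"),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, c("ID", "GB_ACC", "Description")))

# Keep only the probe id and accession number columns.
# drop = FALSE keeps the result a matrix even if there is a single row.
idAndAcc <- ann[, c("ID", "GB_ACC"), drop = FALSE]
idAndAcc
```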

     'getGPLNames' parses the html file that lists GEO accession
     numbers together with descriptions of the arrays they represent.

_V_a_l_u_e:

     Both 'readGEOAnn' and 'readIDNAcc' return a matrix.

     'getGPLNames' returns a named vector of the names of commercial
     arrays. The names of the vector are the corresponding GEO
     accession numbers.
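
     A named vector of this shape can be searched in either direction,
     as the sketch below suggests. The vector 'geoAccNums' here is
     built by hand as a stand-in; the real value is read from the GEO
     web site and its array names may be worded differently.

```r
# Hand-built stand-in for the value of getGPLNames(); the real vector
# is read from the GEO web site and its labels may be worded differently.
geoAccNums <- c(GPL96 = "HG-U133A", GPL97 = "HG-U133B")

# Array name for a known accession number.
geoAccNums[["GPL96"]]

# Accession number(s) whose array name contains a given string.
hits <- names(geoAccNums)[grep("U133A", geoAccNums, fixed = TRUE)]
hits
```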

_A_u_t_h_o_r(_s):

     Jianhua Zhang

_R_e_f_e_r_e_n_c_e_s:

     <URL: www.ncbi.nlm.nih.gov/geo>

_E_x_a_m_p_l_e_s:

     ## Not run:
     # Get array names and GEO accession numbers (requires network access)
     geoAccNums <- getGPLNames()
     # Read the annotation data file for HG-U133A, which is GPL96
     # according to geoAccNums
     temp <- readGEOAnn(GEOAccNum = "GPL96")
     temp2 <- readIDNAcc(GEOAccNum = "GPL96")
     ## End(Not run)

