simLL                package:GOstats                R Documentation

_F_u_n_c_t_i_o_n_s _t_o _c_o_m_p_u_t_e _s_i_m_i_l_a_r_i_t_i_e_s _b_e_t_w_e_e_n _G_O _g_r_a_p_h_s _a_n_d _a_l_s_o
_b_e_t_w_e_e_n _E_n_t_r_e_z _G_e_n_e _I_D_s _b_a_s_e_d _o_n _t_h_e_i_r _i_n_d_u_c_e_d _G_O _g_r_a_p_h_s.

_D_e_s_c_r_i_p_t_i_o_n:

     Both 'simUI' and 'simLP' compute a similarity measure between two
     GO graphs. For 'simLL', first the induced GO graph for each of its
     arguments is found and then these are passed to one of 'simUI' or
     'simLP'.

_U_s_a_g_e:

     simLL(ll1, ll2, Ontology = "MF", measure = "LP", dropCodes = NULL,
           mapfun = NULL, chip = NULL)
     simUI(g1, g2)
     simLP(g1, g2)

_A_r_g_u_m_e_n_t_s:

     ll1: An Entrez Gene ID as a character vector.

     ll2: An Entrez Gene ID as a character vector.

Ontology: Which ontology to use ("MF", "BP", "CC"). 

 measure: Which measure to use ("LP", "UI"). 

dropCodes: A set of evidence codes to be ignored in constructing the
          induced GO graphs. 

  mapfun: A function taking a character vector of Entrez Gene IDs as
          its only argument and returning a list of "GO lists" matching
          the structure of the lists in the GO maps of annotation data
          packages. The function should behave similarly to 'mget(x,
          eg2gomap, ifnotfound=NA)', that is, 'NA' should be returned
          if a specified Entrez ID has no GO mapping.  See details for
          the interaction of 'mapfun' and 'chip'.

    chip: The name of a DB-based annotation data package (the name will
          end in ".db").  This package will be used to generate the
          Entrez ID to GO ID mapping when 'mapfun' is not supplied.

      g1: An instance of the 'graph' class.

      g2: An instance of the 'graph' class.

_D_e_t_a_i_l_s:

     For each of 'll1' and 'll2' the set of most specific GO terms
     within the specified ontology ('Ontology') that are not based on
     any excluded evidence code ('dropCodes') is found.  The mapping
     is achieved in one of three ways:


        1.  If 'mapfun' is provided, it will be used to perform the
           needed lookups.  In this case, 'chip' will be ignored.

        2.  If 'chip' is provided and 'mapfun=NULL', then the needed
           lookups will be done based on the Entrez to GO mappings
           encapsulated in the specified annotation data package.  This
           is the recommended usage.

        3.  If 'mapfun' and 'chip' are 'NULL' or missing, then the
           function will attempt to load the GO package (the
           environment-based package, distinct from GO.db).  This
           package contains a legacy environment mapping Entrez IDs to
           GO IDs.  If the GO package is not available, an error will
           be raised. Omitting both 'mapfun' and 'chip' is not
           recommended as it is not compatible with the DB-based
           annotation data packages.
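
     As an illustration of the contract described for 'mapfun', the
     following is a minimal sketch that wraps 'mget' around an
     environment. The environment 'eg2gomap', the function name
     'myMapfun', the Entrez ID and the GO IDs are all hypothetical
     placeholders, not real annotations; the element layout (GOID,
     Evidence, Ontology) is assumed to follow the GO maps of the
     annotation data packages.

```r
# Hypothetical Entrez-to-GO map held in an environment. The IDs, GO terms
# and evidence codes are placeholders, not real annotations.
eg2gomap <- new.env()
assign("1001", list(
  "GO:9999991" = list(GOID = "GO:9999991", Evidence = "IEA", Ontology = "MF"),
  "GO:9999992" = list(GOID = "GO:9999992", Evidence = "TAS", Ontology = "BP")
), envir = eg2gomap)

# A mapfun takes a character vector of Entrez Gene IDs as its only
# argument and returns NA for any ID that has no GO mapping.
myMapfun <- function(x) mget(x, envir = eg2gomap, ifnotfound = NA)

res <- myMapfun(c("1001", "9999"))
```

     A function of this shape could then be supplied as the 'mapfun'
     argument of 'simLL'.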

     Next, the induced GO graphs are computed.

     Finally, these graphs are passed to one of 'simUI' (union
     intersection) or 'simLP' (longest path). For 'simUI' the
     similarity is the size of the intersection of the node sets
     divided by the size of the union of the node sets. Large values
     indicate more similarity. These similarities are between 0 and 1.
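
     The union-intersection ratio can be sketched in base R on plain
     node-label vectors. The helper 'uiSim' and the node labels are
     hypothetical; 'simUI' itself takes 'graph' objects, so this only
     illustrates the arithmetic.

```r
# Node sets of two hypothetical induced GO graphs (labels are placeholders).
nodes1 <- c("all", "A", "B", "C")
nodes2 <- c("all", "A", "B", "D", "E")

# Size of the intersection of the node sets divided by the size of the union.
uiSim <- function(n1, n2) length(intersect(n1, n2)) / length(union(n1, n2))

sim <- uiSim(nodes1, nodes2)  # 3 shared nodes out of 6 distinct nodes: 0.5
```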

     For 'simLP' the similarity is the length of the longest path in
     the intersection graph of the two supplied graphs. Again, large
     values indicate more similarity. Similarities are between 0 and
     the maximum leaf depth of the graph for the specified ontology.
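
     The longest-path idea can be sketched in base R on edge lists.
     This is not the code used by 'simLP'; the edge encoding, the
     helper 'longestFrom', and the choice to count path length in
     edges are assumptions made for illustration, and the graphs are
     assumed to be acyclic.

```r
# Edges written as "child|parent" for two hypothetical induced GO graphs.
e1 <- c("A|all", "B|A", "C|B", "D|all")
e2 <- c("A|all", "B|A", "E|B", "D|all")

# The intersection graph keeps only the edges present in both graphs.
common <- intersect(e1, e2)
parts <- strsplit(common, "|", fixed = TRUE)
from <- vapply(parts, `[`, character(1), 1)
to <- vapply(parts, `[`, character(1), 2)

# Longest path, counted in edges, that starts at 'node' (assumes a DAG).
longestFrom <- function(node) {
  nxt <- to[from == node]
  if (length(nxt) == 0) return(0)
  1 + max(vapply(nxt, longestFrom, numeric(1)))
}

lp <- max(vapply(unique(from), longestFrom, numeric(1)))  # B -> A -> all: 2
```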

_V_a_l_u_e:

     A list with: 

    sim : The numeric similarity measure.

 measure: Which measure was used.

      g1: The graph induced by 'll1'.

      g2: The graph induced by 'll2'.


     If one of the supplied Gene IDs does not have any GO terms
     associated with it in the selected ontology and with the selected
     evidence codes, then 'NA' is returned instead of the list.
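
     Because the result is either a list or 'NA', calling code may
     want to check which one it received before extracting 'sim'. A
     minimal sketch, where 'getSim' is a hypothetical helper and 'ok'
     is a mock value standing in for a real result:

```r
# Hypothetical helper: return the similarity, or NA when no list came back.
getSim <- function(res) if (is.list(res)) res$sim else NA_real_

ok <- list(sim = 3, measure = "LP")  # mock of a successful result
simOk <- getSim(ok)                  # 3
simNone <- getSim(NA)                # NA
```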

_A_u_t_h_o_r(_s):

     R. Gentleman

_S_e_e _A_l_s_o:

     'makeGOGraph'

_E_x_a_m_p_l_e_s:

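       library("GOstats")
       library("hgu95av2.db")
       eg1 = c("9184", "3547")   # two Entrez Gene IDs

       ## similarity in the BP ontology, using the default "LP" measure
       bb = simLL(eg1[1], eg1[2], "BP", chip="hgu95av2.db")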

