LL2homology             package:annotate             R Documentation

_F_u_n_c_t_i_o_n_s _t_h_a_t _f_i_n_d _t_h_e _h_o_m_o_l_o_g_y _d_a_t_a _f_o_r _a _g_i_v_e_n _s_e_t _o_f
_L_o_c_u_s_L_i_n_k _i_d_s _o_r _H_o_m_o_l_o_G_e_n_e_I_D_s

_D_e_s_c_r_i_p_t_i_o_n:

     Given a set of LocusLink ids or NCBI HomoloGeneIDs, the functions
     obtain the homology data and represent them as a list of sub-lists
     using the homology data package for the organism of interest. A
     sub-list can be of length 1 or greater depending on whether a
     LocusLink id can be mapped to one or more HomoloGeneIDs.

_U_s_a_g_e:

     LL2homology(homoPkg, llids)
     HGID2homology(hgid, homoPkg)
     ACC2homology(accs, homoPkg)

_A_r_g_u_m_e_n_t_s:

   llids: a vector of character strings or numbers giving the LocusLink
          ids whose homologous genes in other organisms are to be found

    hgid: a named vector of character strings or numbers giving the
          HomoloGeneIDs whose homologous genes in other organisms are
          to be found. The names of the vector give the codes used by
          NCBI for the organisms

    accs: a vector of character strings giving a set of GenBank
          accession numbers

 homoPkg: a character string giving the name of the homology data
          package for a given organism, which is a short version of
          the scientific name of the organism followed by "homology"
          (e.g. hsahomology)
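
A minimal sketch of how the 'hgid' argument might be built. The
HomoloGeneIDs below are made up, and the names are assumed to be NCBI
taxonomy ids (9606 for human, 10090 for mouse); the page only says
that the names give the code used by NCBI for organisms, so treat that
as an assumption.

```r
# Names: organism codes (assumed here to be NCBI taxonomy ids).
# Values: made-up HomoloGeneIDs, for illustration only.
hgid <- c("9606" = 12345, "10090" = 67890)

# The organism codes travel with the ids as the names of the vector
names(hgid)

# Only attempt the lookup when the required packages are installed
if (require("annotate") && require("hsahomology")) {
    HGID2homology(hgid, "hsahomology")
}
```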

_D_e_t_a_i_l_s:

     The homology data package has to be installed before calling any
     of these functions.

     Each sub-list has the following elements:

     homoOrg - a named vector holding a single character string. The
     value is the scientific name of the organism and the name is the
     numeric code used by NCBI for the organism.

     homoLL - an integer for the LocusLink id.

     homoHGID - an integer for the internal HomoloGeneID.

     homoACC - a character string for GenBank accession number of the
     best matching sequence of the organism.

     homoType - a single letter for the type of similarity measurement
     between the homologous genes. homoType can be B (reciprocal best
     match between three or more organisms), b (reciprocal best match
     between two organisms), or c (curated homology relationship
     between two organisms).

     homoPS - the percent identity of the base-pair alignment between
     the homologous sequences.

     homoURL - a url to the source if the homology relationship is a
     curated orthology.

     Sub-lists with homoType = B or b will not have any value for
     homoURL, and sub-lists with homoType = c will not have any value
     for homoPS.
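
The structure described above can be sketched with hand-built
sub-lists. Every value below is made up for illustration, and the
sketch assumes that an element without a value is simply absent from
the sub-list; the real packages may represent missing values
differently.

```r
# Hypothetical sub-list for a reciprocal best match (homoType "B"):
# it carries homoPS but no homoURL.
blast_hit <- list(
    homoOrg  = c("10090" = "Mus musculus"),  # name = assumed organism code
    homoLL   = 11111L,
    homoHGID = 22222L,
    homoACC  = "NM_000000",
    homoType = "B",
    homoPS   = 87.5
)

# Hypothetical sub-list for a curated relationship (homoType "c"):
# it carries homoURL but no homoPS.
curated_hit <- list(
    homoOrg  = c("10116" = "Rattus norvegicus"),
    homoLL   = 33333L,
    homoHGID = 22222L,
    homoType = "c",
    homoURL  = "http://example.org/curated"
)

homologs <- list(blast_hit, curated_hit)

# Type of each homology relationship
types <- vapply(homologs, function(x) x$homoType, character(1))

# Percent identity where present, NA for curated entries
ps <- vapply(homologs,
             function(x) if (is.null(x$homoPS)) NA_real_ else x$homoPS,
             numeric(1))
```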

_V_a_l_u_e:

     Each function returns a list of sub-lists containing data for
     homologous genes in other organisms.
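
For tabular inspection, a returned list could be flattened into a data
frame. The sketch below uses a made-up stand-in for the return value
and assumes a flat list of sub-lists with the element names given in
Details; the helper 'pick' and the column names are hypothetical.

```r
# Made-up stand-in for a returned list of sub-lists
result <- list(
    list(homoOrg = c("10090" = "Mus musculus"),
         homoLL = 11111L, homoType = "B", homoPS = 87.5),
    list(homoOrg = c("10116" = "Rattus norvegicus"),
         homoLL = 33333L, homoType = "c",
         homoURL = "http://example.org/curated")
)

# Hypothetical helper: return the element, or NA when it is absent
pick <- function(x, field) if (is.null(x[[field]])) NA else x[[field]]

# One row per sub-list; absent homoPS values become NA
homology_table <- data.frame(
    organism = vapply(result, function(x) unname(x$homoOrg), character(1)),
    orgCode  = vapply(result, function(x) names(x$homoOrg), character(1)),
    LL       = vapply(result, function(x) as.integer(pick(x, "homoLL")),
                      integer(1)),
    type     = vapply(result, function(x) as.character(pick(x, "homoType")),
                      character(1)),
    PS       = vapply(result, function(x) as.numeric(pick(x, "homoPS")),
                      numeric(1)),
    stringsAsFactors = FALSE
)
```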

_A_u_t_h_o_r(_s):

     Jianhua Zhang

_R_e_f_e_r_e_n_c_e_s:

     <URL: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?=homologene>

_E_x_a_m_p_l_e_s:

       if(require("hsahomology")){
           # Take a few LocusLink ids (the keys of the LL-to-HGID environment)
           llids <- ls(envir = hsahomologyLL2HGID)[2:5]
           LL2homology("hsahomology", llids)
       }

