prune              package:GeneticsPed              R Documentation

_P_r_u_n_e _p_e_d_i_g_r_e_e

_D_e_s_c_r_i_p_t_i_o_n:

     'prune' removes noninformative individuals from a pedigree. This
     process is usually called trimming or pruning. Individuals are
     removed if they do not provide any ancestral ties between other
     individuals. It is possible to add some additional criteria. See
     details.

_U_s_a_g_e:

     prune(x, id, father, mother, unknown=NA, testAdd=NULL, verbose=FALSE)

_A_r_g_u_m_e_n_t_s:

       x: data.frame, pedigree data

      id: character, individual's identification column name

  father: character, father's identification column name

  mother: character, mother's identification column name

 unknown: value(s) used for representing unknown parent in 'x'

 testAdd: logical vector, additional removal criterion; see details

 verbose: logical, print some more info

_D_e_t_a_i_l_s:

     NOTE: this function does not yet work with Pedigree class.

     There are always some individuals in the pedigree that jut out.
     Usually these are older individuals without known ancestors, i.e.
     founders. If such an individual has one or no (first) descendants
     and no phenotype/genotype data, it does not provide any
     additional information and can be safely removed from the
     pedigree. This process resembles cutting/pruning the branches of a
     tree.

     By default 'prune' iteratively removes individuals from the
     pedigree (from top to bottom) if:

        *  they are founders, i.e. both father and mother are unknown,
           and

        *  they have one or no (first) descendants, i.e. children.
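
     The two default conditions above can be illustrated with a
     minimal base R sketch. It is not the code used by 'prune'; the
     function name 'pruneSketch', the toy pedigree and its column
     names are assumptions made for illustration, and unknown parents
     are assumed to be coded as 'NA' already.

```r
## Hypothetical toy pedigree: 5 is the mother of 4, 4 is the mother of 3,
## and 1 is the father of 2 and 3
ped <- data.frame(id     = c(1,  2,  3,  4,  5),
                  father = c(NA, 1,  1,  NA, NA),
                  mother = c(NA, NA, 4,  5,  NA))

## Illustrative helper (not part of GeneticsPed)
pruneSketch <- function(ped) {
  repeat {
    ## Founder: both parents unknown
    founder <- is.na(ped$father) & is.na(ped$mother)
    ## Number of children: appearances as father or as mother
    nChild <- vapply(ped$id, function(i)
      sum(ped$father %in% i) + sum(ped$mother %in% i), integer(1))
    drop <- founder & nChild <= 1
    if (!any(drop)) break
    gone <- ped$id[drop]
    ## Remove the rows and blank out references to removed parents
    ped <- ped[!drop, , drop = FALSE]
    ped$father[ped$father %in% gone] <- NA
    ped$mother[ped$mother %in% gone] <- NA
  }
  ped
}

pruned <- pruneSketch(ped)
pruned$id  # individuals 1, 2 and 3 remain
```

     In this sketch individual 5 is dropped in the first pass, which
     turns individual 4 into a founder with a single child, so 4 is
     dropped in the second pass. Individual 1 stays because it has two
     children.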

     If availability of, say, phenotype/genotype data or any other
     information needs to be taken into account, argument 'testAdd'
     can be used. Its value must be a logical vector with length equal
     to the number of rows in the pedigree. The easiest way to achieve
     this is to 'merge' any data to the pedigree and then perform a
     test that returns logical values. Note that a value of 'TRUE' in
     'testAdd' marks an individual for removal, since this function
     removes individuals. To keep an individual without known parents
     and with one or no children, the value of 'testAdd' must be
     'FALSE' for that particular individual. Take a look at the
     examples.
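
     As a rough sketch of the merge-then-test approach described
     above, the following base R code builds a candidate 'testAdd'
     vector and checks its type and length. The data frame 'dat' and
     its column 'weight' are hypothetical.

```r
## Hypothetical pedigree and phenotype data
ped <- data.frame(id     = c(1,  2,  3),
                  father = c(NA, NA, 1),
                  mother = c(NA, NA, 2))
dat <- data.frame(id = c(1, 3), weight = c(52.3, 48.1))

## Keep every pedigree row, also those without phenotype data
ped <- merge(x = ped, y = dat, by = "id", all.x = TRUE)

## TRUE marks individuals without data, i.e. candidates for removal
testAdd <- is.na(ped$weight)

## The vector must be logical, with one value per pedigree row
stopifnot(is.logical(testAdd), length(testAdd) == nrow(ped))
```

     Computing the test on the merged data frame keeps 'testAdd'
     aligned with the rows, because 'merge' may reorder them.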

     There are various conventions for representing unknown/missing
     ancestors, say 0. R's default is to use 'NA'. If values other
     than 'NA' are present, argument 'unknown' can be used to convert
     unknown/missing values to 'NA'.
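
     The conversion described above can be sketched in base R as
     follows. This only illustrates the idea and is not necessarily
     how 'prune' handles 'unknown' internally; the toy pedigree is
     hypothetical.

```r
## Hypothetical pedigree in which 0 codes an unknown parent
ped <- data.frame(id     = c(1, 2, 3),
                  father = c(0, 0, 1),
                  mother = c(0, 0, 2))

## One or more values that stand for "unknown"
unknown <- 0

## Replace every unknown code in the parent columns with NA
for (col in c("father", "mother")) {
  ped[[col]][ped[[col]] %in% unknown] <- NA
}
```

     Using '%in%' rather than '==' lets 'unknown' hold several codes
     at once and leaves existing 'NA' values untouched.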

     It is assumed that the pedigree is in extended form, i.e. that
     each father and mother has its own record as an individual.
     Otherwise an error is returned with information on which parents
     do not appear as individuals.
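
     A pedigree can be checked for extended form before calling
     'prune' with a few lines of base R. The sketch below assumes
     unknown parents are coded as 'NA'; the toy data are hypothetical.

```r
## Hypothetical pedigree: mother 4 has no record of her own
ped <- data.frame(id     = c(1,  2,  3),
                  father = c(NA, NA, 1),
                  mother = c(NA, NA, 4))

## All known parents
parents <- unique(c(ped$father, ped$mother))
parents <- parents[!is.na(parents)]

## Parents that do not appear as individuals
missingParents <- setdiff(parents, ped$id)
missingParents  # 4
```

     An empty 'missingParents' indicates that every known parent also
     has a row of its own.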

     'prune' not only removes the rows of pruned individuals but also
     removes them from the 'father' and 'mother' columns.

     Pruning is done from top to bottom of the pedigree, i.e. from the
     oldest individuals towards younger ones. Take for example the
     following part of the pedigree in the Examples section:


          0   7
          |   |
          -----
            |
       10   8
        |   |
        -----
          |
          9

     Individual 7 is not removed since it has two (first) descendants,
     i.e. 8 and 5 (not shown here). Consequently, individuals 8 and 9
     are also not removed from the pedigree. Individual 10 is removed,
     since it has only one descendant. Why should individuals 8 and 9,
     and therefore also 7, stay in the pedigree? The current behaviour
     is reasonable if the pedigree is built in such a way that
     individuals with some phenotype or genotype data are gathered
     first and their pedigree is then built. Say, individual 9 has
     phenotype/genotype data and its pedigree is built, so there is no
     need to remove such an individual. However, if the pedigree is
     not built in such a way, then 'prune' cannot remove all
     noninformative individuals. Argument 'testAdd' cannot help with
     this issue, since the basic tests (founder and one or no first
     descendants) and 'testAdd' are combined with '&'.
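
     The effect of combining the tests with '&' can be seen in a small
     sketch. The three logical vectors below are made-up values for
     three hypothetical individuals, not output of 'prune'.

```r
## Made-up test results for three hypothetical individuals
founder <- c(TRUE,  TRUE,  FALSE)  # both parents unknown?
fewKids <- c(TRUE,  FALSE, TRUE)   # one or no children?
testAdd <- c(FALSE, TRUE,  TRUE)   # user-supplied criterion

## An individual is removed only if all three tests are TRUE
remove <- founder & fewKids & testAdd
remove  # FALSE FALSE FALSE
```

     The third individual is flagged by 'testAdd' but is not a
     founder, so it stays: 'testAdd' can only protect individuals from
     removal, it cannot force the removal of others.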

_V_a_l_u_e:

     'prune' returns a data.frame with possibly fewer individuals. See
     also the details.

_A_u_t_h_o_r(_s):

     Gregor Gorjanc

_S_e_e _A_l_s_o:

     'Pedigree'

_E_x_a_m_p_l_e_s:

       ## Pedigree example
       x <- data.frame(oseba=c(1,  9, 11, 2, 3, 10, 8, 12, 13,  4, 5, 6, 7, 14, 15, 16, 17),
                         oce=c(2, 10, 12, 5, 5,  0, 7,  0,  0,  0, 7, 0, 0,  0,  0,  0,  0),
                        mama=c(3,  8, 13, 0, 4,  0, 0,  0,  0, 14, 6, 0, 0, 15, 16, 17,  0),
                        spol=c(2,  2,  2, 1, 2,  1, 2,  1,  2,  2, 1, 2, 1,  1,  1,  1,  1),
                  generacija=c(1,  1,  1, 2, 2,  2, 2,  2,  2,  3, 3, 4, 4,  5,  6,  7,  8),
                        last=c(2, NA,  8, 4, 1,  6,NA, NA, NA, NA,NA,NA,NA, NA, NA, NA, NA))

       ## Default case
       prune(x=x, id="oseba", father="oce", mother="mama", unknown=0)

       ## Use of additional test i.e. do not remove individual if it has
       ## known value for "last"
       prune(x=x, id="oseba", father="oce", mother="mama", unknown=0,
                     testAdd=is.na(x$last))

       ## Use of other data
       y <- data.frame(oseba=c( 11,  15, 16),
                       last2=c(8.5, 7.5, NA))

       x <- merge(x=x, y=y, all.x=TRUE)
       prune(x=x, id="oseba", father="oce", mother="mama", unknown=0,
                     testAdd=is.na(x$last2))

