dist2              package:affyQCReport              R Documentation

_C_a_l_c_u_l_a_t_e _a_n _n-_b_y-_n _m_a_t_r_i_x _b_y _a_p_p_l_y_i_n_g _a _f_u_n_c_t_i_o_n _t_o 
_p_a_i_r_s _o_f _c_o_l_u_m_n_s _o_f _a_n _m-_b_y-_n _m_a_t_r_i_x.

_D_e_s_c_r_i_p_t_i_o_n:

     Calculate an n-by-n matrix by applying a function to pairs of
     columns of an m-by-n matrix.

_U_s_a_g_e:

       dist2(x, fun=function(a,b) mad(a-b))

_A_r_g_u_m_e_n_t_s:

       x: A matrix, or indeed any object for which 'ncol(x)' and
          'x[,j]' return useful results.

     fun: A symmetric function of two arguments that may be columns of
          'x'.

_D_e_t_a_i_l_s:

     With the default value of 'fun', this calculates a matrix of the
     'mad' of all pairwise differences of columns of 'x'. This may be
     considered a measure of distance.

     The implementation assumes that 'fun' is symmetric,
     'fun(a,b)=fun(b,a)'.
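
     The role of the symmetry assumption can be illustrated with a
     minimal sketch. The function 'dist2_sketch' below is a
     hypothetical re-implementation written for this illustration; it
     is not the package source, and the actual 'dist2' may be
     implemented differently. It evaluates 'fun' once per unordered
     pair of columns and mirrors the result across the diagonal.

```r
# Hypothetical sketch for illustration only; not the affyQCReport source.
dist2_sketch <- function(x, fun = function(a, b) mad(a - b)) {
  n <- ncol(x)
  # Start from an all-NA matrix so the diagonal stays NA.
  res <- matrix(NA_real_, nrow = n, ncol = n)
  if (n >= 2) {
    for (i in 1:(n - 1)) {
      for (j in (i + 1):n) {
        # fun is assumed symmetric, so one call fills both (i, j) and (j, i).
        res[i, j] <- res[j, i] <- fun(x[, i], x[, j])
      }
    }
  }
  res
}

set.seed(1)
x <- matrix(rnorm(300), ncol = 3)
d <- dist2_sketch(x)
```

     If 'fun' were not symmetric, a scheme like this would silently
     report only the value for the pair with 'i < j'.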

     A use for this function is the detection of outlier arrays in a
     microarray experiment. Assume that each column of 'x' can be
     decomposed as $z+beta+epsilon$, where $z$ is a fixed vector (the
     same for all columns), $epsilon$ is a vector of 'nrow(x)' i.i.d.
     random numbers, and $beta$ is an arbitrary vector most of whose
     entries are negligibly small (i.e. close to zero). In other
     words, $z$ represents the probe effects, $epsilon$ the measurement
     noise and $beta$ the differential expression effects. Under this
     assumption, all off-diagonal entries of the resulting distance
     matrix should be approximately the same, namely $\sqrt{2}$ times
     the standard deviation of $epsilon$. Arrays whose distance matrix
     entries differ markedly from this value give cause for suspicion.
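
     The following simulation sketches this reasoning under stated
     assumptions: Gaussian noise, $beta$ set to zero, and one array
     (column 6) given three times the noise standard deviation. The
     variable names, the per-array median summary and the cutoff of
     1.5 times the expected value are arbitrary choices made for this
     illustration, not part of the package. The pairwise 'mad' is
     computed directly in base R so that the example does not depend
     on 'dist2' being loaded.

```r
set.seed(42)
m <- 5000      # number of probes (rows)
n <- 6         # number of arrays (columns)
sigma <- 0.5   # standard deviation of the measurement noise epsilon

z <- rnorm(m, mean = 8)                              # probe effects, shared by all arrays
x <- z + matrix(rnorm(m * n, sd = sigma), nrow = m)  # z is recycled down each column
x[, 6] <- z + rnorm(m, sd = 3 * sigma)               # array 6: simulated noisy array

# Pairwise mad of column differences, NA on the diagonal.
d <- matrix(NA_real_, n, n)
for (i in 1:(n - 1)) {
  for (j in (i + 1):n) {
    d[i, j] <- d[j, i] <- mad(x[, i] - x[, j])
  }
}

expected <- sqrt(2) * sigma                  # value predicted for well-behaved pairs
med <- apply(d, 2, median, na.rm = TRUE)     # typical distance of each array to the others
suspect <- which(med > 1.5 * expected)       # arbitrary illustrative cutoff
```

     Summarising each column by its median keeps a single noisy array
     from inflating the summaries of the well-behaved ones.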

_V_a_l_u_e:

     A symmetric matrix of size 'n x n'. Diagonal elements are 'NA'.

_E_x_a_m_p_l_e_s:

       library(affyQCReport)
       z = matrix(rnorm(15693), ncol=3)   # 5231 rows, 3 columns
       dist2(z)                           # 3-by-3 matrix, 'NA' on the diagonal

