llsImpute         package:pcaMethods         R Documentation(latin1)

_L_L_S_i_m_p_u_t_e _a_l_g_o_r_i_t_h_m

_D_e_s_c_r_i_p_t_i_o_n:

     Missing value estimation using local least squares (LLS). First, k
     variables (for microarray data usually the genes) are selected by
     Pearson, Spearman or Kendall correlation coefficients. Then
     missing values are imputed by a linear combination of the k
     selected variables. The optimal combination is found by LLS
     regression. The method was first described by Kim et al.,
     Bioinformatics, 21(2), 2005.

     Missing values are denoted as 'NA'.

     It is not recommended to use this function directly but rather to
     use the nni() wrapper function.

_U_s_a_g_e:

       llsImpute(Matrix, k = 10, center = FALSE, completeObs = TRUE, correlation = "pearson",
       allVariables = FALSE, maxSteps = 100, xval = NULL, verbose = interactive(), ...)

_A_r_g_u_m_e_n_t_s:

  Matrix: 'matrix' - Data containing the variables (genes) in columns
          and observations (samples) in rows. The data may contain
          missing values, denoted as 'NA'.

       k: 'numeric' - Cluster size, i.e. the number of similar genes
          used for regression.

  center: 'boolean' - Mean center the data if TRUE.

completeObs: 'boolean' - Return the estimated complete observations if 
          TRUE. This is the input data with NA values replaced by the
          estimated values.

correlation: 'character' - How to calculate the distance between genes.
          One of "pearson", "kendall" or "spearman"; see also
          help("cor").

allVariables: 'boolean' - Use all genes as candidates for the
          regression if TRUE, only complete genes if FALSE.

maxSteps: 'numeric' - Maximum number of iteration steps if
          allVariables = TRUE.

    xval: 'numeric' - Use LLSimpute for cross validation. xval is the
          index of the gene to estimate; all other incomplete genes
          are ignored if this parameter is set, since they are not
          considered in the cross-validation anyway.

 verbose: 'boolean' - Print step number and relative change if TRUE and
          allVariables = TRUE.

     ...: Reserved for parameters used in future versions of the
          algorithm.

_D_e_t_a_i_l_s:

     The method provides two ways for missing value estimation,
     selected by the 'allVariables' option. The first one is to use
     only complete variables for the regression. This is preferable
     when the number of incomplete variables is relatively small.
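
     The complete-variables mode can be sketched in base R as
     follows. This is an illustration only, not the package code:
     the helper name 'lls_impute_column' is hypothetical, ranking
     neighbours by absolute correlation is an assumption, and
     qr.solve() is used for the least squares fit, which requires the
     neighbour matrix to have full column rank.

```r
## Hypothetical helper, not the package code: impute column j of X from
## its k most correlated complete columns.
lls_impute_column <- function(X, j, k = 2, method = "pearson") {
  miss <- is.na(X[, j])                                  # rows to estimate
  complete <- setdiff(which(colSums(is.na(X)) == 0), j)  # candidate predictors
  ## Correlate column j with each candidate, using observed rows only
  r <- abs(cor(X[!miss, j], X[!miss, complete, drop = FALSE], method = method))
  nb <- complete[order(r, decreasing = TRUE)[seq_len(min(k, length(complete)))]]
  ## Least squares fit on the observed rows, then predict the missing rows
  beta <- qr.solve(X[!miss, nb, drop = FALSE], X[!miss, j])
  X[miss, j] <- X[miss, nb, drop = FALSE] %*% beta
  X
}

## Toy data: column "y" is an exact linear combination of "a" and "b"
set.seed(1)
a <- rnorm(20); b <- rnorm(20)
X <- cbind(a = a, b = b, noise = rnorm(20), y = 2 * a - b)
truth <- X[, "y"]
X[c(3, 7), "y"] <- NA
filled <- lls_impute_column(X, j = 4, k = 3)
```

     Because "y" lies exactly in the span of the selected columns in
     this toy example, the two removed values are recovered up to
     numerical precision. Real data will not behave this cleanly.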

     The second way is to consider all variables as candidates for the
     regression. Here, missing values are initially replaced by the
     column-wise mean. The method then iterates, using the current
     estimate as input for the regression until the change between new
     and old estimate falls below a threshold (0.001).
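
     A minimal base R sketch of this iterative scheme is given below.
     It is not the package code: the helper name
     'lls_impute_iterative', the ranking by absolute correlation and
     the exact form of the relative change are assumptions made for
     illustration.

```r
## Hypothetical helper, not the package code: iterative LLS imputation
## using all variables as regression candidates.
lls_impute_iterative <- function(X, k = 2, max_steps = 100, threshold = 0.001) {
  na_mask <- is.na(X)
  est <- X
  ## Initial guess: column-wise means of the observed values
  col_means <- colMeans(X, na.rm = TRUE)
  est[na_mask] <- col_means[col(X)[na_mask]]
  for (step in seq_len(max_steps)) {
    old <- est
    for (j in which(colSums(na_mask) > 0)) {
      miss <- na_mask[, j]
      others <- setdiff(seq_len(ncol(X)), j)
      ## Rank all other columns using the current estimate
      r <- abs(cor(old[!miss, j], old[!miss, others, drop = FALSE]))
      nb <- others[order(r, decreasing = TRUE)[seq_len(min(k, length(others)))]]
      ## Refit on rows where column j is observed, predict the missing rows
      beta <- qr.solve(old[!miss, nb, drop = FALSE], old[!miss, j])
      est[miss, j] <- old[miss, nb, drop = FALSE] %*% beta
    }
    ## Assumed stopping rule: relative squared change of the imputed cells
    denom <- max(sum(old[na_mask]^2), .Machine$double.eps)
    change <- sum((est[na_mask] - old[na_mask])^2) / denom
    if (change < threshold) break
  }
  est
}

## Toy data with missing values spread over several columns
set.seed(42)
Y <- matrix(rnorm(60), nrow = 15, ncol = 4)
Y[, 4] <- Y[, 1] + Y[, 2]
Y[2, 1] <- NA; Y[5, 4] <- NA; Y[9, 2] <- NA
est <- lls_impute_iterative(Y, k = 2)
```

     Observed entries are never modified; only the cells that were
     'NA' in the input are updated from one step to the next.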

     *Complexity:* In each step, the generalized inverse of a 'miss' x
     'k' matrix is calculated, where 'miss' is the number of missing
     values in variable j and 'k' is the number of neighbours. This
     may be slow for large values of k and/or many missing values.
     See also help("ginv").
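
     To make the cost concrete, a generalized (Moore-Penrose) inverse
     can be sketched in base R via the singular value decomposition.
     The helper 'pseudo_inverse' below is hypothetical and only
     illustrates the kind of computation involved; it is not claimed
     to be what the package calls internally.

```r
## Hypothetical helper: Moore-Penrose pseudoinverse via the SVD
pseudo_inverse <- function(A, tol = sqrt(.Machine$double.eps)) {
  s <- svd(A)
  ## Drop singular values that are numerically zero
  keep <- s$d > tol * s$d[1]
  ## V %*% diag(1 / d) %*% t(U), restricted to the kept components
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

## Example: a 3 x 5 matrix (think of 3 missing values and k = 5 neighbours)
set.seed(7)
A <- matrix(rnorm(15), nrow = 3, ncol = 5)
P <- pseudo_inverse(A)
## Penrose condition: A %*% P %*% A reproduces A
check <- all.equal(A %*% P %*% A, A)
```

     The SVD of an m x n matrix costs on the order of m * n * min(m, n)
     operations, so the work per step grows with both the number of
     missing values and k.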

_V_a_l_u_e:

  nniRes: Standard nni (nearest neighbour imputation) result object of
          this package. See 'nniRes' for details.

_A_u_t_h_o_r(_s):

     Wolfram Stacklies 
      MPG/CAS Partner Institute for Computational Biology, Shanghai,
     P.R. China 
      wolfram.stacklies@gmail.com 

_R_e_f_e_r_e_n_c_e_s:

     Kim, H. and Golub, G.H. and Park, H. - Missing value estimation
     for DNA microarray gene expression data: local least squares
     imputation. _Bioinformatics, 2005; 21(2):187-198._

     Troyanskaya O. and Cantor M. and Sherlock G. and Brown P. and
     Hastie T. and Tibshirani R. and Botstein D. and Altman RB. -
     Missing value estimation methods for DNA microarrays.
     _Bioinformatics. 2001 Jun;17(6):520-525._

_S_e_e _A_l_s_o:

     'pca', 'nniRes', 'nni'.

_E_x_a_m_p_l_e_s:

     ## Load the package and a sample metabolite dataset (metaboliteData)
     ## in which 5% of the data are already missing
     library(pcaMethods)
     data(metaboliteData)

     ## Perform llsImpute using k = 10
     ## Set allVariables TRUE because there are very few complete variables
     result <- llsImpute(metaboliteData, k = 10, correlation = "pearson", allVariables = TRUE)

     ## Get the estimated complete observations
     cObs <- result@completeObs

