crlmm                 package:crlmm                 R Documentation

_G_e_n_o_t_y_p_e _o_l_i_g_o_n_u_c_l_e_o_t_i_d_e _a_r_r_a_y_s _w_i_t_h _C_R_L_M_M

_D_e_s_c_r_i_p_t_i_o_n:

     This is a faster and more efficient implementation of the CRLMM
     algorithm, designed specifically for Affymetrix SNP 5 and 6 arrays
     (to be extended soon to other platforms).

_U_s_a_g_e:

     crlmm(filenames, row.names=TRUE, col.names=TRUE,
           probs=c(1/3, 1/3, 1/3), DF=6, SNRMin=5,
           gender=NULL, save.it=FALSE, load.it=FALSE,
           intensityFile, mixtureSampleSize=10^5,
           eps=0.1, verbose=TRUE, cdfName, sns, recallMin=10,
           recallRegMin=1000, returnParams=FALSE, badSNP=0.7)

_A_r_g_u_m_e_n_t_s:

filenames: 'character' vector with CEL files to be genotyped.

row.names: 'logical'. Use SNP names as row names?

col.names: 'logical'. Use sample names as column names?

   probs: 'numeric' vector with priors for AA, AB and BB.

      DF: 'integer' with number of degrees of freedom to use with
          t-distribution.

  SNRMin: 'numeric' scalar defining the minimum SNR used to filter out
          samples.

  gender: 'integer' vector, with the same length as 'filenames',
          defining the sex of each sample (1 - male; 2 - female).

 save.it: 'logical'. Save preprocessed data?

 load.it: 'logical'. Load preprocessed data to speed up analysis?

intensityFile: 'character' with the name of the file to which
          preprocessed data are saved, or from which they are loaded.

mixtureSampleSize: Number of SNPs to be used with the mixture model.

     eps: Minimum change for mixture model.

 verbose: 'logical'. Print progress messages?

 cdfName: 'character' defining the CDF name to use ('GenomeWideSnp5'
          or 'GenomeWideSnp6').

     sns: 'character' vector with sample names to be used.

recallMin: Minimum number of samples for recalibration.

recallRegMin: Minimum number of SNPs for regression.

returnParams: 'logical'. Return recalibrated parameters.

  badSNP: 'numeric'. Threshold used to flag a SNP as bad (affects
          batchQC).
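
     The 'gender' and 'probs' arguments can be prepared and checked
     before calling 'crlmm'. The following is a minimal sketch in base
     R. 'encodeGender' is a hypothetical helper, not part of the
     package, the file names are made up, and the requirement that the
     priors sum to one is an assumption rather than something stated on
     this page.

```r
## Hypothetical helper (not part of crlmm): encode sex labels as the
## integer codes described for 'gender' (1 - male; 2 - female).
encodeGender <- function(labels, filenames) {
  ## one label per CEL file, as 'gender' must match 'filenames' in length
  stopifnot(length(labels) == length(filenames))
  codes <- c(male = 1L, female = 2L)[tolower(labels)]
  if (anyNA(codes)) stop("labels must be 'male' or 'female'")
  unname(codes)
}

## Made-up file names, for illustration only
cels <- c("sample1.CEL", "sample2.CEL", "sample3.CEL")
gender <- encodeGender(c("Male", "female", "male"), cels)

## Priors for AA, AB and BB: assumed here to be three non-negative
## values that sum to one
probs <- c(1/3, 1/3, 1/3)
stopifnot(length(probs) == 3, all(probs >= 0),
          isTRUE(all.equal(sum(probs), 1)))
```

     The resulting 'gender' and 'probs' objects could then be passed to
     the arguments of the same names.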

_V_a_l_u_e:

     A 'SnpSet' object. 

   calls: Genotype calls (1 - AA, 2 - AB, 3 - BB)

   confs: Confidence scores 'round(-1000*log2(1-p))'

   SNPQC: SNP Quality Scores

 batchQC: Batch Quality Score

  params: Recalibrated parameters
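
     The integer encodings listed above for 'calls' and 'confs' can be
     translated back into genotype labels and probabilities. The
     following is a minimal sketch in base R. The helper names are
     hypothetical and not part of the package, and 'p' is assumed to be
     the probability that a call is correct.

```r
## Hypothetical helpers (not part of crlmm)

## Integer genotype codes -> labels, following 1 - AA, 2 - AB, 3 - BB
decodeCalls <- function(calls) {
  factor(calls, levels = 1:3, labels = c("AA", "AB", "BB"))
}

## Probability -> integer confidence score: round(-1000*log2(1-p))
probToConf <- function(p) round(-1000 * log2(1 - p))

## Inverse mapping; approximate, because the stored score was rounded
confToProb <- function(conf) 1 - 2^(-conf / 1000)

decodeCalls(c(1, 3, 2, 2))
probToConf(c(0.5, 0.75))   # 1000 2000
confToProb(1000)           # 0.5
```

     On this scale each additional 1000 points halves '1 - p', so
     larger scores indicate more confident calls.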

_R_e_f_e_r_e_n_c_e_s:

     Carvalho B, Bengtsson H, Speed TP, Irizarry RA. Exploration,
     normalization, and genotype calls of high-density oligonucleotide
     SNP array data. Biostatistics. 2007 Apr;8(2):485-99. Epub 2006 Dec
     22. PMID: 17189563.

     Carvalho B, Louis TA, Irizarry RA. Describing Uncertainty in
     Genome-wide Genotype Calling. (in prep)

_E_x_a_m_p_l_e_s:

     ## this can be slow
     ## '&&' short-circuits and is the idiomatic operator for a scalar condition
     if (require(genomewidesnp5Crlmm) && require(hapmapsnp5)){
       path <- system.file("celFiles", package="hapmapsnp5")

       ## the filenames with full path...
       ## very useful when genotyping samples not in the working directory
       cels <- list.celfiles(path, full.names=TRUE)
       (crlmmOutput <- crlmm(cels))
     }

