multiscan             package:multiscan             R Documentation

_C_o_m_b_i_n_i_n_g _m_u_l_t_i_p_l_e _l_a_s_e_r _s_c_a_n_s _o_f _m_i_c_r_o_a_r_r_a_y _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     Estimates gene expressions from multiple laser scans of
     microarrays using a non-linear functional regression model with
     additive plus multiplicative errors.

_U_s_a_g_e:

     multiscan(data, initial = NULL, na.rm = TRUE, verbose = FALSE, control = list())

_A_r_g_u_m_e_n_t_s:

    data: A numeric matrix or data frame containing the intensity data 
          of a single microarray scanned at multiple (two or more)
          scanner settings. For dual channel arrays, 'multiscan' should
          be run on each channel of data separately. The number of rows
          ('n') is equal to the number of spots/probes on the array, 
          and the number of columns ('m') equals the number of scans.
          Columns are arranged in order of scanner sensitivity before
          the model is fitted. Replicated probes on the array are
          treated as individual spots.

 initial: A vector of length 'm+2' to be used as initial values for 
          the scanning effects $(\beta_2, \ldots, \beta_m)$ and scale
          parameters $(\sigma_1, \sigma_2, \nu)$. If it is 'NULL'
          (default), the initial values are determined from the
          'data'.

   na.rm: Logical. Should missing values be removed? Defaults to
          'TRUE'.

 verbose: Logical. If 'TRUE', some intermediate results are printed as
          the iteration proceeds.

 control: A list of control parameters. See Details.

_D_e_t_a_i_l_s:

     The function implements the method of Khondoker _et al._ (2006)
     for combining multiple laser scans of microarrays. It is
     computationally slow and memory-intensive because the numerical
     optimization of the likelihood function uses nested iteration
     loops and involves a large number ($n+m+2$) of parameters. The
     optimization uses an alternating algorithm with the Nelder-Mead
     simplex method (Nelder and Mead, 1965) in the inner loops. To
     implement the Nelder-Mead simplex method, 'multiscan' directly
     calls the C function 'nmmin', the internal code used by the
     general-purpose optimization tool 'optim'. For large data sets
     with many tens of thousands of probes, consider first fitting the
     model to a random subset (e.g. 10,000 rows) of the data matrix,
     and then using the estimated scanning effects and scale
     parameters as initial values for fitting the model to the full
     data set.
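
     The two-stage strategy described above can be sketched as
     follows. This is a minimal illustration, not part of the package:
     'make_initial' is a hypothetical helper, the subset size of 2000
     is chosen only to keep the example small, and the sketch assumes
     that the subset and the full data give the same column ordering.

```r
## Hypothetical helper (not part of multiscan): build the 'initial' vector
## (beta_2, ..., beta_m, sigma_1, sigma_2, nu) from a previous fit.
make_initial <- function(fit) {
  c(fit$beta[-1], fit$scale)   # drop beta_1, which is fixed at 1
}

## Guarded so the sketch is harmless when the package is not installed.
if (requireNamespace("multiscan", quietly = TRUE)) {
  library(multiscan)
  data(murine)

  set.seed(1)                          # reproducible subset
  n.sub <- min(2000, nrow(murine))     # illustrative subset size (assumption)
  rows  <- sample(nrow(murine), n.sub)

  ## stage 1: fit the model to the random subset of rows
  fit.sub <- multiscan(murine[rows, ])

  ## stage 2: fit the full data, starting from the stage 1 estimates
  fit.full <- multiscan(murine, initial = make_initial(fit.sub))
}
```

     The helper returns a vector of length 'm+2', as required by the
     'initial' argument.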

     The 'control' is a list of arguments. The users can change/supply
     any of the following components:

     '_t_r_a_c_e' Indicator ('0' or '1') of tracing information of 
          Nelder-Mead algorithm. If '1', tracing information on the
          progress of the  optimization is produced. Because
          Nelder-Mead may be callled thousands of times  during the
          estimation process, setting 'trace = 1' will print too much 
          information very rapidly, which may not be useful. Defaults
          to '0'.

     '_g_m_a_x_i_t' The maximum number of global iterations. Defaults to
          '150'.  

     '_m_a_x_i_t' The maximum number of Nelder-Mead iterations.  Defaults to
          '5000'.  

     '_r_e_l_t_o_l' Relative convergence tolerance of Nelder-Mead.   The
          algorithm stops if it is unable to reduce the value by a
          factor of 'reltol * (abs(val) + reltol)' at a step.  Defaults
          to '1e-5'.

     '_g_l_o_b_a_l_t_o_l' Convergence tolerance of the outer (alternating)
          iteration. The estimation process converges if the gain in
          loglikelihood from one complete cycle of the outer iteration
          is less than 'globaltol'. Defaults to '1e-10'. 

     '_a_l_p_h_a', '_b_e_t_a', '_g_a_m_m_a' Scaling parameters for the Nelder-Mead
          method. 'alpha' is the reflection factor (default 1.0),
          'beta' the contraction factor (0.5) and 'gamma' the expansion
          factor (2.0).
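
     As a sketch of how the components above might be supplied, the
     following passes a partial 'control' list. The component names
     are those listed above; the values are illustrative assumptions,
     not recommendations.

```r
## Control settings: only the components to be changed need to be given.
ctrl <- list(
  trace     = 0,      # keep Nelder-Mead tracing off
  gmaxit    = 300,    # allow more global iterations than the default 150
  maxit     = 5000,   # Nelder-Mead iteration limit (the default)
  globaltol = 1e-8    # looser outer tolerance than the default 1e-10
)

## Guarded so the sketch is harmless when the package is not installed.
if (requireNamespace("multiscan", quietly = TRUE)) {
  library(multiscan)
  data(murine)
  fit.ctrl <- multiscan(murine, control = ctrl)
  fit.ctrl$outerit    # number of global iterations actually completed
}
```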

_V_a_l_u_e:

     Returns an object of class 'multiscan' with components 

    call: The call of the 'multiscan' function.

    beta: A vector of length 'm' containing the maximum likelihood
          estimates of the scanning effects, with the first component
          fixed at '1'.

   scale: A vector of length '3' containing the maximum likelihood
          estimates of the scale parameters $\sigma_1$, $\sigma_2$
          and $\nu$.

      mu: A vector of length 'n' containing the estimated gene
          expressions.

    data: A matrix of the input data with columns rearranged in order
          of scanner's sensitivity.

  fitted: A matrix of the fitted model on the 'data'.

   sdres: A matrix of the standardised residuals.

 outerit: Number of  global iterations completed.

gconv, conv, convmu: Integer convergence codes. 


          '_g_c_o_n_v' Indicator of global convergence. '0' indicates
               successful convergence, '1' indicates premature
               termination.

          '_c_o_n_v' Convergence codes for the Nelder-Mead simplex method
               in the last global iteration while  updating scanning
               effects and scale parameters. '0' for successful
               convergence, '1' indicates that the iteration limit
               'maxit' had been reached, '10'  indicates degeneracy of
               the Nelder-Mead simplex method.

          '_c_o_n_v_m_u' Convergence codes for the Nelder-Mead simplex method
               in the last global iteration while  updating the gene
               expression parameters. This is an integer vector of
               length 'n' where each component takes the value '0',
               '1', or '10' depending on whether the  Nelder-Mead
               simplex method successfully converged, reached iteration
               limit 'maxit' or produced degeneracy respectively while
               updating the corresponding gene expression parameter.


   loglf: Value of the loglikelihood function at convergence
          ('gconv=0'). 'NA' if not converged ('gconv=1').
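
     The convergence components can be inspected together after a
     fit. The sketch below uses a hypothetical helper,
     'check_convergence', which is not part of the package, and a
     mock list with made-up values standing in for a fitted object.

```r
## Hypothetical helper (not part of multiscan): summarise the convergence
## codes stored in a fitted object.
check_convergence <- function(fit) {
  list(
    global.ok     = fit$gconv == 0,         # outer iteration converged
    nm.ok         = fit$conv == 0,          # last beta/scale update converged
    mu.maxit      = sum(fit$convmu == 1),   # probes that hit 'maxit'
    mu.degenerate = sum(fit$convmu == 10),  # probes with a degenerate simplex
    loglf         = fit$loglf               # NA when gconv is 1
  )
}

## Mock object with made-up values, for illustration only.
mock <- list(gconv = 0, conv = 0, convmu = c(0, 0, 1, 10, 0), loglf = -1234.5)
str(check_convergence(mock))
```

     With a real fit, the same call would be 'check_convergence(fit)'.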

_R_e_f_e_r_e_n_c_e_s:

     Khondoker, M. R., Glasbey, C. A. and Worton, B. J. (2006).
     Statistical estimation of gene expression using multiple laser
     scans of microarrays. _Bioinformatics_ *22*, 215-219.

     Nelder, J. A. and Mead, R. (1965). A simplex method for function
     minimization. _The Computer Journal_ *7*, 308-313.

_S_e_e _A_l_s_o:

     A web interface, created by David Nutter of Biomathematics &
     Statistics Scotland (BioSS) and based on the original Fortran code
     written by Khondoker _et al._ (2006), is available at
     http://www.bioss.ac.uk/ktshowcase/create.cgi. Although it uses the
     same algorithm, results from the web interface may not be exactly
     identical to those from 'multiscan', because it uses a different
     implementation of the Nelder-Mead simplex method (a non-free IMSL
     routine).

_E_x_a_m_p_l_e_s:

     ## load the multiscan package
     library(multiscan)

     ## load the murine data set included in the multiscan package
     data(murine)
     murine[1:10,] ## see the first ten rows of the data

     ## fit the model on murine data with default options
     fit<-multiscan(murine)
     fit

     ## plot the fitted model
     plot(fit)

     ## get the estimated gene expressions
     gene.exprs<-fit$mu

     ## print intermediate results as the iteration proceeds
     fit1<-multiscan(murine, verbose = TRUE)
     fit1

