| alignTargetedRuns {DIAlignR} | R Documentation |
This function expects osw and mzml directories at dataPath. It first reads the osw files and fetches chromatogram indices for each analyte. It then aligns the XICs of each analyte to its reference XICs. The best peak, that is the one with the lowest m-score, near the aligned retention time is picked for quantification.
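The peak-picking step described above can be sketched in a few lines of R. This is only an illustration, not the package's internal code: the `features` table, the `alignedRT` value, the `rtWindow` half-width and the helper `pickBestPeak` are all hypothetical names and values chosen for the example.

```r
# Hypothetical candidate peaks for one analyte in one run (illustrative values only).
features <- data.frame(
  RT     = c(5220.8, 5238.4, 5301.1),  # retention time of each candidate peak, in seconds
  mscore = c(0.0300, 0.0004, 0.0001)   # m-score of each candidate peak
)
alignedRT <- 5240.0  # assumed retention time mapped from the reference run
rtWindow  <- 30      # assumed half-width (seconds) of the search window

# Hypothetical helper: keep peaks near the aligned RT, then take the lowest m-score.
pickBestPeak <- function(features, alignedRT, rtWindow) {
  nearby <- features[abs(features$RT - alignedRT) <= rtWindow, , drop = FALSE]
  if (nrow(nearby) == 0) return(NULL)  # no candidate close enough to the aligned RT
  nearby[which.min(nearby$mscore), , drop = FALSE]
}

best <- pickBestPeak(features, alignedRT, rtWindow)
```

Note that the peak at 5301.1 s has the lowest m-score overall but lies outside the assumed window, so the peak at 5238.4 s is selected.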
alignTargetedRuns(dataPath, alignType = "hybrid", analyteInGroupLabel = FALSE, oswMerged = TRUE, runs = NULL, analytes = NULL, nameCutPattern = "(.*)(/)(.*)", maxFdrQuery = 0.05, maxFdrLoess = 0.01, analyteFDR = 0.01, spanvalue = 0.1, runType = "DIA_Proteomics", normalization = "mean", simMeasure = "dotProductMasked", XICfilter = "sgolay", SgolayFiltOrd = 4, SgolayFiltLen = 9, goFactor = 0.125, geFactor = 40, cosAngleThresh = 0.3, OverlapAlignment = TRUE, dotProdThresh = 0.96, gapQuantile = 0.5, hardConstrain = FALSE, samples4gradient = 100, samplingTime = 3.4, RSEdistFactor = 3.5, saveFiles = FALSE)
dataPath |
(char) Path to the directory that contains the mzml and osw directories. |
alignType |
Available alignment methods are "global", "local" and "hybrid". |
analyteInGroupLabel |
(logical) TRUE for getting analytes as PRECURSOR.GROUP_LABEL from osw file. |
oswMerged |
(logical) TRUE for experiment-wide FDR and FALSE for run-specific FDR by pyprophet. |
runs |
(vector of strings) Names of mzML files without extension. |
analytes |
(vector of strings) transition_group_ids for which features are to be extracted. analyteInGroupLabel must be set according to the pattern used here. |
nameCutPattern |
(string) Regular expression to fetch the mzML file name from the RUN.FILENAME column of osw files. |
maxFdrQuery |
(numeric) A numeric value between 0 and 1. It is used to filter features from the osw file: only features with SCORE_MS2.QVALUE less than this value are kept. |
maxFdrLoess |
(numeric) A numeric value between 0 and 1. Features should have m-score lower than this value for participation in LOESS fit. |
analyteFDR |
(numeric) Only analytes that have an m-score less than this value will be included in the output. |
spanvalue |
(numeric) Spanvalue for LOESS fit. For targeted proteomics 0.1 could be used. |
runType |
(char) This must be one of the strings "DIA_proteomics", "DIA_Metabolomics". |
normalization |
(character) Must be selected from "mean", "l2". |
simMeasure |
(string) Must be selected from dotProduct, cosineAngle, cosine2Angle, dotProductMasked, euclideanDist, covariance and correlation. |
XICfilter |
(string) This must be one of the strings "sgolay", "none". |
SgolayFiltOrd |
(integer) It defines the polynomial order of the filter. |
SgolayFiltLen |
(integer) Must be an odd number. It defines the length of filter. |
goFactor |
(numeric) Penalty for introducing first gap in alignment. This value is multiplied by base gap-penalty. |
geFactor |
(numeric) Penalty for introducing subsequent gaps in alignment. This value is multiplied by base gap-penalty. |
cosAngleThresh |
(numeric) In simMeasure = dotProductMasked mode, angular similarity should be higher than cosAngleThresh; otherwise similarity is forced to zero. |
OverlapAlignment |
(logical) An input for alignment with free end-gaps. FALSE: global alignment, TRUE: overlap alignment. |
dotProdThresh |
(numeric) In simMeasure = dotProductMasked mode, values in the similarity matrix higher than the dotProdThresh quantile are checked for angular similarity. |
gapQuantile |
(numeric) Must be between 0 and 1. This is used to calculate base gap-penalty from similarity distribution. |
hardConstrain |
(logical) If FALSE, indices farther than the noBeef distance are filled with their distance from the linear fit line. |
samples4gradient |
(numeric) This parameter modulates penalization of masked indices. |
samplingTime |
(numeric) Time difference between two data-points in each chromatogram. For hybrid and local alignment, samples are assumed to be equally time-spaced. |
RSEdistFactor |
(numeric) This defines how much distance in the unit of rse remains a noBeef zone. |
saveFiles |
(logical) Must be selected from light, medium and heavy. |
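The relationship between gapQuantile, goFactor and geFactor described in the arguments above can be sketched as follows. This is a minimal illustration under stated assumptions: the similarity matrix is random, the variable names `simMatrix`, `baseGapPenalty`, `gapOpen` and `gapExtend` are hypothetical, and the way the package derives the base gap-penalty internally may differ in detail.

```r
set.seed(1)
# Hypothetical similarity matrix between two XIC groups (random values for illustration).
simMatrix <- matrix(runif(25), nrow = 5)

gapQuantile <- 0.5    # default from the usage above
goFactor    <- 0.125  # default from the usage above
geFactor    <- 40     # default from the usage above

# Assumed: the base gap-penalty is the gapQuantile quantile of the similarity values.
baseGapPenalty <- unname(quantile(simMatrix, probs = gapQuantile))

# The penalties for the first gap and for subsequent gaps scale the base penalty.
gapOpen   <- goFactor * baseGapPenalty
gapExtend <- geFactor * baseGapPenalty
```

With gapQuantile = 0.5 the base penalty in this sketch is simply the median similarity, and the two factors only rescale it.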
Two tables of intensity and retention times for every analyte in each run.
Shubham Gupta, shubh.gupta@mail.utoronto.ca
ORCID: 0000-0003-3500-8152
License: (c) Author (2019) + GPL-3 Date: 2019-12-14
Gupta S, Ahadi S, Zhou W, Röst H. "DIAlignR Provides Precise Retention Time Alignment Across Distant Runs in DIA and Targeted Proteomics." Mol Cell Proteomics. 2019 Apr;18(4):806-817. doi: https://doi.org/10.1074/mcp.TIR118.001132 Epub 2019 Jan 31.
getRunNames, getOswFiles, getAnalytesName, getMappedRT
dataPath <- system.file("extdata", package = "DIAlignR")
runs <- c("hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt",
          "hroest_K120809_Strep10%PlasmaBiolRepl2_R04_SW_filt")
# Analyte given in the form used when analyteInGroupLabel = FALSE
intensityTbl <- alignTargetedRuns(dataPath, runs = runs, analytes = c("QFNNTDIVLLEDFQK_3"),
                                  analyteInGroupLabel = FALSE)
# Analyte given as PRECURSOR.GROUP_LABEL, so analyteInGroupLabel = TRUE
intensityTbl <- alignTargetedRuns(dataPath, runs = runs, analytes = c("14299_QFNNTDIVLLEDFQK/3"),
                                  analyteInGroupLabel = TRUE)
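To see how the default nameCutPattern relates a RUN.FILENAME entry to the names passed in runs, the following sketch applies the pattern with base R. It assumes that the third capture group (the text after the last "/") is the part that is kept and that the extension is then stripped; the directory in `filename` and the variables `baseName` and `runName` are hypothetical.

```r
nameCutPattern <- "(.*)(/)(.*)"
# Hypothetical RUN.FILENAME entry; the directory is made up for illustration.
filename <- "data/raw/hroest_K120809_Strep0%PlasmaBiolRepl2_R04_SW_filt.mzML.gz"

# Assumed: keep the third capture group, i.e. everything after the last "/".
baseName <- sub(nameCutPattern, "\\3", filename)

# Assumed: drop everything from the first "." onward to get the name without extension.
runName <- sub("\\..*$", "", baseName)
```

Because the first group is greedy, the pattern splits at the last "/" in the path, so nested directories do not affect the extracted name.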