| combinePeaks {Spectra} | R Documentation |
combinePeaks aggregates provided peak matrices into a single peak matrix.
Peaks are grouped by their m/z values with the group() function from the
MsCoreUtils package. In brief, all peaks in all provided spectra are first
ordered by their m/z and consecutively grouped into one group if the
(pairwise) difference between them is smaller than the limit defined by the
parameters tolerance and ppm (see group() for grouping details and examples).
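The consecutive grouping described above can be sketched in base R. This is a minimal illustration, not the implementation of MsCoreUtils::group(): the helper group_sketch is hypothetical, and both the boundary handling (a gap exactly equal to the limit keeps peaks together) and the use of the preceding m/z value for the ppm term are assumptions.

```r
## Hypothetical helper (NOT MsCoreUtils::group()): walk along the sorted
## values and start a new group whenever the gap to the previous value
## exceeds tolerance + ppm of that previous value.
group_sketch <- function(x, tolerance = 0, ppm = 0) {
    stopifnot(!is.unsorted(x))
    if (length(x) < 2L)
        return(seq_along(x))
    max_gap <- tolerance + x[-length(x)] * ppm / 1e6
    cumsum(c(TRUE, diff(x) > max_gap))
}

## m/z values pooled from all spectra and ordered
mz <- sort(c(100.00, 100.02, 100.30, 250.00, 250.01))
grp <- group_sketch(mz, tolerance = 0.05)
grp
## 1 1 2 3 3
```

Because each value is compared with its direct neighbour, a chain of closely spaced peaks can end up in a single group even if its first and last member differ by more than the tolerance.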
The m/z and intensity values for the resulting peak matrix are calculated
using the mzFun and intensityFun on the grouped m/z and intensity values.
Note that only the grouped m/z and intensity values are used in the
aggregation functions (mzFun and intensityFun) but not the number of
spectra.
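A small base R illustration of this note, using made-up intensities: for a peak found in only two of three input spectra, the aggregation function sees two values, so the mean is taken over two values rather than over the number of spectra.

```r
## Intensities of one peak group; the peak was detected in 2 of 3 spectra.
grouped_int <- c(100, 200)
n_spectra <- 3

## What an aggregation function such as mean() receives and returns
mean(grouped_int)
## 150

## A mean over all input spectra would instead be
sum(grouped_int) / n_spectra
## 100
```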
The function also supports different strategies for combining peaks, which
can be specified with the peaks parameter:
peaks = "union" (default): report all peaks from all input spectra.
peaks = "intersect": keep only peaks in the resulting peak matrix that
are present in >= minProp proportion of input spectra. This generates a
consensus or representative spectrum from a set of e.g.
fragment spectra measured from the same precursor ion.
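The difference between the two strategies can be sketched in base R. The vectors below are illustrative assumptions (a group index per pooled peak and the index of the spectrum it came from), not objects used by the function itself.

```r
## Pooled peaks from three spectra: group index from the m/z grouping step
## and the index of the spectrum each peak originates from.
peak_group <- c(1, 1, 1, 2, 2, 3)
spectrum   <- c(1, 2, 3, 1, 3, 2)
n_spectra  <- 3
min_prop   <- 0.5

## Proportion of input spectra contributing at least one peak to each group
prop <- vapply(split(spectrum, peak_group),
               function(s) length(unique(s)) / n_spectra,
               numeric(1))

keep_union     <- names(prop)                    # "union": every group
keep_intersect <- names(prop)[prop >= min_prop]  # "intersect": groups 1 and 2
```

Group 3 is present in only one of three spectra (proportion 1/3) and is therefore dropped with a minimal proportion of 0.5.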
As a special case, the result can be restricted to peak groups that contain
a peak from one particular input spectrum, which is selected with the
parameter main. For example, with main = 2 only (grouped) peaks that have a
peak in the second input matrix are returned.
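As a rough base R sketch of this filter, with illustrative vectors that are assumptions rather than internals of the function:

```r
## Group index per pooled peak and the spectrum each peak came from
peak_group <- c(1, 1, 2, 2, 3)
spectrum   <- c(1, 2, 1, 3, 2)
main <- 2

## Peak groups containing at least one peak from the main spectrum
keep <- unique(peak_group[spectrum == main])
keep
## 1 3
```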
Setting timeDomain to TRUE causes grouping to be performed on the square
root of the m/z values (assuming a TOF instrument was used to create the
data).
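The effect of the square root transformation can be seen with a few made-up m/z values: two peak pairs that are equally far apart on the m/z scale are no longer equally far apart on the sqrt(m/z) scale, so a given grouping limit behaves differently at low and high m/z.

```r
## Two peak pairs, each 0.02 apart in m/z, at low and at high m/z
mz <- c(100.00, 100.02, 900.00, 900.02)

## Spacing within each pair on the m/z scale
diff(mz)[c(1, 3)]

## Spacing within each pair on the sqrt(m/z) scale: the pair at high m/z
## is about three times closer (roughly 0.0010 versus 0.00033)
d_sqrt <- diff(sqrt(mz))[c(1, 3)]
d_sqrt
```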
combinePeaks(
  x,
  intensityFun = base::mean,
  mzFun = base::mean,
  weighted = FALSE,
  tolerance = 0,
  ppm = 0,
  timeDomain = FALSE,
  peaks = c("union", "intersect"),
  main = integer(),
  minProp = 0.5,
  ...
)
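x | list of peak matrices to be combined. |
intensityFun | function used to aggregate the intensity values of each peak group (default: base::mean). |
mzFun | function used to aggregate the m/z values of each peak group (default: base::mean). |
weighted | |
tolerance | maximal accepted absolute difference in m/z for peaks to be grouped (default: 0). |
ppm | maximal accepted m/z-relative difference, in parts per million, for peaks to be grouped (default: 0). |
timeDomain | whether grouping is performed on the square root of the m/z values (default: FALSE). |
peaks | strategy for combining peaks, either "union" (default) or "intersect". |
main | optional |
minProp | for peaks = "intersect": minimal proportion of input spectra in which a peak has to be present (default: 0.5). |
... | additional parameters to the |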
For general merging of spectra, the tolerance and/or ppm should be
manually specified based on the precision of the MS instrument. Peaks
from spectra with a difference in their m/z being smaller than tolerance
or smaller than ppm of their m/z are grouped into the same final peak.
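To help choose ppm, it can be useful to convert it into the absolute m/z window it corresponds to at different masses. A minimal base R sketch (the m/z values are arbitrary examples):

```r
## Absolute m/z window corresponding to 10 ppm at different m/z values
mz <- c(100, 500, 1000)
ppm <- 10
window <- mz * ppm / 1e6
window
## 0.001 0.005 0.010
```

A fixed tolerance gives the same window at every m/z, whereas a ppm-based window widens proportionally with m/z.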
Some details for the combination of consecutive spectra of an LC-MS run:
The m/z values of the same ion in consecutive scans (spectra) of an LC-MS run
will not be identical. Assuming that this random variation is much smaller
than the resolution of the MS instrument (i.e. the difference between
m/z values within each single spectrum), m/z value groups are defined
across the spectra and those containing m/z values of the main spectrum
are retained.
Intensities and m/z values falling within each of these m/z groups are
aggregated using the intensityFun and mzFun, respectively.
It is highly likely that all QTOF profile data is collected with a timing
circuit that records data points at regular time intervals, which are later
converted into m/z values based on the relationship t = k * sqrt(m/z). The
m/z scale is thus non-linear and the m/z scattering (which is in fact caused
by small variations in the time circuit) will differ between the lower
and upper m/z range. m/z-intensity pairs from consecutive scans can
therefore be grouped on the square root of the m/z values by setting
timeDomain = TRUE. With timeDomain = FALSE (the default), the actual m/z
values are used.
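The dependence of the m/z scatter on m/z follows directly from the relationship t = k * sqrt(m/z) given above. As a short derivation, assuming a small, m/z-independent timing error \delta t (the symbol \delta t is introduced here for illustration only):

```latex
t = k\sqrt{m/z}
\;\Rightarrow\;
m/z = \frac{t^{2}}{k^{2}}
\;\Rightarrow\;
\frac{\mathrm{d}(m/z)}{\mathrm{d}t} = \frac{2t}{k^{2}} = \frac{2\sqrt{m/z}}{k}

\delta(m/z) \approx \frac{2\sqrt{m/z}}{k}\,\delta t,
\qquad
\delta\sqrt{m/z} = \frac{\delta t}{k}
```

Under this assumption the scatter on the m/z scale grows with sqrt(m/z), while the scatter on the sqrt(m/z) scale is constant, which is why a single grouping limit is better suited to the square root scale for such data.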
A peak matrix with m/z and intensity values representing the aggregated
values across the provided peak matrices.
Johannes Rainer
set.seed(123)
mzs <- seq(1, 20, 0.1)
ints1 <- abs(rnorm(length(mzs), 10))
ints1[11:20] <- c(15, 30, 90, 200, 500, 300, 100, 70, 40, 20) # add peak
ints2 <- abs(rnorm(length(mzs), 10))
ints2[11:20] <- c(15, 30, 60, 120, 300, 200, 90, 60, 30, 23)
ints3 <- abs(rnorm(length(mzs), 10))
ints3[11:20] <- c(13, 20, 50, 100, 200, 100, 80, 40, 30, 20)
## Create the peaks matrices
p1 <- cbind(mz = mzs + rnorm(length(mzs), sd = 0.01),
            intensity = ints1)
p2 <- cbind(mz = mzs + rnorm(length(mzs), sd = 0.01),
            intensity = ints2)
p3 <- cbind(mz = mzs + rnorm(length(mzs), sd = 0.009),
            intensity = ints3)
## Combine the spectra. With `tolerance = 0` and `ppm = 0` only peaks with
## **identical** m/z are combined. The result will be a single spectrum
## containing the *union* of mass peaks from the individual input spectra.
p <- combinePeaks(list(p1, p2, p3))
## Plot the spectra before and after combining
par(mfrow = c(2, 1), mar = c(4.3, 4, 1, 1))
plot(p1[, 1], p1[, 2], xlim = range(mzs[5:25]), type = "h", col = "red")
points(p2[, 1], p2[, 2], type = "h", col = "green")
points(p3[, 1], p3[, 2], type = "h", col = "blue")
plot(p[, 1], p[, 2], xlim = range(mzs[5:25]), type = "h",
     col = "black")
## The peaks were not merged, because their m/z differs too much.
## Combine spectra with `tolerance = 0.05`. This will merge all triplets.
p <- combinePeaks(list(p1, p2, p3), tolerance = 0.05)
## Plot the spectra before and after combining
par(mfrow = c(2, 1), mar = c(4.3, 4, 1, 1))
plot(p1[, 1], p1[, 2], xlim = range(mzs[5:25]), type = "h", col = "red")
points(p2[, 1], p2[, 2], type = "h", col = "green")
points(p3[, 1], p3[, 2], type = "h", col = "blue")
plot(p[, 1], p[, 2], xlim = range(mzs[5:25]), type = "h",
     col = "black")
## With `intensityFun = max` the maximal intensity per peak is reported.
p <- combinePeaks(list(p1, p2, p3), tolerance = 0.05,
                  intensityFun = max)
## Create *consensus*/representative spectrum from a set of spectra
p1 <- cbind(mz = c(12, 45, 64, 70), intensity = c(10, 20, 30, 40))
p2 <- cbind(mz = c(17, 45.1, 63.9, 70.2), intensity = c(11, 21, 31, 41))
p3 <- cbind(mz = c(12.1, 44.9, 63), intensity = c(12, 22, 32))
## No mass peaks are identical, thus the consensus spectrum is empty
combinePeaks(list(p1, p2, p3), peaks = "intersect")
## Reducing the minProp to 0.2. The consensus spectrum will contain all
## peaks
combinePeaks(list(p1, p2, p3), peaks = "intersect", minProp = 0.2)
## With a tolerance of 0.1 mass peaks can be matched across spectra
combinePeaks(list(p1, p2, p3), peaks = "intersect", tolerance = 0.1)
## Report the minimal m/z and intensity
combinePeaks(list(p1, p2, p3), peaks = "intersect", tolerance = 0.1,
             intensityFun = min, mzFun = min)