API CHANGE: support peaksData to return data.frame or matrix #289
Comparing the impact of returning a `data.frame` instead of a `matrix`:
library(Spectra)
fl <- system.file("TripleTOF-SWATH", "PestMix1_DDA.mzML",
package = "msdata")
be <- backendInitialize(MsBackendMzR(), fl)
mem <- backendInitialize(MsBackendMemory(), spectraData(be))
mem2 <- backendInitialize(MsBackendMemory2(), spectraData(be))
class(mem@peaksData[[1L]])
[1] "matrix" "array"
class(mem2@peaksData[[1L]])
[1] "data.frame"
The Tests at the
Maybe a workable solution would be to allow the backend to return a `list` of `matrix` by default and a `list` of `data.frame` only if additional peak variables are requested. Thus, we would only run into the performance issues described above if the user has data with additional peak annotations and specifically requests them from the backend.
I'm currently developing in the jomain branch. First change:
that should make it possible to start working on ensuring
- `peaksData,MsBackendMemory` returns by default a `list` of `matrix`, or a `list` of `data.frame`s if peak variables other than `"mz"` and `"intensity"` are requested. Issue #289.
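The return-type rule in the note above can be sketched in base R. This is only an illustration under stated assumptions: `peaks_sketch` and its arguments are hypothetical names and not part of the Spectra API; the numeric peaks are assumed to be stored as a `list` of `matrix` and the extra peak variables as a parallel `list` of `data.frame`.

```r
# Hypothetical helper (not part of Spectra): return the requested peak
# variables for each spectrum, as matrix or data.frame.
peaks_sketch <- function(peaks, annotations,
                         columns = c("mz", "intensity")) {
    # Only the numeric core variables requested? Then a matrix is enough.
    numeric_only <- all(columns %in% c("mz", "intensity"))
    mapply(function(p, a) {
        if (numeric_only)
            p[, columns, drop = FALSE]
        else
            # Mixed types: combine into a data.frame to keep column types.
            cbind(as.data.frame(p), a)[, columns, drop = FALSE]
    }, peaks, annotations, SIMPLIFY = FALSE)
}

# Example data: two spectra with two peaks each.
peaks <- list(cbind(mz = c(13, 14.1), intensity = c(100, 300)),
              cbind(mz = c(45.1, 56), intensity = c(345, 234)))
annotations <- list(
    data.frame(ann = c("a", NA), stringsAsFactors = FALSE),
    data.frame(ann = c("e", "f"), stringsAsFactors = FALSE))

res_default <- peaks_sketch(peaks, annotations)
res_all <- peaks_sketch(peaks, annotations,
                        columns = c("mz", "intensity", "ann"))
class(res_default[[1L]])
class(res_all[[1L]])
```

With the default `columns` each list element stays a numeric `matrix`; a `data.frame` is only built when a non-numeric variable such as `"ann"` is requested, so the conversion cost is only paid in that case.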
The good news: this update already addresses the main issue/problem discussed in the last dev call: creating a `Spectra` with additional peak variables.
library(Spectra)
df <- data.frame(rtime = c(1.1, 1.2, 1.3, 1.4),
msLevel = 1L)
df$mz <- list(c(13, 14.1, 22, 23, 24, 49),
c(45.1, 56),
c(34.3, 134.4, 344, 443),
c(12.1, 31))
df$intensity <- list(c(100, 300, 30, 120, 12, 34),
c(345, 234),
c(123, 124, 145, 3),
c(122, 421))
#' add some arbitrary information for each peak to the data.frame
df$ann <- list(c("a", NA, "b", "c", "d", NA),
c("e", "f"),
c("g", "h", "i", NA),
c("j", "k"))
B <- Spectra(df, peaksVariables = c("mz", "intensity", "ann"))
peaksData(B)[[1L]]
       mz intensity
[1,] 13.0       100
[2,] 14.1       300
[3,] 22.0        30
[4,] 23.0       120
[5,] 24.0        12
[6,] 49.0        34
peaksData(B, columns = peaksVariables(B))[[1L]]
    mz intensity  ann
1 13.0       100    a
2 14.1       300 <NA>
3 22.0        30    b
4 23.0       120    c
5 24.0        12    d
6 49.0        34 <NA>
So, in the first case a `matrix` is returned, in the second a `data.frame`.
B2 <- filterMzValues(B, 23, tolerance = 1)
peaksData(B2, columns = c("mz", "intensity"))[[1L]]
     mz intensity
[1,] 22        30
[2,] 23       120
[3,] 24        12
And including additional peak variables:
peaksData(B2, columns = peaksVariables(B))[[1L]]
  mz intensity ann
3 22        30   b
4 23       120   c
5 24        12   d
What is not yet working (needs some fixes on the
B2$ann[[1L]]
[1] "a" NA  "b" "c" "d" NA
- Ensure lazy evaluation of processing queue in `peaksData,Spectra` does not break if requested `columns` do not contain `"mz"` and `"intensity"`. Issue #289.
- `applyProcessing` ensures that all peaks variables are properly updated/subset depending on the processing queue (issue #289).
spectraData(B2, column = "ann")
DataFrame with 4 rows and 1 column
      ann
   <list>
1   b,c,d
2
3
4
And after `applyProcessing`:
B3 <- applyProcessing(B2)
B3@backend@peaksDataFrame
[[1]]
  ann
3   b
4   c
5   d

[[2]]
[1] ann
<0 rows> (or 0-length row.names)

[[3]]
[1] ann
<0 rows> (or 0-length row.names)

[[4]]
[1] ann
<0 rows> (or 0-length row.names)
We also need to ensure that replacing peaks variables works properly with a non-empty processing queue (this is in fact a bug in the current implementation).
- `spectraData<-,Spectra` and `$<-,Spectra` throw an error if processing queue is not empty and values for peaks variables are requested to be replaced. Issue #289.
- Add support for peak variables to `MsBackendDataFrame`. Issue #289.
- Add examples for peak variables to `MsBackend` and `Spectra` documentation.
- Expand documentation for peaks variables to the `Spectra` vignette.
- Add support for peaks variables to `MsBackendDataFrame` and add/test its functionality (issue #289).
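The guard mentioned in the notes above (an error when peaks variables are replaced while the processing queue is not empty) could look roughly like the following base R sketch. Everything here is an assumption for illustration: `new_sketch`, `replace_variable` and the plain-list layout are hypothetical stand-ins and do not reflect the actual `Spectra` implementation.

```r
# Hypothetical stand-in for a Spectra object: a list with the data, the
# names of the peaks variables and a queue of pending processing steps.
new_sketch <- function(data, peaks_variables) {
    list(data = data, peaks_variables = peaks_variables, queue = list())
}

# Replace a variable; refuse if it is a peaks variable and processing
# steps are still pending (the new values could not be matched to the
# already filtered peaks).
replace_variable <- function(x, name, value) {
    if (name %in% x$peaks_variables && length(x$queue) > 0L)
        stop("Cannot replace peaks variable '", name,
             "' while the processing queue is not empty.")
    x$data[[name]] <- value
    x
}

x <- new_sketch(list(rtime = c(1.1, 1.2), ann = list(c("a", "b"), "c")),
                peaks_variables = c("mz", "intensity", "ann"))

# Queue is empty: replacing a peaks variable works.
x <- replace_variable(x, "ann", list(c("x", "y"), "z"))

# Add a pending (dummy) processing step: the same call now fails.
x$queue <- list(function(p) p)
res <- tryCatch(replace_variable(x, "ann", list("q", "r")),
                error = function(e) conditionMessage(e))

# Spectrum-level variables are not affected by the guard.
y <- replace_variable(x, "rtime", c(2.1, 2.2))
```

The check is done per variable, so in this sketch only peak-level replacements are blocked while spectrum-level variables such as `rtime` remain replaceable.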
Replacing peaks variables works (in fact, throws an error if the processing queue is not empty).
This is related to issue #287 and was discussed here: change the `MsBackend` API to support `peaksData` returning a `list` of `data.frame` instead of (or in addition to) a `list` of `matrix`. This would enable native support of peak variables (annotations) that are not `numeric`.

This change should not break any functionality, but we also need to check whether and how strongly it impacts performance.
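As a small base R illustration of why a `matrix` cannot natively hold non-numeric peak variables (the variable names below are made up for the example):

```r
# Peak data of one spectrum: two numeric variables, one character annotation.
mz <- c(13, 14.1, 22)
intensity <- c(100, 300, 30)
ann <- c("a", NA, "b")

# A matrix has a single storage type: binding the annotation to the
# numeric columns coerces all values to character.
m <- cbind(mz, intensity, ann)
typeof(m)

# A data.frame keeps a separate type for each column.
d <- data.frame(mz = mz, intensity = intensity, ann = ann,
                stringsAsFactors = FALSE)
vapply(d, class, character(1))
```

In the matrix the m/z and intensity values end up as character strings and would have to be converted back before any numeric operation, while the `data.frame` keeps them `numeric` next to the character annotation.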