Support the analysis of matched samples #196

Open
antagomir opened this issue Dec 31, 2021 · 2 comments

@antagomir
Member

It is relatively common for an experimental study to have paired samples, for instance baseline and intervention time points from the same individuals. In that case one would prefer paired tests and visualizations. There could also be more than two time points, matched in the same way.

This could be supported by providing handy methods to split the SE object into such matched assays. The altExp structure could be suitable for this (it would also allow doing this simultaneously for multiple experiments in an MAE, if necessary). If there is a separate time series structure available for SE/altExp/MAE, that would be even better, but I am not aware of such a solution.

The method could be something like splitByPair(se, pair_field="subject"), which would split the original assay in se into multiple altExp assays. The columns should match across them (each column is one subject, named accordingly).
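To make the proposed reshaping concrete, here is a minimal sketch in plain Python. It is not the mia or SummarizedExperiment API: the record layout, the field names ("sample", "subject", "timepoint", "value") and the function split_by_pair are all hypothetical, and it assumes every subject has exactly one sample per time point.

```python
from collections import defaultdict

# Hypothetical sample table: one record per sample, roughly a colData row
# plus a single assay value. All field names are assumptions.
samples = [
    {"sample": "S1", "subject": "A", "timepoint": "baseline", "value": 10.0},
    {"sample": "S2", "subject": "A", "timepoint": "intervention", "value": 14.0},
    {"sample": "S3", "subject": "B", "timepoint": "baseline", "value": 7.0},
    {"sample": "S4", "subject": "B", "timepoint": "intervention", "value": 6.0},
]


def split_by_pair(records, pair_field="subject", group_field="timepoint"):
    """Split records into one table per group, with columns keyed by subject."""
    tables = defaultdict(dict)
    for rec in records:
        group, subject = rec[group_field], rec[pair_field]
        if subject in tables[group]:
            # More than one sample per subject and group cannot be paired.
            raise ValueError(f"duplicate sample for {subject!r} in {group!r}")
        tables[group][subject] = rec["value"]
    return dict(tables)


tables = split_by_pair(samples)
# The subjects (columns) are the same in every table, so the tables can be
# compared pairwise.
subjects = sorted(tables["baseline"])
```

Each resulting table plays the role of one altExp assay: the same subjects appear as columns in every table.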

There are (at least) the following potential problems, however:

  1. colData is per sample, and if we split into altExps where columns correspond to subjects (not samples), we could lose the sample-specific information. How can that information be retained?

  2. Some samples do not match (some subjects are missing some samples, etc.). The method could then either discard the non-matched samples or fill in NA for the missing data. Neither solution is ideal.

  3. The altExp assays might have an order (for instance baseline to intervention; or more generally sequence across time). There would need to be a way to provide ordering for altExp tables but the structure does not naturally support this.
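The three problems above can be illustrated with a small sketch, again in plain Python rather than with SE/altExp objects. Everything here is an assumption made for illustration: the function align_matched, its on_missing and order arguments, and the record fields. It keeps a parallel table of sample IDs (so sample-level metadata can still be looked up), offers both policies for unmatched samples, and takes the time point order explicitly.

```python
def align_matched(records, order, on_missing="drop",
                  pair_field="subject", group_field="timepoint"):
    """Return (subjects, values, sample_ids), with one row per entry of `order`.

    on_missing="drop" keeps only subjects sampled at every time point;
    on_missing="fill" keeps all subjects and uses None for missing samples.
    """
    if on_missing not in ("drop", "fill"):
        raise ValueError("on_missing must be 'drop' or 'fill'")
    lookup = {(r[group_field], r[pair_field]): r for r in records}
    subjects = sorted({r[pair_field] for r in records})
    if on_missing == "drop":
        subjects = [s for s in subjects
                    if all((g, s) in lookup for g in order)]
    values, sample_ids = [], []
    for g in order:  # the caller-supplied order fixes the table sequence
        recs = [lookup.get((g, s)) for s in subjects]
        values.append([r["value"] if r is not None else None for r in recs])
        # Sample IDs link each cell back to its sample-level metadata.
        sample_ids.append([r["sample"] if r is not None else None for r in recs])
    return subjects, values, sample_ids


# Hypothetical data: subject B has no intervention sample.
records = [
    {"sample": "S1", "subject": "A", "timepoint": "baseline", "value": 10.0},
    {"sample": "S2", "subject": "A", "timepoint": "intervention", "value": 14.0},
    {"sample": "S3", "subject": "B", "timepoint": "baseline", "value": 7.0},
]
order = ["baseline", "intervention"]

dropped = align_matched(records, order, on_missing="drop")
filled = align_matched(records, order, on_missing="fill")
```

With on_missing="drop" only subject A survives; with on_missing="fill" subject B is kept and its intervention entry is None. Neither policy removes the trade-off described in point 2; the sketch only makes the choice explicit.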

If a generic solution for time series etc. seems too difficult, perhaps there could still be a simple method that splits paired samples into altExp/MAE to support fast exploration.

@antagomir changed the title from "Support the analysis of paired samples with altExp" to "Support the analysis of matched samples" on Dec 31, 2021
@FelixErnst
Contributor

In my opinion you are making this too complicated.

The first step is to have one or, in this case, two factor(s) in the colData. Whether these factors are subjects, time points or something else is irrelevant. If you implement a function which takes two factors into consideration, one for the matching and one for the test, you are done, aren't you?
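As a rough illustration of the two-factor idea, the plain Python sketch below takes a matching factor and a test factor and computes per-subject differences without splitting the data first. The functions paired_differences and paired_t_statistic and all field names are hypothetical; only the paired t statistic formula, t = mean(d) / (sd(d) / sqrt(n)), is standard.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(records, match_field, test_field, levels,
                       value_field="value"):
    """Per-subject difference levels[1] - levels[0], using complete pairs only."""
    first, second = levels
    by_subject = {}
    for r in records:
        by_subject.setdefault(r[match_field], {})[r[test_field]] = r[value_field]
    return {
        subject: vals[second] - vals[first]
        for subject, vals in sorted(by_subject.items())
        if first in vals and second in vals  # skip unmatched subjects
    }


def paired_t_statistic(diffs):
    """Paired t statistic; needs at least two pairs with non-zero spread."""
    d = list(diffs)
    return mean(d) / (stdev(d) / sqrt(len(d)))


# Hypothetical data: subject D has a baseline sample only and is skipped.
records = [
    {"subject": "A", "timepoint": "baseline", "value": 10.0},
    {"subject": "A", "timepoint": "intervention", "value": 14.0},
    {"subject": "B", "timepoint": "baseline", "value": 7.0},
    {"subject": "B", "timepoint": "intervention", "value": 6.0},
    {"subject": "C", "timepoint": "baseline", "value": 5.0},
    {"subject": "C", "timepoint": "intervention", "value": 8.0},
    {"subject": "D", "timepoint": "baseline", "value": 9.0},
]

diffs = paired_differences(records, "subject", "timepoint",
                           levels=("baseline", "intervention"))
t_stat = paired_t_statistic(diffs.values())
```

Here the matching factor is "subject" and the test factor is "timepoint"; any other pair of colData-like columns would work the same way.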

Any splitting is easy: just use splitBy as proposed in #191

@antagomir
Member Author

Yes, if we are OK with having the output from splitBy as a list, rather than using altExp, MAE, or another such mechanism and its associated methods.
