Add slice_ functions #68

Merged: 27 commits, Aug 22, 2023
Showing changes from 1 commit

Commits
80665ae slice_head (wvictor14, Aug 19, 2023)
9fcea46 slice_tail (wvictor14, Aug 19, 2023)
222fef8 use <- instead of = assignment (wvictor14, Aug 19, 2023)
bf18ec4 fix example (wvictor14, Aug 19, 2023)
d5dd780 slice min (wvictor14, Aug 19, 2023)
bd5a53a tests for slice_head _tail _min (wvictor14, Aug 19, 2023)
5640700 slice_max (wvictor14, Aug 19, 2023)
bcd55fe add by example (wvictor14, Aug 19, 2023)
582f57d add .by to slice (wvictor14, Aug 19, 2023)
b80fed3 drop exports for slice functions (wvictor14, Aug 19, 2023)
1d2ed45 extract minimal number of columns (wvictor14, Aug 19, 2023)
d0b8591 approach b - create slice df from scratch (wvictor14, Aug 19, 2023)
7489035 Revert "approach b - create slice df from scratch" (wvictor14, Aug 20, 2023)
140d4ba extract only necessary columns slice_ (wvictor14, Aug 20, 2023)
c181a3c add a slice test (wvictor14, Aug 20, 2023)
02493b1 slice_sample use only necessary metadata variables (wvictor14, Aug 20, 2023)
4e679cb slice_sample docs (wvictor14, Aug 20, 2023)
b0f4ef2 replace with native pipe (wvictor14, Aug 20, 2023)
0e173ff remove :: , importFrom tibble::rowid_to_column (wvictor14, Aug 20, 2023)
7420193 export slice functions (wvictor14, Aug 21, 2023)
7cca3f8 fix slice_min slice_max tibble input (wvictor14, Aug 21, 2023)
9e742d1 test for slice_min _max tibble input (wvictor14, Aug 21, 2023)
fe67aa5 add return_args to utilities.R (wvictor14, Aug 21, 2023)
dab807b add cran note (wvictor14, Aug 21, 2023)
f2b7772 expand abbreviations (wvictor14, Aug 21, 2023)
90fd5b5 expand more abbreviations (wvictor14, Aug 22, 2023)
468f29f fix return_arguments_of docs (wvictor14, Aug 22, 2023)
32 changes: 24 additions & 8 deletions R/dplyr_methods.R
@@ -901,12 +901,30 @@ slice.Seurat <- function (.data, ..., .by = NULL, .preserve = FALSE)
#' @param weight_by <[`data-masking`][dplyr_data_masking]> Sampling weights.
#' This must evaluate to a vector of non-negative numbers the same length as
#' the input. Weights are automatically standardised to sum to 1.
#'
+#' @examples
+#' # slice_sample() allows you to random select with or without replacement
+#' pbmc_small |> slice_sample(n = 5)
+#'
+#' # if using replacement, and duplicate cells are returned, a tibble will be
+#' # returned because duplicate cells cannot exist in Seurat objects
+#' pbmc_small |> slice_sample(n = 1, replace = TRUE) # returns Seurat
+#' pbmc_small |> slice_sample(n = 100, replace = TRUE) # returns tibble
+#'
+#' # weight by a variable
+#' pbmc_small |> slice_sample(n = 5, weight_by = nCount_RNA)
+#'
+#' # sample by group
+#' pbmc_small |> slice_sample(n = 5, by = groups)
+#'
+#' # sample using proportions
+#' pbmc_small |> slice_sample(prop = 0.10)
+#'
NULL

#' @export
slice_sample.Seurat <- function(.data, ..., n = NULL, prop = NULL, by = NULL, weight_by = NULL, replace = FALSE) {


# Solve CRAN NOTES
cell = NULL
. = NULL
@@ -917,29 +935,27 @@ slice_sample.Seurat <- function(.data, ..., n = NULL, prop = NULL, by = NULL, we
new_meta =
.data[[]] %>%
as_tibble(rownames = c_(.data)$name) %>%
-dplyr::slice_sample(..., n = n, by = by, weight_by = weight_by, replace = replace)
+dplyr::select(-everything(), c_(.data)$name, {{ by }}, {{ weight_by }}) |>
+dplyr::slice_sample(..., n = n, by = {{ by }}, weight_by = {{ weight_by }}, replace = replace)
else if(!is.null(prop))
new_meta =
.data[[]] %>%
as_tibble(rownames = c_(.data)$name) %>%
-dplyr::slice_sample(..., prop=prop, by = by, weight_by = weight_by, replace = replace)
+dplyr::select(-everything(), c_(.data)$name, {{ by }}, {{ weight_by }}) |>
+dplyr::slice_sample(..., prop=prop, by = {{ by }}, weight_by = {{ weight_by }}, replace = replace)
else
stop("tidyseurat says: you should provide `n` or `prop` arguments")

count_cells = new_meta %>% select(!!c_(.data)$symbol) %>% count(!!c_(.data)$symbol)

-# If repeted cells
+# If repeated cells due to replacement
if(count_cells$n %>% max() %>% gt(1)){
message("tidyseurat says: When sampling with replacement a data frame is returned for independent data analysis.")
.data %>%
as_tibble() %>%
right_join(new_meta %>% select(!!c_(.data)$symbol), by = c_(.data)$name)
} else{
new_obj = subset(.data, cells = new_meta %>% pull(!!c_(.data)$symbol))
-new_obj@meta.data =
-new_meta %>%
-data.frame(row.names=pull(.,!!c_(.data)$symbol), check.names = FALSE) %>%
-select(- !!c_(.data)$symbol)
new_obj
}

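The duplicate check in this hunk decides the return type: sampled cell identifiers are counted, and any count above 1 means sampling with replacement drew the same cell more than once. A minimal sketch of that check with plain dplyr, assuming the row names of the built-in `mtcars` data stand in for cell identifiers. The `cell` column name and the `has_duplicates()` helper are illustrative and not part of tidyseurat:

```r
library(dplyr)
library(tibble)

# Stand-in for the Seurat metadata: row names play the role of cell IDs.
meta <- mtcars |> as_tibble(rownames = "cell")

# Hypothetical helper: TRUE when any identifier appears more than once.
has_duplicates <- function(sampled) {
  sampled |> count(cell) |> pull(n) |> max() > 1
}

# Keep only the columns the sampling step needs, then sample.
ids <- meta |> select(cell, cyl, wt)

without_replacement <- ids |> slice_sample(n = 5)
with_replacement <- ids |> slice_sample(n = 100, replace = TRUE)

has_duplicates(without_replacement) # FALSE: each row is drawn at most once
has_duplicates(with_replacement)    # TRUE: 100 draws from 32 rows must repeat
```

Selecting the identifier plus the grouping and weighting columns before sampling mirrors the change in this commit, where only the columns needed by `slice_sample()` are carried through.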