Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add slice_ functions #68

Merged
merged 27 commits into from
Aug 22, 2023
Merged

Add slice_ functions #68

merged 27 commits into from
Aug 22, 2023

Conversation

wvictor14
Copy link
Contributor

Hi I found this issue on tidyomics and thought it would be good experience to try

Let me know how it looks and if there's any changes needed

This adds the slice_ family:

  • slice_min
  • slice_max
  • slice_head
  • slice_tail

R/dplyr_methods.R Outdated Show resolved Hide resolved
Comment on lines 884 to 885
tibble::rowid_to_column(var = 'row_number___') |>
dplyr::select(-everything(), row_number___, {{ .by }}) |>
Copy link
Owner

@stemangiola stemangiola Aug 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

Sorry if I am being pedantic here, but it would be better to do select first and rowid_to_column later both from a memory and elegance perspective.

Another little feedback, a standard we use is to do (this applies to all new function calls)

@importFrom tibble ...

and not use

tibble:: ...

With these two changes I think the PR is good to go!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, will do

R/dplyr_methods.R Outdated Show resolved Hide resolved
@stemangiola
Copy link
Owner

@wvictor14 have a look at the github action; it is failing. Check maybe on your local CHECK and BiocCheck because here I have to "allow" github action each time, unfortunately. After you become a contributor, github action will be runnable by you in the future.

@stemangiola
Copy link
Owner

@wvictor14 please add your authorship details here https://docs.google.com/spreadsheets/d/19XqhN3xAMekCJ-esAolzoWT6fttruSEermjIsrOFcoo/edit?usp=sharing

@wvictor14
Copy link
Contributor Author

wvictor14 commented Aug 21, 2023

@stemangiola I fixed the errors for CRAN check. There are two bioccheck errors that I am not sure best to address:

  • ERROR: Maintainer must add package name to Watched Tags on the support
    site; Edit your Support Site User Profile to add Watched Tags.

and

ERROR: At least 80% of man pages documenting exported objects must have
runnable examples.

Also I realized that the option of inputting a tibble to order_by argument for slice_min and slice_max doesn't work with the select |> slice approach. So I added an internal function return_args that returns the variables as symbols of such cases, that can supply the necessary columns to select 7cca3f8

@stemangiola
Copy link
Owner

There are two bioccheck errors that I am not sure best to address

No problem, that could be another "challenge"

Also I realized that the option of inputting a tibble to order_by argument for slice_min and slice_max doesn't work with the select |> slice approach

Could you please give me minimal example ?

@wvictor14
Copy link
Contributor Author

Also I realized that the option of inputting a tibble to order_by argument for slice_min and slice_max doesn't work with the select |> slice approach

Could you please give me minimal example ?

This is what I mean

# in action
pbmc_small |> slice_min(tibble::tibble(nFeature_RNA, nCount_RNA), n = 2)

# return_args extracts the variables out of an expression
order_by <- expr(tibble::tibble(nFeature_RNA, nCount_RNA))
order_by_vars <- return_args(!!order_by)

# let's you input into select
pbmc_small |> select(!!!order_by_vars) |>

  # and then can apply slice_min
  slice_min(!!order_by, n = 6)

# A tibble: 6 × 2
  nFeature_RNA nCount_RNA
         <int>      <dbl>
1           26         51
2           29        157
3           29        172
4           30        150
5           30        202
6           31         62

Not sure if there is a more straightforward way, or how to make it more robust. What do you think?

@stemangiola
Copy link
Owner

Also I realized that the option of inputting a tibble to order_by argument for slice_min and slice_max doesn't work with the select |> slice approach

Could you please give me minimal example ?

This is what I mean

# in action
pbmc_small |> slice_min(tibble::tibble(nFeature_RNA, nCount_RNA), n = 2)

# return_args extracts the variables out of an expression
order_by <- expr(tibble::tibble(nFeature_RNA, nCount_RNA))
order_by_vars <- return_args(!!order_by)

# let's you input into select
pbmc_small |> select(!!!order_by_vars) |>

  # and then can apply slice_min
  slice_min(!!order_by, n = 6)

# A tibble: 6 × 2
  nFeature_RNA nCount_RNA
         <int>      <dbl>
1           26         51
2           29        157
3           29        172
4           30        150
5           30        202
6           31         62

Not sure if there is a more straightforward way, or how to make it more robust. What do you think?

Great! I left a last comment to address (Above) and then this PR is on its way!

@wvictor14
Copy link
Contributor Author

A detail. In the tidy transcriptomics framework, we adopt the standard of, no abbreviations and no acronyms.

Would you mind changing return_args into a self-explanatory function name, e.g. return_arguments_of_... or anything else that would make sense.

This also applies to any variable, e.g. args_expr, args_vars, ...

I think I got them all!

@stemangiola stemangiola merged commit d08b38b into stemangiola:master Aug 22, 2023
2 of 3 checks passed
@stemangiola
Copy link
Owner

stemangiola commented Aug 22, 2023

Thanks a lot @wvictor14 for your PR! Well done.

Now that you are an expert, if you want to translate your PR to the other tidy adapters tidySingleCellExperiment and/or tidySummarizedExperiment feel free, that would be welcome!

No expectations of course.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

modernize tidy transcriptomics and genomics with dplyr > 1.0.0 and new tidyr
2 participants