Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ungroup after multi-level grouping to avoid warnings #891

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ndporter
Copy link
Contributor

Add ungroup() to all examples grouping by multiple variables to properly return output to ungrouped state. Additionally, add a note in explanation of first multi-variable grouping about need to ungroup.

Previously, output displayed on lesson page in these cases looked like the block below and did not display the summarized data at all (though in RStudio, output should display both).

`summarise()` has grouped output by 'continent'. You can override using the
`.groups` argument.

Note: an alternative is (as suggested in warning) to use the .groups = "drop" argument in summarise. Both approaches accomplish the same goal. I suggested ungroup() because it was previously the only option and has been used in other Carpentries lessons.

Add `ungroup()` to all examples grouping by multiple variables to properly return output to ungrouped state. Additionally, add a note in explanation of first multi-variable grouping about need to ungroup.

Previously, output displayed on lesson page in these cases looked like the block below and did not display the summarized data at all (though in RStudio, output should display both).

```
`summarise()` has grouped output by 'continent'. You can override using the
`.groups` argument.
```

*Note: an alternative is (as suggested in warning) to use the `.groups = "drop"` argument in summarise. Both approaches accomplish the same goal. I suggested `ungroup()` because it was previously the only option and has been used in other Carpentries lessons.*
Copy link

github-actions bot commented May 16, 2024

Thank you!

Thank you for your pull request 😃

🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

  • 🎯 correct output
  • 🖼️ correct figures
  • ❓ new warnings
  • ‼️ new errors

Rendered Changes

🔍 Inspect the changes: https://github.com/swcarpentry/r-novice-gapminder/compare/md-outputs..md-outputs-PR-891

The following changes were observed in the rendered markdown documents:

 01-rstudio-intro.md |    2 +-
 02-project-intro.md |    2 +-
 03-seeking-help.md  |    6 +-
 12-dplyr.md         |   17 +-
 config.yaml (gone)  |   95 -----
 md5sum.txt          |   50 +--
 renv.lock (gone)    | 1021 ---------------------------------------------------
 7 files changed, 41 insertions(+), 1152 deletions(-)
What does this mean?

If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible.

This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

⏱️ Updated at 2024-05-16 13:26:28 +0000

github-actions bot pushed a commit that referenced this pull request May 16, 2024
@ndporter
Copy link
Contributor Author

In playing around a little, it may be cleaner to use .groups="drop" because ungroup() may still print a warning even if it also renders the output. I'd suggest someone tries out both at least in RStudio with current versions, if not on a local build of the lesson, before deciding which to use.

Copy link
Contributor

@skanwal skanwal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@naupaka - what do you think of this one? Any preferences or favs for grouping data?

@naupaka
Copy link
Member

naupaka commented Jul 30, 2024

Yes I agree this is probably important to address. My instinct is that there needs to be a balance between ungrouping (via either method), which is certainly best practice, and adding too much cognitive overload to new learners. In the old days (before they added in that warning) folks would just leave the data grouped since it only was an issue if the pipe chain was being continued and even then it only led to unexpected behavior in certain instances. In the examples in the lesson, ungrouping is important because the output is stored back into a variable, but I think that's only being done so as to make it display after the image(?). So, the other option here would be to have a callout explaining the warning and leave off the ungroup() as well as the variable assignment from the code just to keep the focus on grouping and summarizing, which is the main focus of the content.

I think it could go either way, but I think if we add in either ungroup() or .groups = "drop" then we probably need slightly more (maybe another sentence or two) of explanation about why this is important.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants