Big update to README, regarding plans for flow of exercises #1

Merged
merged 2 commits into from
Dec 11, 2014
70 changes: 53 additions & 17 deletions README.md
@@ -3,37 +3,73 @@ To teach students to VERB [and VERB] the way that organization can be used to in

##Motivation
Why is it worth organizing?
A graph illustrating effort versus time. Yes, this is a little painful, but think of how painful it will be during the publication process.

![graph](https://raw.githubusercontent.com/tracykteal/shell-genomics/master/img/gvng.jpg)

##Overview of exercises
- get a bunch of files (raw data, scripts, overall .Rmd): run the whole thing to produce clean data and output files
- now have a bunch of files (raw and clean data, scripts, results); how to organize them?
- README file explaining everything
- organize into subfolders (separating code from data)

- Potentially re-run .Rmd: have them actively reorganize and change paths
- Get a version with files reorganized; add some new analysis to the main Rmd file
- Lecture interlude:
If you gave this to a friend, how would those path names affect that?
1. Get a set of files (raw data, scripts, overall .Rmd) for gapminder, all
in one directory

- raw data: `raw1.csv`, `raw2.csv`, `raw3.csv`
- scripts to clean the data: `clean1.R`, ..., `clean7.R`
- overall Rmd: `project.Rmd`

Run the whole thing to produce clean data (`clean.csv`) and output
files (`project.md`, `project.html`, `sweden.png`).

2. How to organize these files? Students work on this in small groups, to
organize into subfolders (separating code from data).

Come back together to discuss students' solutions; point them
toward a modified version of Bill Noble's approach.

project.Rmd
doc/paper/
data/raw/raw1.csv
data/raw/raw2.csv
data/raw/raw3.csv
data/clean/clean.csv
code/clean1.R
...
code/clean7.R
results/project.md
results/project.html
results/figure/sweden.png
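
One way the reorganization above could be carried out from the shell is sketched below. This is an illustration only: it works in a scratch directory on empty placeholder files named after the ones in the list, and the scratch directory itself is an assumption, not part of the lesson materials.

```shell
# Sketch only: build a flat project out of empty placeholder files,
# then reorganize it into the proposed layout.
set -eu
scratch=$(mktemp -d)   # hypothetical scratch location
cd "$scratch"

# Flat starting point: everything in one directory.
touch raw1.csv raw2.csv raw3.csv clean.csv
touch project.Rmd project.md project.html sweden.png
for i in 1 2 3 4 5 6 7; do touch "clean$i.R"; done

# Create the target folders.
mkdir -p doc/paper data/raw data/clean code results/figure

# Move each kind of file to its new home; project.Rmd stays at the top level.
mv raw1.csv raw2.csv raw3.csv data/raw/
mv clean.csv data/clean/
mv clean*.R code/
mv project.md project.html results/
mv sweden.png results/figure/

# List the result so it can be compared with the layout above.
find . -type f | sort
```

Moving the files is only half the job: the scripts and `project.Rmd` still refer to the old locations until their paths are updated.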

3. Discussion of paths, particularly regarding relative vs absolute paths.
- what we had to change (paths) to redirect files
- If you gave this to a friend, how would those path names affect that?
- importance of absolute vs relative paths
- what if you were to move this to a new computer?
- Working directory
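
A small shell demonstration could support this discussion. The sketch below assumes a throwaway copy of the project in a scratch directory (the names `project` and `project_moved` are made up for illustration) and shows that a relative path depends on the working directory, while an absolute path stops working once the project is moved.

```shell
# Sketch only: relative vs absolute paths on a placeholder project.
set -eu
scratch=$(mktemp -d)   # hypothetical scratch location
mkdir -p "$scratch/project/data/clean" "$scratch/project/code"
echo "country,year" > "$scratch/project/data/clean/clean.csv"

# A relative path is resolved against the working directory.
cd "$scratch/project"
if [ -f data/clean/clean.csv ]; then found_from_root=yes; else found_from_root=no; fi

# The same relative path fails from a different working directory.
cd "$scratch/project/code"
if [ -f data/clean/clean.csv ]; then found_from_code=yes; else found_from_code=no; fi

# An absolute path works from anywhere, until the project is moved
# (as it effectively is when copied to a friend's or a new computer).
abs_path="$scratch/project/data/clean/clean.csv"
cd "$scratch"
mv project project_moved
if [ -f "$abs_path" ]; then abs_still_valid=yes; else abs_still_valid=no; fi

# The relative path still works from the root of the moved project.
cd "$scratch/project_moved"
if [ -f data/clean/clean.csv ]; then rel_still_valid=yes; else rel_still_valid=no; fi

echo "root=$found_from_root code=$found_from_code abs=$abs_still_valid rel=$rel_still_valid"
```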

- metadata exercise: what metadata will you want to record in order to explain what files are what, and which ones produce which others

4. Exercise: we need to do the actual moving around, and we need to

- go into the `code/*.R` files and change the paths to the raw and
clean data
- go into the `project.Rmd` file to change the paths to `clean*.R`
and `data/clean/clean.csv`.
- change `project.Rmd` to send figure to `results/figure/`; change
preferences for knitr to send output to `results/`.

(Potentially skip this exercise and provide them with a reorganized
version.)
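
If the path changes are to be scripted rather than made by hand, something like the following might work. It is a sketch under stated assumptions: the contents of `clean1.R` shown here are invented for illustration (the real scripts may read and write files differently), and it assumes the scripts are run with the project root as the working directory.

```shell
# Sketch only: rewrite file paths inside a placeholder R script.
set -eu
scratch=$(mktemp -d)   # hypothetical scratch location
mkdir -p "$scratch/code"

# Hypothetical script contents, standing in for the real clean1.R.
printf '%s\n' \
  'dat <- read.csv("raw1.csv")' \
  'write.csv(dat, "clean.csv")' > "$scratch/code/clean1.R"

# Point the script at the new locations, relative to the project root.
# Writing to a temporary file avoids relying on non-portable `sed -i`.
sed -e 's|"raw1\.csv"|"data/raw/raw1.csv"|' \
    -e 's|"clean\.csv"|"data/clean/clean.csv"|' \
    "$scratch/code/clean1.R" > "$scratch/code/clean1.R.new"
mv "$scratch/code/clean1.R.new" "$scratch/code/clean1.R"

cat "$scratch/code/clean1.R"
```

Editing by hand is probably the better teaching exercise, since it makes students read each path; the scripted version mainly shows how tedious and error-prone the change becomes as the number of files grows.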

5. Add some new analysis to the main Rmd file, and re-run.

6. metadata exercise: what metadata will you want to record in order to explain what files are what, and which ones produce which others
- README file explaining everything

- for yourself, coming back to this 6 months from now
- for a collaborator
- versions: how do they keep track of versions of things? (discussion in small groups and then come back together)

7. versions: how do they keep track of versions of things? (discussion in small groups and then come back together)

- periodic zipping of project directory
- Time Machine or other backup system
- project on dropbox
- github
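
For the first option, a dated archive of the project directory could be made as sketched below. Note that this uses `tar` with gzip rather than `zip`, simply because `tar` is more commonly available; the directory name `project` and the archive naming scheme are assumptions for illustration.

```shell
# Sketch only: snapshot a placeholder project directory into a dated archive.
set -eu
scratch=$(mktemp -d)   # hypothetical scratch location
mkdir -p "$scratch/project/data/raw"
echo "placeholder" > "$scratch/project/data/raw/raw1.csv"

# Name the archive after today's date, e.g. project-YYYY-MM-DD.tar.gz.
stamp=$(date +%Y-%m-%d)
archive="$scratch/project-$stamp.tar.gz"

# -C changes into the parent directory so the archive holds relative paths.
tar -czf "$archive" -C "$scratch" project

# List the archive contents to confirm what was captured.
tar -tzf "$archive"
```

A useful discussion point is what this approach does not record: why a change was made, and how two snapshots differ.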

- (Lecture interlude: demonstration of GitHub)


8. (Lecture interlude: demonstration of GitHub)