This should be a 3-hour workshop.
Setup: I had problems installing packages in RStudio Cloud because it ran out of memory while compiling a package. The failure seemed to occur at the optional "byte compilation" step, and seems to be mitigated by INSTALL_opts="--no-byte-compile".
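A minimal sketch of the workaround, assuming the option is passed through install.packages() (BiocManager::install() forwards extra arguments to it). The helper name install_low_memory is hypothetical and only for illustration, and the package names in the example calls are placeholders rather than the workshop's actual package list.

```r
# Hypothetical helper (not part of the workshop material): install packages
# while skipping the optional byte-compilation step, to reduce memory use.
install_low_memory <- function(pkgs, bioconductor = FALSE) {
  if (bioconductor) {
    # Bioconductor packages are installed through BiocManager, which itself
    # comes from CRAN.
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(pkgs, INSTALL_opts = "--no-byte-compile")
  } else {
    install.packages(pkgs, INSTALL_opts = "--no-byte-compile")
  }
}

# Example calls, commented out because they need network access:
# install_low_memory("tidyverse")
# install_low_memory("GenomicRanges", bioconductor = TRUE)
```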
Mention relevant events such as BioInfoSummer, COMBINE events.
# Things to mention during the slideshow:
R has 4 decades of history.
Tidyverse is a set of modern packages.
Tidyverse is easy to pick up and has made R much more widely accessible. Key concepts are tidy data and pipelines.
Bioconductor is a separate package repository from CRAN (the usual package repository).
Bioconductor packages are for dealing with biological data efficiently. Bioconductor is not tidy. Expect to have to work a little harder to make use of these.
---
Too many uses to count; ask a different person and you get a different list.
50% of bioinformatics is: read this file format, wrangle, write another file format.
Often used in combination with command-line software (e.g. a read aligner).
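The "read, wrangle, write" pattern can be sketched in base R as follows. The data are made up: a few BED-style lines (0-based starts) are read, converted to 1-based inclusive coordinates, and written out as CSV. The column names are illustrative assumptions.

```r
# Read: made-up BED-style text (tab-separated, no header, 0-based starts).
bed_text <- "chr1\t100\t200\tgeneA\nchr1\t300\t450\tgeneB\nchr2\t50\t80\tgeneC"
bed <- read.delim(text = bed_text, header = FALSE,
                  col.names = c("seqname", "start0", "end", "name"))

# Wrangle: BED starts are 0-based, so add 1 to get 1-based inclusive starts.
features <- data.frame(
  seqname = bed$seqname,
  start   = bed$start0 + 1,
  end     = bed$end,
  name    = bed$name
)
features$width <- features$end - features$start + 1

# Write: save in a different format (CSV), here to a temporary file.
out_file <- tempfile(fileext = ".csv")
write.csv(features, out_file, row.names = FALSE)

# Read the result back to check the round trip.
round_trip <- read.csv(out_file)
print(round_trip)
```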
---
matrix is another tabular data type from base R. Conceptually, each cell is an observation, rather than each row as in a tidy data frame.
The rest are Bioconductor types. Note the use of CamelCase, which is typical in Bioconductor. Tidyverse separates words with underscores, and base R uses dots!
Bioconductor has evolved over time to cope with more complex data.
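The difference between a matrix and a tidy data frame can be sketched in base R. The gene and sample names and the counts are made up for illustration. In the matrix each cell is one observation; in the tidy version each row is one observation.

```r
# A matrix: each cell is one observation (a count for one gene in one sample).
counts <- matrix(
  c(10, 0, 5, 3, 8, 2), nrow = 3,
  dimnames = list(gene = c("geneA", "geneB", "geneC"),
                  sample = c("s1", "s2"))
)

# Cells are looked up by row and column name.
cell <- counts["geneB", "s2"]

# The same data in tidy form: one row per observation, with columns
# gene, sample and count.
tidy <- as.data.frame(as.table(counts), responseName = "count")
print(tidy)
```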
---
The aim of this session is to cover the fundamentals of Bioconductor, so you can look up the package you need for your particular task and understand how to use it.
# 2019-09-17, 10am-1pm
- Needed a 15 minute break around 11:15am.
- Finished 12:45pm. May be better to slow down a little in the last sections.
- The Seqinfo section interrupts the flow a little, maybe move it?
- motifRG failed on RStudio Cloud.
- Attendees wanted more challenges, and more time on challenges. (I gave 10 minutes on the start/stop codon challenge. Maybe extend to 15 minutes?)
- More Eukaryote?
- More visual explanation?
# 2021-03-16, 2pm-5pm, Zoom, 1 instructor + 2 helpers
- Of 40 registrants, ~20 turned up. Next time, go for a larger number of registrants.
- People dropped out, down to ~10 by the end. Understandable, the material is what it is.
- Added some more material on IGV and data sources.
- Used two 10 minute breaks. Seems about right.
- Didn't do section 5 on motif finding.
- Too much material, but people's brains seemed pretty full, so maybe best to keep current length.
- Class was fairly unresponsive, but did engage with the challenges. Morning sessions may be better?
- flank and resize arguments should be explained in detail before the start/stop codon challenge.
- Told people to use their own computer for preference. A couple of people had problems (Windows and Mac). Maybe emphasize using RStudio Cloud if there are any install problems.
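
For the note above about explaining flank and resize before the start/stop codon challenge, a small sketch like the following might help. It assumes the GenomicRanges package is installed and uses made-up coordinates; it is not the workshop's own challenge solution. The key point is that both functions are strand-aware: "start" means the 5' end of the feature, which is the right-hand end for ranges on the minus strand.

```r
library(GenomicRanges)

# Two made-up features, one on each strand.
gr <- GRanges("chr1",
              IRanges(start = c(101, 201), end = c(190, 260)),
              strand = c("+", "-"))

# resize() keeps one end fixed and changes the width.
# fix="start" keeps the strand-aware start: the first 3 bases of each feature.
first_three <- resize(gr, 3, fix = "start")
# fix="end" keeps the strand-aware end: the last 3 bases of each feature.
last_three <- resize(gr, 3, fix = "end")

# flank() returns the region next to a feature, not overlapping it.
# By default (start=TRUE) this is upstream of the feature.
upstream <- flank(gr, 10)
# start=FALSE gives the region downstream of the feature.
downstream <- flank(gr, 10, start = FALSE)

print(first_three)
print(last_three)
print(upstream)
print(downstream)
```

Whether the first and last three bases are actually the start and stop codons depends on how the features were annotated (for example, whether the stop codon is included in the range), so that is worth checking against the data used in the challenge.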