generated from r4ds/bookclub-template
# Getting started
**Learning objectives:**
- What do I need to get started?
- Motivating examples
- Big picture deep learning with fastai
## Tools and Resources {-}
- Python
- PyTorch
- fastai: built on PyTorch
- Jupyter notebook
- cloud servers
- Meta Learning book
- Deep Learning for Coders book (the videos present the material differently from the book)
## Prerequisites {-}
This course doesn't teach Python.
**What you don't need to do deep learning**
Myth (don't need) | Truth
------------------| ------------------------------
Lots of math | Just high school math is sufficient
Lots of data | We've seen record-breaking results with <50 items of data
Lots of expensive computers | You can get what you need for state-of-the-art work for free
## Is it a bird system? {-}
- reminder: images are numbers!
- What can be treated as an image? Sounds, time series, trajectory data
- introduces a Python workflow for getting images from DuckDuckGo
- `DataBlock()` for setting up data for computer vision (CV) models
- runs a CNN on 400 images live during the stream
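
To make "images are numbers" concrete, here is a minimal sketch in plain Python (no fastai). The pixel grid, `render`, and `series_to_image` are hypothetical names made up for this illustration. The second helper shows one crude way a time series could be turned into a picture; it is not the method used in the course.

```python
# A tiny 5x5 grayscale "image": each entry is a brightness from 0 (black) to 255 (white).
image = [
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
    [255, 255, 255, 255, 255],
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
]


def render(pixels, threshold=128):
    """Draw bright pixels as '#' and dark pixels as '.' so the numbers read as a picture."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in pixels
    )


def series_to_image(values, height=5):
    """Turn a 1-D series into a 2-D grid with one column per time step (a crude plot)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero for a constant series
    rows = [[0] * len(values) for _ in range(height)]
    for col, v in enumerate(values):
        level = round((v - lo) / span * (height - 1))
        rows[height - 1 - level][col] = 255  # row 0 is the top of the picture
    return rows


print(render(image))
print()
print(render(series_to_image([1, 3, 5, 3, 1])))
```

Once a sound or a time series is a grid of numbers like this, it can be fed to the same kind of image model as a photo.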
## Where is deep learning? {-}
- course occurred ~ 2 years ago (July 2022)
- DALL-E: text to images
- Midjourney: text to images
- deep learning + art
- Google's Pathways Language Model (PaLM): answers questions with explanations
- ethics.fast.ai: another course
## Deep Learning {-}
- With a neural network (NN), we don't hand-code features; the network learns them from examples
- Deep learning composes these features
- In general, the deeper features are more specialized/localized
## Jupyter and the cloud {-}
- literate programming: interweaves code cells and markdown cells
- Colab
- Kaggle: tutorial and course code available
- the course instructor says the code uses a lot of functional style
## DataBlock() {-}
- getting the data into the model is the most important data science task
- most existing model architectures work well across systems
- main arguments:
    + what type of input?
    + what type of output?
    + where is the training data?
    + how should the data be split? (validation set is a requirement)
    + where is the training output (the labels)?
    + how should the training items be transformed?
- then use a learner to pair the data with the model type
- intermediate, flexible approach
- specialized functions for common data structures
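
The questions a `DataBlock()` answers can be sketched in plain Python. This is not the fastai API: the file listing, `label_from_parent`, and `random_split` are hypothetical stand-ins, and the 20% validation fraction is an assumption. In fastai itself these roles are filled by `DataBlock` arguments such as `get_items`, `get_y`, `splitter`, and `item_tfms`.

```python
import random
from pathlib import PurePosixPath

# Where is the training data? A hypothetical listing of image files,
# organised so that the parent folder name is the label.
items = [
    PurePosixPath(f"images/{label}/{i}.jpg")
    for label in ("bird", "forest")
    for i in range(10)
]


def label_from_parent(path):
    """Where is the training output? Here: the name of the folder holding the file."""
    return path.parent.name


def random_split(all_items, valid_pct=0.2, seed=42):
    """How should the data be split? Hold out a random fraction as the validation set."""
    shuffled = list(all_items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]


train, valid = random_split(items)

# Pair every input with its label, separately for training and validation.
train_pairs = [(p, label_from_parent(p)) for p in train]
valid_pairs = [(p, label_from_parent(p)) for p in valid]

print(len(train_pairs), "training items,", len(valid_pairs), "validation items")
```

Fixing the seed keeps the validation set the same from run to run, so results from different experiments stay comparable.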
## Pretrained models {-}
- fastai is *fast* because it uses pretrained models
- pretrained: don't start with a random network, but with weights from a model previously trained on a different dataset
- used in image models
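
A toy sketch of why a pretrained starting point helps, under heavy simplifying assumptions: the "model" is a single weight in `y = w * x`, the two datasets are made up, and training is plain gradient descent. None of this is fastai code; it only illustrates starting from previously trained weights instead of random ones.

```python
import random


def mse(w, data):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w, data, steps, lr=0.01):
    """Plain gradient descent on the single weight w."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
source_task = [(x, 2.0 * x) for x in xs]  # the "different dataset" used for pretraining
target_task = [(x, 2.2 * x) for x in xs]  # the related task we actually care about

random_w = random.Random(0).uniform(-1.0, 1.0)  # random starting weight
pretrained_w = train(random_w, source_task, steps=200)

# Give both starting points the same small training budget on the target task.
from_scratch = train(random_w, target_task, steps=5)
fine_tuned = train(pretrained_w, target_task, steps=5)

print("loss from random start:    ", mse(from_scratch, target_task))
print("loss from pretrained start:", mse(fine_tuned, target_task))
```

Because the pretrained weight already sits close to a good value for the related task, the same five steps leave it with a lower loss than the random start.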
## Types of problems where DL is state of the art {-}
- computer vision: identify what's in an image
- segmentation: combine pixels that correspond to one object/item/etc
- tabular analysis: regression/classification
- collaborative filtering: recommendation systems
- NLP
- Robotics
- Playing games
- Medicine/biology applications
## When does DL win? {-}
Is it something an (expert) human can do quickly? If so, deep learning will probably do well.
Does it take a long, logical thought process based on sparse data? If so, deep learning will probably struggle.
A trained model is just something that maps inputs to results, like any other computer program, but getting to the final model takes some tricks.
## Course Video HW {-}
- Experiment with week 1 lab
- Read book chapter
## Meeting Videos {-}
### Cohort 1 {-}
`r knitr::include_url("https://www.youtube.com/embed/URL")`
<details>
<summary> Meeting chat log </summary>
```
LOG
```
</details>