:::{updated} F21 :::
This week, we are going to talk more about unsupervised learning: learning without labels. We won't have time to investigate these techniques very deeply, but I want you to know about them, and you will experiment with them in Assignment 6.
This week's content is lighter, since we just had a large assignment and a midterm, and another assignment is due on Sunday.
:::{module} week13 :folder: 0ad54e44-0249-4039-b405-adc601833667 :::
- Quiz 13, {date}`wk13 thu long`
- Assignment 6, {date}`wk13 sun long`
In this video, we review the idea of supervised learning and contrast it with unsupervised learning.
:::{video} unsupervised-intro :name: 13-1 - No Supervision :length: 2m51s :::
This video introduces the idea of matrix decomposition, which we can use to reduce the dimensionality of data points.
:::{index} matrix decomposition :::
:::{video} decomp :name: 13-2 - Decomposing Matrices :length: 17m22s :::
- The next notebook
- The PCADemo, demonstrating the PCA plots
- {py:class}`numpy.ndarray`
- {py:mod}`scipy.sparse`
- {py:func}`scipy.linalg.svd`
- {py:func}`scipy.sparse.linalg.svds`
- {py:class}`sklearn.decomposition.TruncatedSVD`
- {py:class}`sklearn.decomposition.PCA`
The Movie Decomposition notebook demonstrates matrix decomposition with movie data.
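If you want to see the basic mechanics outside the notebooks, here is a minimal sketch (on synthetic data, not the movie data) of reducing a matrix with `TruncatedSVD` and `PCA`; the sizes and component counts are arbitrary choices for illustration.

```python
# Minimal sketch: dimensionality reduction with truncated SVD and PCA.
# Synthetic data only; the notebooks work with real movie data.
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD

rng = np.random.default_rng(20201210)
X = rng.normal(size=(100, 20))    # 100 rows (items), 20 features

# TruncatedSVD decomposes the matrix as-is (it also accepts sparse input)
svd = TruncatedSVD(n_components=5)
X_svd = svd.fit_transform(X)      # shape (100, 5)

# PCA mean-centers the columns first, then decomposes
pca = PCA(n_components=5)
X_pca = pca.fit_transform(X)      # shape (100, 5)

print(X_svd.shape, X_pca.shape)
print('explained variance:', pca.explained_variance_ratio_)
```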
This video introduces the concept of clustering, another useful unsupervised learning technique.
:::{video} :name: 13-3 - Clustering :length: 6m56s :::
- {py:class}`sklearn.cluster.KMeans`
The clustering example notebook shows how to use the `KMeans` class.
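For reference, a bare-bones `KMeans` sketch (on made-up 2-D points, not the notebook's data) looks roughly like this; the cluster count and data are arbitrary.

```python
# Bare-bones K-means sketch on synthetic 2-D data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# two blobs of points around different centers
pts = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[3, 3], scale=0.5, size=(50, 2)),
])

km = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = km.fit_predict(pts)    # cluster assignment for each point
print(km.cluster_centers_)      # the two learned centers
```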
This video talks about vector spaces and transforms.
:::{video} :name: 13-4 - Vectors and Spaces :length: 7m27s :::
- Linear Algebra Done Right by Sheldon Axler
- Handbook of Linear Algebra (terse and comprehensive reference)
This video introduces the idea of entropy as a way to quantify information. It's something I want to make sure you've seen at least once by the end of the class.
:::{video} entropy :name: 13-5 - Information and Entropy :length: 10m31s :::
- An Introduction to Information Theory: Symbols, Signals & Noise by John R. Pierce
- Entropy (information theory) on Wikipedia
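If you want to compute entropy yourself, here is a small sketch (mine, not from the video) of the Shannon entropy of an observed discrete distribution, in bits:

```python
# Small sketch: Shannon entropy (in bits) of a discrete distribution.
import numpy as np

def entropy(counts):
    "Entropy in bits of the distribution given by nonnegative counts."
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]                  # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log2(p)))

print(entropy([1, 1]))            # fair coin: 1.0 bit
print(entropy([9, 1]))            # biased coin: about 0.47 bits
print(entropy([1, 1, 1, 1]))      # uniform over 4 outcomes: 2.0 bits
```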
Take the Week 13 quiz on {{LMS}}.
The Week 13 Exercise notebook demonstrates latent semantic analysis on paper abstracts and has an exercise to classify text into new or old papers.
It requires the {download}`chi-papers.csv <../resources/data/chi-papers.csv>` file, which is derived from the HCI Bibliography.
It is the abstracts from papers published at the CHI conference (the primary conference for human-computer interaction) over a period of nearly 40 years.
If you want to see how to create this file, see the Fetch CHI Papers example.
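If you want a starting point for the exercise, a rough sketch of latent semantic analysis (TF-IDF followed by a truncated SVD) looks something like the following; note that the `abstract` column name is my assumption here, so check the notebook and the CSV for the actual column names.

```python
# Rough LSA sketch: TF-IDF vectors reduced with truncated SVD.
# The 'abstract' column name is an assumption; check chi-papers.csv
# for the actual column names before relying on this.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

papers = pd.read_csv('chi-papers.csv')
abstracts = papers['abstract'].fillna('')

vec = TfidfVectorizer(stop_words='english', min_df=5)
X = vec.fit_transform(abstracts)    # sparse document-term matrix

svd = TruncatedSVD(n_components=50)
X_lsa = svd.fit_transform(X)        # dense, low-dimensional representation
print(X_lsa.shape)
```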
Assignment 6 is due {date}`wk13 sun long`.