[Add] texts for website in decoding chapter
MikeySaw committed Aug 12, 2024
1 parent 0adfe6b commit 32fd6b1
Showing 5 changed files with 21 additions and 17 deletions.
6 changes: 1 addition & 5 deletions content/chapters/08_decoding/08_01_intro.md
@@ -2,15 +2,11 @@
title: "Chapter 08.01: What is Decoding?"
weight: 8001
---

Here we introduce the concept of decoding. Given a prompt and a generative language model, how does it generate text? The model produces a probability distribution over all tokens in the vocabulary; the way it uses this distribution to select the next token is called a decoding strategy.


<!--more-->
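
To make this concrete, here is a minimal sketch of that last step, with a toy vocabulary and made-up logits standing in for a real model's output:

```python
import numpy as np

# Toy vocabulary and made-up next-token logits (a real model would
# produce these from the prompt)
vocab = ["the", "cat", "sat", "on", "mat"]
logits = np.array([2.0, 1.0, 0.5, 0.2, -1.0])

# Softmax turns the logits into a probability distribution over all tokens
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# A decoding strategy picks the next token from this distribution;
# the simplest one, greedy decoding, takes the argmax
next_token = vocab[int(np.argmax(probs))]
print(next_token)  # -> "the"
```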

### Lecture Slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-121-intro.pdf" >}}

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
9 changes: 5 additions & 4 deletions content/chapters/08_decoding/08_02_determ.md
@@ -2,15 +2,16 @@
title: "Chapter 08.02: Greedy & Beam Search"
weight: 8002
---


Here we introduce two deterministic decoding strategies: greedy search and beam search. Both methods are deterministic, which means there is no sampling involved when generating text. Greedy decoding always chooses the token with the highest probability, while beam search keeps track of several candidate sequences (beams) in parallel and returns the highest-scoring one.

<!--more-->
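
As a rough sketch of the difference, the toy implementation below keeps the `beam_width` best candidate sequences at every step; the step function returning fixed log-probabilities is an assumption that stands in for a real language model. Greedy decoding is the special case `beam_width=1`.

```python
import numpy as np

def beam_search(step_fn, bos_id, beam_width=3, max_len=5):
    # Each beam is a (token sequence, cumulative log-probability) pair
    beams = [([bos_id], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            log_probs = step_fn(seq)  # log-probs over the vocabulary
            # Expand each beam with its beam_width best continuations
            for tok in np.argsort(log_probs)[-beam_width:]:
                candidates.append((seq + [int(tok)], score + log_probs[tok]))
        # Keep only the beam_width highest-scoring sequences overall
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]

# Toy "model": fixed log-probabilities over a 5-token vocabulary (assumption)
rng = np.random.default_rng(0)
logits = rng.normal(size=5)
log_probs = logits - np.log(np.exp(logits).sum())
print(beam_search(lambda seq: log_probs, bos_id=0, beam_width=2))
```

Real implementations additionally handle end-of-sequence tokens and length normalization, which this sketch omits.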

### Lecture Slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-122-determ.pdf" >}}

### References
### Additional Resources

- [d2l book chapter about greedy and beam search](https://d2l.ai/chapter_recurrent-modern/beam-search.html)


- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
7 changes: 5 additions & 2 deletions content/chapters/08_decoding/08_03_sampling.md
@@ -2,7 +2,7 @@
title: "Chapter 08.03: Stochastic Decoding & CS/CD"
weight: 8003
---

In this chapter you will learn about methods beyond simple deterministic decoding strategies. We introduce sampling with temperature, where a temperature parameter is added to the softmax formula; top-k [1] and top-p [2] sampling, where you sample from a restricted set of the most probable tokens; and finally contrastive search [3] and contrastive decoding [4].


<!--more-->
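
The following sketch (with toy logits rather than a real model) shows how these strategies modify the distribution before drawing a token: temperature rescales the logits, top-k keeps only the k most probable tokens, and top-p keeps the smallest set of tokens whose cumulative probability exceeds p.

```python
import numpy as np

def sample(logits, temperature=1.0, top_k=None, top_p=None,
           rng=np.random.default_rng()):
    # Temperature scaling inside the softmax
    logits = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    if top_k is not None:
        # Zero out everything below the k-th largest probability
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)
    if top_p is not None:
        # Keep the smallest set of tokens whose cumulative mass reaches p
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: int(np.searchsorted(cumulative, top_p)) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = probs[keep]
        probs = mask
    probs /= probs.sum()  # renormalize, then sample
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.5, 0.2, -1.0]  # hypothetical next-token logits
print(sample(logits, temperature=0.7, top_k=3))
```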
@@ -13,4 +13,7 @@ weight: 8003

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
- [1] [Fan et al., 2018](https://arxiv.org/abs/1805.04833)
- [2] [Holtzman et al., 2019](https://arxiv.org/abs/1904.09751)
- [3] [Su et al., 2022](https://arxiv.org/abs/2210.14140)
- [4] [Li et al., 2023](https://arxiv.org/abs/2210.15097)
7 changes: 4 additions & 3 deletions content/chapters/08_decoding/08_04_hyper_param.md
@@ -2,7 +2,7 @@
title: "Chapter 08.04: Decoding Hyperparameters & Practical considerations"
weight: 8004
---

In this chapter you will learn how to use the different decoding strategies in practice. When using models from Hugging Face, you can choose the decoding strategy by specifying the hyperparameters of their `generate` method.


<!--more-->
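
For example, with the Hugging Face `transformers` library (using GPT-2 here purely as an illustration), the strategies from the previous chapters map onto `generate` arguments roughly as follows:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The meaning of life is", return_tensors="pt")

# Greedy decoding: do_sample=False, num_beams=1 (the defaults)
greedy = model.generate(**inputs, max_new_tokens=20)

# Beam search: keep num_beams candidate sequences per step
beams = model.generate(**inputs, max_new_tokens=20, num_beams=5)

# Stochastic decoding: sample with temperature, top-k and top-p
sampled = model.generate(
    **inputs, max_new_tokens=20,
    do_sample=True, temperature=0.7, top_k=50, top_p=0.95,
)
print(tokenizer.decode(sampled[0], skip_special_tokens=True))
```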
@@ -11,6 +11,7 @@ weight: 8004

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-124-hyper-param.pdf" >}}

### References
### Additional Resources

- [Jupyter notebook](https://github.com/slds-lmu/lecture_dl4nlp/blob/main/code-demos/decoding_examples.ipynb)

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
9 changes: 6 additions & 3 deletions content/chapters/08_decoding/08_05_eval_metrics.md
@@ -1,8 +1,8 @@
---
title: "Chapter 08.05: Decoding Hyperparameters & Practical considerations"
title: "Chapter 08.05: Evaluation Metrics"
weight: 8005
---

Here we answer the question of how to evaluate generated outputs in open-ended text generation. We first explain **BLEU** [1] and **ROUGE** [2], which are metrics for tasks with a gold reference. Then we introduce **diversity**, **coherence** [3], and **MAUVE** [4], which are metrics for tasks without a gold reference, such as open-ended text generation. You will also learn about human evaluation.


<!--more-->
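
As a small illustration, the reference-based metrics can be computed with the Hugging Face `evaluate` library; this sketch assumes the `evaluate` and `rouge_score` packages are installed, and the prediction/reference strings are made up:

```python
import evaluate

# Hypothetical model output and gold reference (made-up strings)
predictions = ["the cat sat on the mat"]
references = ["the cat is sitting on the mat"]

# BLEU: modified n-gram precision against the reference
bleu = evaluate.load("bleu")
print(bleu.compute(predictions=predictions, references=references))

# ROUGE: recall-oriented n-gram / longest-common-subsequence overlap
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))
```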
@@ -13,4 +13,7 @@ weight: 8005

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
- [1] [Papineni et al., 2002](https://aclanthology.org/P02-1040.pdf)
- [2] [Lin, 2004](https://aclanthology.org/W04-1013/)
- [3] [Su et al., 2022](https://arxiv.org/abs/2202.06417)
- [4] [Pillutla et al., 2021](https://arxiv.org/abs/2102.01454)
