From 0bf9b70b342eb104efeedc14f91dad563c71fe18 Mon Sep 17 00:00:00 2001 From: Greg Wilson Date: Fri, 7 Jul 2023 20:01:19 -0400 Subject: [PATCH] feat: show glossary definitions on mouseover Closes #95 --- docs/archive/index.html | 28 +++--- docs/archive/slides/index.html | 18 ++-- docs/bib/index.html | 2 +- docs/binary/index.html | 44 ++++----- docs/binary/slides/index.html | 26 +++--- docs/bonus/index.html | 12 +-- docs/build/index.html | 32 +++---- docs/build/slides/index.html | 16 ++-- docs/check/index.html | 22 ++--- docs/check/slides/index.html | 20 ++-- docs/conduct/index.html | 2 +- docs/contents/index.html | 2 +- docs/contrib/index.html | 2 +- docs/credits/index.html | 2 +- docs/db/index.html | 16 ++-- docs/db/slides/index.html | 18 ++-- docs/debugger/index.html | 16 ++-- docs/debugger/slides/index.html | 12 +-- docs/dup/index.html | 22 ++--- docs/dup/slides/index.html | 16 ++-- docs/finale/index.html | 2 +- docs/finale/slides/index.html | 2 +- docs/ftp/index.html | 22 ++--- docs/ftp/slides/index.html | 24 ++--- docs/func/index.html | 28 +++--- docs/func/slides/index.html | 20 ++-- docs/glob/index.html | 30 +++--- docs/glob/slides/index.html | 10 +- docs/glossary/index.html | 128 +++++++++++++------------- docs/http/index.html | 26 +++--- docs/http/slides/index.html | 32 +++---- docs/index.html | 2 +- docs/interp/index.html | 30 +++--- docs/interp/slides/index.html | 16 ++-- docs/intro/index.html | 2 +- docs/intro/slides/index.html | 2 +- docs/intro/syllabus_linear.svg | 12 +-- docs/intro/syllabus_regular.svg | 12 +-- docs/layout/index.html | 20 ++-- docs/layout/slides/index.html | 6 +- docs/license/index.html | 2 +- docs/lint/index.html | 8 +- docs/lint/slides/index.html | 16 ++-- docs/mock/index.html | 14 +-- docs/mock/slides/index.html | 2 +- docs/oop/index.html | 20 ++-- docs/oop/slides/index.html | 12 +-- docs/pack/index.html | 24 ++--- docs/pack/slides/index.html | 18 ++-- docs/parse/index.html | 20 ++-- docs/parse/slides/index.html | 14 +-- 
docs/perf/index.html | 30 +++--- docs/perf/slides/index.html | 6 +- docs/persist/index.html | 22 ++--- docs/persist/slides/index.html | 2 +- docs/slides/index.html | 2 +- docs/syllabus/index.html | 2 +- docs/template/index.html | 14 +-- docs/template/slides/index.html | 6 +- docs/test/index.html | 26 +++--- docs/test/slides/index.html | 16 ++-- docs/undo/index.html | 8 +- docs/undo/slides/index.html | 6 +- docs/viewer/index.html | 18 ++-- docs/viewer/slides/index.html | 18 ++-- docs/vm/index.html | 28 +++--- docs/vm/slides/index.html | 24 ++--- info/glossary.yml | 6 ++ lib/mccole/extensions/glossary.py | 57 ++++++++---- lib/mccole/extensions/util.py | 7 +- lib/mccole/templates/definitions.html | 2 +- src/binary/slides.html | 4 +- src/build/slides.html | 2 +- src/db/slides.html | 4 +- src/dup/slides.html | 2 +- src/ftp/slides.html | 2 +- src/http/slides.html | 2 +- src/interp/slides.html | 4 +- src/intro/syllabus_linear.svg | 12 +-- src/intro/syllabus_regular.svg | 12 +-- src/pack/slides.html | 2 +- src/viewer/slides.html | 2 +- 82 files changed, 641 insertions(+), 611 deletions(-) diff --git a/docs/archive/index.html b/docs/archive/index.html index 990e40a61..4c029c0cb 100644 --- a/docs/archive/index.html +++ b/docs/archive/index.html @@ -4,7 +4,7 @@ - + @@ -361,7 +361,7 @@

Chapter 10: A File Archiver

but we’d rather not have to. We’d also like to be able to see what we’ve changed and to collaborate with other people.

-

A version control system +

A version control system like Git solves all of these problems at once. It keeps track of changes to files @@ -551,14 +551,14 @@

Section 10.3: Tracking Backups

keeps track of which files have and haven’t been backed up already. It stores backups in a directory that contains files like abcd1234.bck (the hash followed by .bck) -and creates manifests +and creates manifests that describe the content of each snapshot. A real system would support remote storage as well so that losing one hard drive wouldn’t mean losing all our work, so we need to design our system with multiple back ends in mind.

For now, we will store manifests in CSV files named ssssssssss.csv, -where ssssssssss is the UTC timestamp +where ssssssssss is the UTC timestamp of the backup’s creation.

Time of Check/Time of Use

@@ -569,15 +569,15 @@

Time of Check/Time of Use

are the result of programmers assuming things weren’t going to happen.

We could try to avoid this problem by using a two-part naming scheme ssssssss-a.csv, ssssssss-b.csv, and so on, -but this leads to a race condition -called time of check/time of use. +but this leads to a race condition +called time of check/time of use. If two users run the backup tool at the same time, they will both see that there isn’t a file (yet) with the current timestamp, so they will both try to create the first one. -Ensuring that multi-file updates are atomic operations +Ensuring that multi-file updates are atomic operations (i.e., that they always behave as a single indivisible step) is a hard problem; -file locking is a common approach, +file locking is a common approach, but complete solutions are out of the scope of this book.
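One narrow piece of the time of check/time of use problem, creating a single manifest file, can be sketched with Python's exclusive-creation mode. This is a minimal illustration, not the book's implementation: the helper name `create_manifest_exclusive` and the `filename,hash` header row are assumptions made for the example, and exclusive creation may be less reliable on some network filesystems.

```python
import os
import tempfile


def create_manifest_exclusive(backup_dir, timestamp):
    """Create `<timestamp>.csv`, returning its path, or None if it already exists.

    Mode "x" asks the operating system to create the file and to fail if
    it is already there, in one step, so there is no gap between the
    check and the creation. (Hypothetical helper for illustration only.)
    """
    path = os.path.join(backup_dir, f"{timestamp}.csv")
    try:
        with open(path, "x") as writer:
            writer.write("filename,hash\n")  # assumed header row
        return path
    except FileExistsError:
        # Someone else created this manifest first.
        return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as backup_dir:
        print(create_manifest_exclusive(backup_dir, 1688774479) is not None)
        print(create_manifest_exclusive(backup_dir, 1688774479))
```

The second call reports a collision instead of silently overwriting the first manifest. It does not make multi-file updates atomic, which is the harder problem the text describes.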

This function creates a backup—or rather, @@ -593,8 +593,8 @@

Time of Check/Time of Use

Writing a high-level function first and then filling in the things it needs -is called successive refinement -or top-down design. +is called successive refinement +or top-down design. In practice, nobody designs code and then implements the design without changes unless they have solved closely-related problems before [Petre2016]. @@ -638,7 +638,7 @@

Time of Check/Time of Use

We will look at ways to fix this in the exercises as well.

What Time Is It?

-

Our backup function relies on a helper function +

Our backup function relies on a helper function called current_time that does nothing but call time.time from Python’s standard library:

@@ -698,7 +698,7 @@

What Time Is It?

Section 10.4: Refactoring

Now that we have a better idea of what we’re doing, -we can go back and create a base class +we can go back and create a base class that prescribes the general steps in creating a backup:

class Archive:
@@ -727,7 +727,7 @@ 

Section 10.4: Refactoring

it makes life easier when we want to write archivers that behave the same way but work differently. For example, -we could create an archiver that compresses +we could create an archiver that compresses files as it archives them by deriving a new class from ArchiveLocal and changing only its _copy_files method.

@@ -776,7 +776,7 @@

JSON Manifests

  • Write another program called migrate.py that converts a set of manifests from CSV to JSON. - (The program’s name comes from the term data migration.)

    + (The program’s name comes from the term data migration.)

  • Modify backup.py programs so that each manifest stores the user name of the person who created it diff --git a/docs/archive/slides/index.html b/docs/archive/slides/index.html index ade636ccd..38af8e32f 100644 --- a/docs/archive/slides/index.html +++ b/docs/archive/slides/index.html @@ -4,7 +4,7 @@ - + @@ -48,7 +48,7 @@

    A File Archiver

    - Want to save snapshots of work in progress -- Create a simple version control system +- Create a simple version control system - And show how to test it using mock objects (Chapter 9) @@ -62,7 +62,7 @@

    A File Archiver

    - Handles renaming -- Then create a manifest to show +- Then create a manifest to show what unique blocks of bytes had what names when --- @@ -131,7 +131,7 @@

    A File Archiver

    - But we want to test what happens when they change, which makes things complicated to maintain -- Use a mock object (Chapter 9) +- Use a mock object (Chapter 9) instead of the real filesystem --- @@ -229,7 +229,7 @@

    A File Archiver

    - Backed-up files are `abcd1234.bck` - Manifests are `ssssssssss.csv`, - where `ssssssssss` is the UTC timestamp + where `ssssssssss` is the UTC timestamp --- @@ -239,7 +239,7 @@

    A File Archiver

    - Manifest naming scheme fails if we try to create two backups in less than one second -- A time of check/time of use race condition +- A time of check/time of use race condition - May seem unlikely, but many bugs and security holes seemed unlikely to their creators @@ -257,7 +257,7 @@

    A File Archiver

    ``` -- An example of successive refinement +- An example of successive refinement --- @@ -340,7 +340,7 @@

    A File Archiver

    ## Refactoring -- Create a base class with the general steps +- Create a base class with the general steps ```py class Archive: @@ -355,7 +355,7 @@

    A File Archiver

    ``` -- Derive a child class to do local archiving +- Derive a child class to do local archiving - Convert functions we have built so far into methods diff --git a/docs/bib/index.html b/docs/bib/index.html index fb5a7ffc7..2f6a13f8f 100644 --- a/docs/bib/index.html +++ b/docs/bib/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/binary/index.html b/docs/binary/index.html index e22406056..7c383f051 100644 --- a/docs/binary/index.html +++ b/docs/binary/index.html @@ -4,7 +4,7 @@ - + @@ -377,9 +377,9 @@

    Section 17.1: Integers

    one positive and one negative. More importantly, the hardware needed to do arithmetic -on this sign and magnitude representation +on this sign and magnitude representation is more complicated than the hardware needed for another scheme -called two’s complement. +called two’s complement. Instead of mirroring positive values, two’s complement rolls over when going below zero like an odometer. For example, @@ -553,7 +553,7 @@

    Section 17.1: Integers

  • Section 17.2: Bitwise Operations

    Like most languages based on C, -Python provides bitwise operations +Python provides bitwise operations for working directly with 1’s and 0’s in memory: & (and), | (or), @@ -561,7 +561,7 @@

    Section 17.2: Bitwise Operations

    ~ (not). & yields a 1 only if both its inputs are 1’s, while | yields 1 if either or both are 1. -^, called exclusive or or “xor” (pronounced “ex-or”), +^, called exclusive or or “xor” (pronounced “ex-or”), produces 1 if either but not both of its arguments are 1; putting it another way, ^ produces 0 if its inputs are the same and 1 if they are different. @@ -619,7 +619,7 @@

    Section 17.2: Bitwise Operations

    the other bits will be left as they are. Similarly, to set a bit to zero, -create a mask in which that bit is 0 and the others are 1, +create a mask in which that bit is 0 and the others are 1, then use & to combine the two. To make things easier to read, programmers often set a single bit, @@ -630,7 +630,7 @@

    Section 17.2: Bitwise Operations

    val = val & mask # clears this ^ bit
    -

    Python also has bit shifting operators +

    Python also has bit shifting operators that move bits left or right. Shifting the bits 0110 left by one place produces 1100, while shifting it right by one place produces 0011. @@ -658,9 +658,9 @@

    Section 17.2: Bitwise Operations

    you have to handle the top bit yourself.
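The mask and shift operations described in this section can be tried out in a few lines. This is a minimal sketch; the helper names `set_bit` and `clear_bit` are illustrative and do not come from the chapter.

```python
def set_bit(value, position):
    """Turn on the bit at `position` (0 is the lowest bit) using |."""
    return value | (1 << position)


def clear_bit(value, position):
    """Turn off the bit at `position` using & with an inverted mask."""
    return value & ~(1 << position)


if __name__ == "__main__":
    print(format(set_bit(0b0100, 0), "04b"))    # lowest bit set
    print(format(clear_bit(0b0111, 1), "04b"))  # bit 1 cleared
    print(format(0b0110 << 1, "04b"))           # shift left by one place
    print(format(0b0110 >> 1, "04b"))           # shift right by one place
    print(format(0b0110 ^ 0b0101, "04b"))       # exclusive or
```

Python integers are not limited to a fixed width, so a left shift never drops the top bit on its own; code that needs fixed-width behavior has to mask the result explicitly.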

    Section 17.3: Text

    The rules for storing text make integers look simple. -By the early 1970s most programs used ASCII, +By the early 1970s most programs used ASCII, which represented unaccented Latin characters using the numbers from 32 to 127. -(The numbers 0 to 31 were used for control codes +(The numbers 0 to 31 were used for control codes such as newline, carriage return, and bell.) Since computers use 8-bit bytes and the numbers 0–127 only need 7 bits, programmers were free to use the numbers 128–255 for other characters. @@ -669,10 +669,10 @@

    Section 17.3: Text

    non-Latin characters, graphic characters like boxes, and so on. -The chaos was eventually tamed by the ANSI standard +The chaos was eventually tamed by the ANSI standard which (for example) defined the value 231 to mean the character “ç”.

    A standard that specifies how characters are represented in memory -is called a character encoding. +is called a character encoding. Unfortunately, the encoding defined by the ANSI standard only solved a small part of a large problem. It didn’t include characters from Turkish, Devanagari, and many other alphabets, @@ -683,20 +683,20 @@

    Section 17.3: Text

  • existing text files using ANSI would have to be transcribed, and
  • documents would be two or four times larger.
  • -

    The solution was a new two-part standard called Unicode. -The first part defined a code point for every character: +

    The solution was a new two-part standard called Unicode. -The first part defined a code point for every character: +The first part defined a code point for every character: U+0041 for an upper-case Latin “A”, U+2605 for a black star, and so on. (The Unicode Consortium site offers a complete list.)

    The second part defined ways to store these values in memory. -The simplest of these is UTF-32, +The simplest of these is UTF-32, which stores every character as a 32-bit number. This scheme wastes a lot of memory if the text is written in a Western European language, since it uses four times as much storage as is absolutely necessary, but it’s easy to process.
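The difference between a code point and its stored form can be checked with Python's built-in codecs. This is a small sketch; the function name `describe` is made up for the example, and it uses the `utf-32-le` codec so that no byte-order mark is added to the count.

```python
def describe(text):
    """Report the code points in `text` and its size in two encodings."""
    return {
        # ord() gives the code point of a single character.
        "code_points": [f"U+{ord(ch):04X}" for ch in text],
        # UTF-32 always uses four bytes per character.
        "utf32_bytes": len(text.encode("utf-32-le")),
        # UTF-8 uses one to four bytes per character.
        "utf8_bytes": len(text.encode("utf-8")),
    }


if __name__ == "__main__":
    print(describe("A"))
    print(describe("\u2605"))  # black star
```

For plain Latin text, the UTF-32 count is four times the UTF-8 count, which is the waste the paragraph above refers to.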

    -

    The most popular encoding is UTF-8, -which is variable length. +

    The most popular encoding is UTF-8, +which is variable length. Every code point from 0 to 127 is stored in a single byte whose high bit is 0, just as it was in the original ASCII standard. If the top bit in the byte is 1, @@ -715,7 +715,7 @@

    Section 17.3: Text

    But that’s not all: every byte that’s a continuation of a character starts with the bits 10. -(Such bytes are, unsurprisingly, called continuation bytes.) +(Such bytes are, unsurprisingly, called continuation bytes.) This rule means that if we look at any byte in a string we can immediately tell if it’s the start of a character or the continuation of a character. @@ -795,7 +795,7 @@

    Section 17.4: And Now, Persistence

    put each value in a data structure that keeps track of its type along with a bit of extra administrative information (Figure 17.1). -Something stored this way is called a boxed value; +Something stored this way is called a boxed value; this extra data allows the language to do introspection and much more.

    Boxed values @@ -881,7 +881,7 @@

    Section 17.4: And Now, Persistence

    What is \x1f and why is it in our data? If Python finds a character in a string that doesn’t have a printable representation, -it prints a 2-digit escape sequence in hexadecimal. +it prints a 2-digit escape sequence in hexadecimal. Python is therefore telling us that our string contains the eight bytes ['\x1f', '\x00', '\x00', '\x00', 'A', '\x00', '\x00', '\x00']. @@ -996,7 +996,7 @@

    Section 17.4: And Now, Persistence

    The unpacking function is analogous. -We break the memory buffer +We break the memory buffer into a header that’s exactly four bytes long (i.e., the right size for an integer) and a body made up of whatever’s left. @@ -1023,9 +1023,9 @@

    Section 17.4: And Now, Persistence

    hello!
     
    -

    This is called little-endian and is used by all Intel processors. +

    This is called little-endian and is used by all Intel processors. Some other processors put the most significant byte first, -which is called big-endian. +which is called big-endian. There are pro’s and con’s to both, which we won’t go into here. What you do need to know is that if you move data from one architecture to another, it’s your responsibility to flip the bytes around, diff --git a/docs/binary/slides/index.html b/docs/binary/slides/index.html index d3e21fb8d..c1a86c2e0 100644 --- a/docs/binary/slides/index.html +++ b/docs/binary/slides/index.html @@ -4,7 +4,7 @@ - + @@ -82,9 +82,9 @@

    Binary Data

    ## Hexadecimal -- More common to use hexadecimal (base 16) +- More common to use hexadecimal (base 16) - Digits are 0123456789ABCDEF -- Each digit represents 4 bits (a nybble) +- Each digit represents 4 bits (half a byte) ```py print(0x2D) # (2 * 16) + 13 @@ -100,7 +100,7 @@

    Binary Data

    ## Negative Numbers -- Could use sign and magnitude +- Could use sign and magnitude - `0100` is 4 - `1100` is -4 - But: @@ -111,7 +111,7 @@

    Binary Data

    ## Two's Complement -- Two's complement wraps around like an odometer +- Two's complement wraps around like an odometer
    @@ -256,7 +256,7 @@

    Binary Data

    - So Python uses two-digit hex representation `\xPQ` -- `\x00` is a null byte (value 0) +- `\x00` is a null byte (value 0) - Easy to miss the actual `A` between one `\x00` and the next @@ -266,7 +266,7 @@

    Binary Data

    ## Which Bytes Where -- Big-endian vs. little-endian ordering +- Big-endian vs. little-endian ordering
    Big-endian vs. little-endian @@ -379,7 +379,7 @@

    Binary Data

    ## Bytes and Text - ASCII originally defined 128 7-bit characters - - 0–31 were control codes + - 0–31 were control codes - Since bytes have 8 bits, programmers used the values 128–255 however they wanted - ANSI standard defined (for example) 231 (decimal) to be "ç" - But what about Turkish, Devanagari, kanji, hieroglyphics, …? @@ -391,12 +391,12 @@

    Binary Data

    ## Unicode -- Define a code point for every character +- Define a code point for every character - U+0041 for an upper-case Latin "A" - U+2605 for a black star ★ -- Define several character encodings +- Define several character encodings - UTF-32 uses 32 bits for every character -- Most popular is UTF-8 +- Most popular is UTF-8 - Code points 0–127 are stored in a single byte with a leading 0 - If the top bit is 1, the number of 1's tells UTF-8 how many bytes there are in the character @@ -409,7 +409,7 @@

    Binary Data

    - The next two bits mean "this is a three-byte character" - The first 0 separates the header from the start of the character - The final `1101` is the first four bits of the character -- Every continuation byte starts with `10` +- Every continuation byte starts with `10` - So we can tell if a byte is in the middle of a character --- @@ -435,7 +435,7 @@

    Binary Data

    - `open(filename, "r")` converts bytes to characters - And converts Windows line endings `\r\n` to Unix `\n` -- Use `open(filename, "rb")` to read in binary mode +- Use `open(filename, "rb")` to read in binary mode --- diff --git a/docs/bonus/index.html b/docs/bonus/index.html index 5f77a3d67..87d8e2524 100644 --- a/docs/bonus/index.html +++ b/docs/bonus/index.html @@ -4,7 +4,7 @@ - + @@ -348,7 +348,7 @@

    Appendix B: Bonus Material

    Each of the chapters in this book has been designed to fit into an hour (and a bit). This appendix presents material that extends core ideas -but would bust that attention budget.

    +but would bust that attention budget.

    Section B.1: Using Function Attributes

    This material extends Chapter 6.

    Since functions are objects, @@ -740,8 +740,8 @@

    Go To the Source

    please read [Goldberg1991] for more detail.

    Floating point numbers are represented by a sign, -a mantissa, -and an exponent. +a mantissa, +and an exponent. In a 32-bit word the IEEE 754 standard calls for 1 bit of sign, 23 bits for the mantissa, @@ -789,12 +789,12 @@

    Go To the Source

    This observation leads to a couple of useful definitions:

    • -

      The absolute error in an approximation +

      The absolute error in an approximation is the absolute value of the difference between the approximation and the actual value.

    • -

      The relative error +

      The relative error is the ratio of the absolute error to the absolute value we’re approximating.
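The two definitions translate directly into code. This is a minimal sketch; the function names are chosen for the example, and the relative error is undefined when the actual value is zero, so the function lets Python raise `ZeroDivisionError` in that case.

```python
import math


def absolute_error(approx, actual):
    """Absolute value of the difference between approximation and actual."""
    return abs(approx - actual)


def relative_error(approx, actual):
    """Absolute error divided by the magnitude of the actual value."""
    return absolute_error(approx, actual) / abs(actual)


if __name__ == "__main__":
    # Approximating pi by 3.14.
    print(absolute_error(3.14, math.pi))
    print(relative_error(3.14, math.pi))
```

An absolute error of 1 is large when the actual value is 10 and negligible when it is a billion, which is why the relative error is usually the more informative of the two.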

    • diff --git a/docs/build/index.html b/docs/build/index.html index e9a52a848..1ec55d6f2 100644 --- a/docs/build/index.html +++ b/docs/build/index.html @@ -4,7 +4,7 @@ - + @@ -366,23 +366,23 @@

      Chapter 19: A Build Manager

      but normalize.py takes a long time to run. How can we regenerate the files we need, but only when we need them?

      -

      The standard answer is to use a build manager +

      The standard answer is to use a build manager to keep track of which files depend on which and what actions to take to create or update files. The first tool of this kind was Make, but many others now exist (such as Snakemake). -If a target is stale -with respect to any of its dependencies, -the build manager runs a recipe to refresh it.

      +If a target is stale +with respect to any of its dependencies, +the build manager runs a recipe to refresh it.

      The build manager runs recipes in an order that respects dependencies, and it only runs each recipe once (if at all). In order for this to be possible, -targets and dependencies must form a directed acyclic graph, +targets and dependencies must form a directed acyclic graph, i.e., -there cannot be a cycle of links +there cannot be a cycle of links leading from a node back to itself. The build manager constructs -a topological ordering of that graph, +a topological ordering of that graph, i.e., arranges nodes so that each one comes after everything it depends on, and then builds what it needs to in that order @@ -394,12 +394,12 @@

      Chapter 19: A Build Manager

      A Bit of History

      -

      Make was created to manage programs in compiled languages +

      Make was created to manage programs in compiled languages like C and Java, which have to be translated into lower-level forms before they can run. There are usually two stages to the translation: compiling each source file into some intermediate form, -and then linking the compiled modules +and then linking the compiled modules to each other and to libraries to create a runnable program. If a source file hasn’t changed, @@ -517,7 +517,7 @@

      Section 19.3: Topological Sorting

    • If at any point the graph isn’t empty but nothing is available, - we have found a circular dependency, + we have found a circular dependency, so we report the problem and fail.
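For comparison, the same ordering and cycle check can be sketched with `graphlib` from Python's standard library (available since Python 3.9). This is not the chapter's implementation: the wrapper name `build_order` and the choice to re-raise cycles as `ValueError` are assumptions made for this example.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies):
    """Return targets ordered so each comes after everything it depends on.

    `dependencies` maps each target to the set of targets it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # The detected cycle is the second element of the exception's args.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


if __name__ == "__main__":
    print(build_order({"A": {"B", "C"}, "B": {"C"}, "C": set()}))
    try:
        build_order({"A": {"B"}, "B": {"A"}})
    except ValueError as error:
        print("failed:", error)
```

When several targets are ready at the same time, `graphlib` does not promise a particular order among them, so a build manager that needs reproducible output would still have to impose one.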

    • @@ -577,7 +577,7 @@

      Section 19.4: A Better Design

      to indicate a problem.

    • -

      Our topological sort isn’t stable, +

      Our topological sort isn’t stable, i.e., there’s no way to predict the order in which two “equal” nodes will be added to the ordering. @@ -728,7 +728,7 @@

      Section 19.4: A Better Design

      How We Actually Did It

      Our final design uses -the Template Method pattern: +the Template Method pattern: a method in a parent class defines the overall order of operations, while child classes implement those operations without changing the control flow. @@ -738,7 +738,7 @@

      How We Actually Did It

      Instead, as we were trying to create a class that loaded and used timestamps, we did a bit of reorganizing in the parent class -to give ourselves the affordances we needed. +to give ourselves the affordances we needed. Software design almost always works this way: the first two or three times we try to use or extend something, we discover changes that would make those tasks easier. @@ -805,11 +805,11 @@

      Using Hashes

    • Dry Run

      -

      A dry run of a build shows the rules that would be executed +

      A dry run of a build shows the rules that would be executed but doesn’t actually execute them. Modify the build system in this chapter so that it can do dry runs.

      Phony Targets

      -

      A phony target is one that doesn’t correspond to a file. +

      A phony target is one that doesn’t correspond to a file. Developers often put phony targets in build files to give themselves an easy way to re-run tests, check code style, diff --git a/docs/build/slides/index.html b/docs/build/slides/index.html index 91f2a6e01..24537be7a 100644 --- a/docs/build/slides/index.html +++ b/docs/build/slides/index.html @@ -4,7 +4,7 @@ - + @@ -59,16 +59,16 @@

      A Build Manager

      ## Make -- Build managers +- Build managers keep track of which files depend on which - First tool of this kind was [Make][gnu_make] - Many others now exist (e.g., [Snakemake][snakemake]) -- If a target is stale - with respect to any of its dependencies, - run a recipe to refresh it +- If a target is stale + with respect to any of its dependencies, + run a recipe to refresh it - Run recipes in order @@ -78,9 +78,9 @@

      A Build Manager

      ## Terminology -- Targets and dependencies must form a directed acyclic graph +- Targets and dependencies must form a directed acyclic graph -- A topological ordering of a graph +- A topological ordering of a graph arranges nodes so that each one comes after everything it depends on
      @@ -232,7 +232,7 @@

      A Build Manager

      - So collect them and return an ordered list of commands 3. `assert` isn't a friendly way to handle user errors - Raise `ValueError` -4. Topological sort isn't stable +4. Topological sort isn't stable - `dict` is ordered but `set` is not - So sort node names when appending 5. We might want to add other keys to rules diff --git a/docs/check/index.html b/docs/check/index.html index c4b158226..b39997103 100644 --- a/docs/check/index.html +++ b/docs/check/index.html @@ -4,7 +4,7 @@ - + @@ -363,15 +363,15 @@

      Chapter 11: An HTML Validator

      Doing this prepares us for building a tool to generate pages (Chapter 12) and another to check the structure and style of our code (Chapter 13).

      Section 11.1: HTML and the DOM

      -

      An HTML document is made up of elements and text. +

      An HTML document is made up of elements and text. (It can actually contain other things, but we’ll ignore those for now.) -Elements are represented using tags enclosed in < and >. -An opening tag like <p> starts an element, -while a closing tag like </p> ends it. +Elements are represented using tags enclosed in < and >. +An opening tag like <p> starts an element, +while a closing tag like </p> ends it. If the element is empty, -we can use a self-closing tag like <br/> +we can use a self-closing tag like <br/> to save some typing. -Opening and self-closing tags can have attributes, +Opening and self-closing tags can have attributes, which are written as key="value".

      Tags must be properly nested, which has two consequences:

      @@ -380,7 +380,7 @@

      Section 11.1: HTML and the DOM

      Something like <a><b></a></b> is illegal.

    • -

      A document’s elements form a tree of nodes and text +

      A document’s elements form a tree of nodes and text like the one shown in Figure 11.1.

    • @@ -390,7 +390,7 @@

      Section 11.1: HTML and the DOM

      The objects that represent the nodes and text in an HTML tree -are called the Document Object Model or DOM. +are called the Document Object Model or DOM. Hundreds of tools have been written to convert HTML text to DOM; our favorite is a Python library called Beautiful Soup, which can handle messy real-world documents @@ -411,7 +411,7 @@

      Section 11.1: HTML and the DOM

      Tag nodes have two properties name and children to tell us what element the tag represents -and to give us access to the node’s children, +and to give us access to the node’s children, i.e., the nodes below it in the tree. We can therefore write a short recursive function @@ -545,7 +545,7 @@

      Section 11.2: The Visitor Pattern

      we should turn what we’ve learned into something reusable so that we never have to write it again. In this case, -we can use the Visitor design pattern. +we can use the Visitor design pattern. A visitor is a class that knows how to get to each element of a data structure and call a user-defined method when it gets there. Our visitor will have three methods: diff --git a/docs/check/slides/index.html b/docs/check/slides/index.html index 241ff7474..d6488c081 100644 --- a/docs/check/slides/index.html +++ b/docs/check/slides/index.html @@ -4,7 +4,7 @@ - + @@ -57,16 +57,16 @@

      An HTML Validator

      ## HTML as Text -- HTML documents contain tags and text +- HTML documents contain tags and text -- An opening tag like `

      ` starts an element +- An opening tag like `

      ` starts an element -- A closing tag like `

      ` ends the element +- A closing tag like `

      ` ends the element - If the element is empty, - we can use a self-closing tag like `
      ` + we can use a self-closing tag like `
      ` -- Opening and self-closing tags can have attributes +- Opening and self-closing tags can have attributes - Written as `key="value"` (with some variations) @@ -77,9 +77,9 @@

      An HTML Validator

      ## HTML as a Tree -- HTML elements form a tree of nodes and text +- HTML elements form a tree of nodes and text -- The objects that represent these make up the Document Object Model (DOM) +- The objects that represent these make up the Document Object Model (DOM)
      DOM tree @@ -260,7 +260,7 @@

      Main Title

      ## The Visitor Pattern -- A visitor is a class +- A visitor is a class that knows how to get to each element of a data structure - Derive a class of our own that does something for those elements @@ -350,7 +350,7 @@

      Main Title

      ## Find Style Violations -- Compare each parent-child combination against a manifest +- Compare each parent-child combination against a manifest ```yml body: diff --git a/docs/conduct/index.html b/docs/conduct/index.html index 23d81e6ea..f4863a67d 100644 --- a/docs/conduct/index.html +++ b/docs/conduct/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/contents/index.html b/docs/contents/index.html index 378c0ed5c..9d8e46e51 100644 --- a/docs/contents/index.html +++ b/docs/contents/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/contrib/index.html b/docs/contrib/index.html index 93f4fbe75..49c8741de 100644 --- a/docs/contrib/index.html +++ b/docs/contrib/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/credits/index.html b/docs/credits/index.html index f47a14c08..2270eadc9 100644 --- a/docs/credits/index.html +++ b/docs/credits/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/db/index.html b/docs/db/index.html index 7ba2e2dff..a6df2b12d 100644 --- a/docs/db/index.html +++ b/docs/db/index.html @@ -4,7 +4,7 @@ - + @@ -367,7 +367,7 @@

      Chapter 18: A Database

      to be able to get at our data, which might be easier if we choose a different storage format.

      This chapter therefore builds a very simple -log-structured database. +log-structured database. The phrase “log-structured” means that it records a log of operations, i.e., every new record is appended to the end of the database. @@ -375,7 +375,7 @@

      Chapter 18: A Database

      but this is one of the easiest to understand.

      Section 18.1: Starting Point

      Our starting point is -a simple key-value store +a simple key-value store that lets us save records and look them up later. The user must provide a function that takes a record and returns its key:

      @@ -573,7 +573,7 @@

      Tradeoffs

      using the techniques of Chapter 17, but to make our test and sample output a little more readable, we will pack numbers as strings -with a null byte \0 between each string:

      +with a null byte \0 between each string:

      @staticmethod
       def pack(record):
      @@ -698,13 +698,13 @@ 

      Section 18.4: Playing With Blocks

      in the same way that we saved each version of a file in Chapter 10. However, this strategy won’t give us as much of a performance boost as we’d like. -The reason is that computers do file I/O in pages +The reason is that computers do file I/O in pages that are typically two or four kilobytes in size. If we want to read a single byte, the operating system actually reads a full page and then gives us just the byte we asked for.

      A more efficient strategy is therefore -to group records together in blocks, +to group records together in blocks, each of which is the same size as a page, and an index in memory to tell us which records are in which blocks. @@ -938,7 +938,7 @@

      Section 18.6: Cleaning Up

      to clean up blocks that are no longer needed because we have a more recent version of every record they contain. Reclaiming unused space this way is another form of -garbage collection. +garbage collection. Python and most other modern languages do it automatically to recycle unused memory, but it’s our responsibility to do it for the files our database creates.

      @@ -986,7 +986,7 @@

      Section 18.6: Cleaning Up

      Generate a new in-memory index.

      -

      This method doesn’t compact storage, +

      This method doesn’t compact storage, i.e., it doesn’t move records around to get rid of stale records within blocks. diff --git a/docs/db/slides/index.html b/docs/db/slides/index.html index 38d439870..302441171 100644 --- a/docs/db/slides/index.html +++ b/docs/db/slides/index.html @@ -4,7 +4,7 @@ - + @@ -52,13 +52,13 @@

      A Database

      - And interoperability across languages -- Create a simple log-structured database +- Create a simple log-structured database --- ## Starting Point -- A simple key-value store that lets us look things up +- A simple key-value store that lets us look things up - User must provide a function that gets key from record @@ -209,7 +209,7 @@

      A Database

      ## Refactor Database -- Corresponding change to use a static method +- Corresponding change to use a static method of the record class ```py @@ -266,7 +266,7 @@

      A Database

      ``` -- Save as strings with null byte `\0` between them +- Save as strings with null byte `\0` between them - A real implementation would pack as binary (Chapter 17) @@ -339,7 +339,7 @@

      A Database

      ``` -- Needs two helper methods +- Needs two helper methods --- @@ -368,9 +368,9 @@

      A Database

      ## Saving Blocks -- Save *N* records per block +- Save *N* records per block -- Keep the index in memory +- Keep the index in memory - When writing, only modify one block (smaller and faster) @@ -604,7 +604,7 @@

      A Database

      ## Next Steps -- Compact storage periodically +- Compact storage periodically - Use other data structures for indexing diff --git a/docs/debugger/index.html b/docs/debugger/index.html index c42cf3882..d50154d12 100644 --- a/docs/debugger/index.html +++ b/docs/debugger/index.html @@ -4,7 +4,7 @@ - + @@ -359,7 +359,7 @@

      Chapter 26: A Debugger

      We have finally come to another of the questions that sparked this book: -how does a debugger work? +how does a debugger work? Debuggers are as much a part of good programmers’ lives as version control but are taught far less often (in part, we believe, because it’s harder to create homework questions for them). @@ -444,7 +444,7 @@

      Section 26.1: One Step at a Time

      in which case it loops around and waits for something else.

    • -

      The user asks to disassemble the current instruction +

      The user asks to disassemble the current instruction or show the contents of memory, in which case it does that and loops around.

    • @@ -465,7 +465,7 @@

      Section 26.1: One Step at a Time

      The method that disassembles an instruction to show us what we’re about to do -checks a reverse lookup table +checks a reverse lookup table to create a printable representation of an instruction and its operands:

      def disassemble(self, addr, instruction):
      @@ -733,11 +733,11 @@ 

      Section 26.4: Breakpoints

      We would have to be pretty desperate to single-step through all of that even once, much less dozens of times as we’re exploring new ideas or trying new fixes. Instead, -we want to set a breakpoint +we want to set a breakpoint to tell the computer to stop at a particular location and drop us into the debugger. (We might even use -a conditional breakpoint +a conditional breakpoint that would only stop if (for example) the variable x was zero at that point, but we’ll leave that for the exercises.)

      @@ -758,7 +758,7 @@

      Section 26.4: Breakpoints

      we replace the instruction at that address with a breakpoint instruction and store the original instruction in a lookup table. If the user later -clears +clears the breakpoint, we copy the original instruction back into place, and if the VM encounters a breakpoint instruction while it’s running, @@ -923,7 +923,7 @@

      Conditional Breakpoints

      or explore ways of using eval to support the general case.)

      Watchpoints

      Modify the debugger and VM so that the user can create -watchpoints, +watchpoints, i.e., can specify that the debugger should halt the VM when the value at a particular address changes. diff --git a/docs/debugger/slides/index.html b/docs/debugger/slides/index.html index b828b099c..e5c4d4254 100644 --- a/docs/debugger/slides/index.html +++ b/docs/debugger/slides/index.html @@ -4,7 +4,7 @@ - + @@ -62,7 +62,7 @@

      A Debugger

      - We will want non-interactive input and output for testing -- So refactor the virtual machine of Chapter 25 +- So refactor the virtual machine of Chapter 25 - Pass an output stream (by default `sys.stdout`) @@ -110,7 +110,7 @@

      A Debugger

      - New one has a third state: single-stepping -- So define an enumeration +- So define an enumeration ```py class VMState(Enum): @@ -152,7 +152,7 @@

      A Debugger

      1. Empty line: go around again -2. Disassemble current instruction or show memory: +2. Disassemble current instruction or show memory: do that and go around again 3. Quit: @@ -365,9 +365,9 @@

      A Debugger

      ## Stop Here -- A breakpoint tells the computer to stop at a particular instruction +- A breakpoint tells the computer to stop at a particular instruction - - A conditional breakpoint stops if a condition is true + - A conditional breakpoint stops if a condition is true --- diff --git a/docs/dup/index.html b/docs/dup/index.html index d6aa4d0ec..7a24b2d90 100644 --- a/docs/dup/index.html +++ b/docs/dup/index.html @@ -4,7 +4,7 @@ - + @@ -403,7 +403,7 @@

      Section 3.1: Getting Started

      return left_bytes == right_bytes
      -

      Notice that the files are opened in binary mode +

      Notice that the files are opened in binary mode using "rb" instead of the usual "r". As we’ll see in Chapter 17, this tells Python to read the bytes as they are @@ -483,8 +483,8 @@

      Section 3.1: Getting Started

      \( N \) objects can be paired in \( N(N-1)/2 \) ways, so for large \( N \) the work is proportional to \( N^2 \). A computer scientist would say that -the time complexity of our algorithm is \( O(N^2) \), -which is pronounced “big-oh of N squared”. +the time complexity of our algorithm is \( O(N^2) \), +which is pronounced “big-oh of N squared”. If the number of files doubles, the running time roughly quadruples, which means that the time per file increases as the number of files increases. @@ -507,8 +507,8 @@
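The pair count is easy to check by brute force. This sketch is illustrative only; the helper name `count_pairs` is an assumption and not part of the chapter's code:

```python
from itertools import combinations


def count_pairs(num_files):
    """Count the distinct pairs that a compare-everything search examines."""
    return sum(1 for _ in combinations(range(num_files), 2))


# Compare the brute-force count with the closed form N(N-1)/2.
for n in (6, 12, 24):
    print(n, count_pairs(n), n * (n - 1) // 2)
```

Going from 12 files to 24 raises the number of comparisons from 66 to 276, which is a little more than four times as many.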

      Section 3.2: Hashing Files

      Figure 3.2: Grouping by hash code reduces comparisons from 15 to 4.
      -

      We can construct IDs for files using a hash function -to produce a hash code. +

      We can construct IDs for files using a hash function +to produce a hash code. Since bytes are just numbers, we can create a very simple hash function by adding up the bytes in a file and taking the remainder modulo some number:
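A minimal sketch of such a function, assuming the file's contents have already been read as bytes. The name `naive_hash` and the choice of 13 buckets are illustrative assumptions:

```python
def naive_hash(data, num_buckets=13):
    """Add up the byte values and take the remainder modulo num_buckets."""
    # Iterating over a bytes object yields integers in the range 0-255.
    return sum(data) % num_buckets


print(naive_hash(b"abc"))
print(naive_hash(b"cba"))  # same bytes in a different order, same hash code
```

Because addition ignores order, any rearrangement of the same bytes lands in the same bucket, which is one reason this function is too weak for real use.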

      @@ -608,7 +608,7 @@

      Section 3.3: Better Hashing

      so we can’t possibly do better than this, but how can we ensure that each unique file winds up in its own group?

      The answer is to use a -cryptographic hash function. +cryptographic hash function. The output is completely deterministic: given the same bytes in the same order, it will always produce the same output. @@ -621,10 +621,10 @@

      Section 3.3: Better Hashing

      Cryptographic hash functions are hard to write—or rather, it’s very hard to prove that a particular algorithm has the properties we require. We will therefore use a function from Python’s hashing library -that implements the SHA-256 algorithm. +that implements the SHA-256 algorithm. Given some bytes as input, this function produces a 256-bit hash, -which is normally written as a 64-character hexadecimal string. +which is normally written as a 64-character hexadecimal string. This uses the letters A-F (or a-f) to represent the digits from 10 to 15, so that (for example) 3D5 is \((3×16^2)+(13×16^1)+(5×16^0)\), or 981 in decimal:
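As a quick check of the claims above, this sketch uses the standard library's `hashlib` module; the input bytes `b"hello"` are an arbitrary example:

```python
import hashlib

# Hash a few bytes and get the result as a hexadecimal string.
digest = hashlib.sha256(b"hello").hexdigest()
print(digest)
print(len(digest))     # 256 bits at 4 bits per hexadecimal digit

# Convert the hexadecimal example from the text to decimal.
print(int("3D5", 16))
```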

      @@ -658,7 +658,7 @@

      The Birthday Problem

      there’s a 50% chance of two people sharing a birthday in a group of just 23 people, and a 99.9% chance with 70 people.

      The same math can tell us how many files we need to hash -before there’s a 50% chance of a collision with a 256-bit hash. +before there’s a 50% chance of a collision with a 256-bit hash. According to Wikipedia, the answer is approximately \( 4{\times}10^{38} \) files. We’re willing to take that risk…
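These figures can be reproduced with a short calculation. The sketch below assumes values are uniformly and independently distributed and uses the standard approximation \( \sqrt{2 \ln 2 \cdot 2^{b}} \) for a \( b \)-bit hash; the function names are illustrative:

```python
import math


def prob_shared(num_items, num_slots):
    """Chance that at least two of num_items fall in the same slot."""
    prob_all_distinct = 1.0
    for i in range(num_items):
        prob_all_distinct *= (num_slots - i) / num_slots
    return 1.0 - prob_all_distinct


def items_for_half(num_bits):
    """Approximate item count giving a 50% chance of a collision."""
    return math.sqrt(2 * math.log(2)) * 2.0 ** (num_bits / 2)


print(prob_shared(23, 365))
print(prob_shared(70, 365))
print(items_for_half(256))
```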

      @@ -754,7 +754,7 @@

      Odds of Collision

      there’s only a 75% chance of any collision occurring. What are the actual odds?

      Streaming I/O

      -

      A streaming API delivers data one piece at a time +

      A streaming API delivers data one piece at a time rather than all at once. Read the documentation for the update method of hashing objects in Python’s hashing library diff --git a/docs/dup/slides/index.html b/docs/dup/slides/index.html index f956c9a69..74c90d882 100644 --- a/docs/dup/slides/index.html +++ b/docs/dup/slides/index.html @@ -4,7 +4,7 @@ - + @@ -101,7 +101,7 @@

      Finding Duplicate Files

      - But images, audio clips, etc. aren't character data -- So use open(filename, "rb") to open in binary mode +- So use open(filename, "rb") to open in binary mode - Look at the difference in more detail in Chapter 17 @@ -178,9 +178,9 @@

      Finding Duplicate Files

      - So for very large \\( N \\), work is proportional to \\( N^2 \\) -- Computer scientist would say "time complexity is \\( O(N^2) \\)" +- Computer scientist would say "time complexity is \\( O(N^2) \\)" - - Pronounced "big-oh of N squared" + - Pronounced "big-oh of N squared" - In practice, this means that the time per file increases as the number of files increases @@ -192,7 +192,7 @@

      Finding Duplicate Files

      - Process each file once to produce a short identifier -- I.e., use a hash function to produce a hash code +- I.e., use a hash function to produce a hash code - Only compare files with the same identifier @@ -311,7 +311,7 @@

      Finding Duplicate Files

      ## But We Can Do Better -- Use a cryptographic hash function +- Use a cryptographic hash function - Output is completely deterministic @@ -334,7 +334,7 @@

      Finding Duplicate Files

      - There's a 50% chance of two people sharing a birthday in a group of 23 people and a 99.9% chance with 70 people -- How many files do we need to hash before there's a 50% chance of a collision +- How many files do we need to hash before there's a 50% chance of a collision with a 256-bit hash? - Answer is "approximately \\( 4{\times}10^{38} \\) files" @@ -366,7 +366,7 @@

      Finding Duplicate Files

      ``` -- `hexdigest` gives hexadecimal representation of 256-bit hash code +- `hexdigest` gives hexadecimal representation of 256-bit hash code --- diff --git a/docs/finale/index.html b/docs/finale/index.html index 86495ce2a..cc0c7b6cf 100644 --- a/docs/finale/index.html +++ b/docs/finale/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/finale/slides/index.html b/docs/finale/slides/index.html index e7601891c..1d6b9dcc5 100644 --- a/docs/finale/slides/index.html +++ b/docs/finale/slides/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/ftp/index.html b/docs/ftp/index.html index 37dd6b03a..77b64333b 100644 --- a/docs/ftp/index.html +++ b/docs/ftp/index.html @@ -4,7 +4,7 @@ - + @@ -366,16 +366,16 @@

      Chapter 21: Transferring Files

      Section 21.1: Using TCP/IP

      Pretty much every program on the web runs on a family of communication standards called -Internet Protocol (IP). +Internet Protocol (IP). The one that concerns us is the -Transmission Control Protocol (TCP/IP), +Transmission Control Protocol (TCP/IP), which makes communication between computers look like reading and writing files.

      -

      Programs using IP communicate through sockets +

      Programs using IP communicate through sockets (Figure 21.1). Each socket is one end of a point-to-point communication channel, just like a phone is one end of a phone call. -A socket consists of an IP address that identifies a particular machine -and a port on that machine.

      +A socket consists of an IP address that identifies a particular machine +and a port on that machine.

      Sockets, IP addresses, and DNS
      Figure 21.1: How sockets, IP addresses, and DNS work together.
      @@ -383,7 +383,7 @@

      Section 21.1: Using TCP/IP

      The IP address consists of four 8-bit numbers, which are usually written as 93.184.216.34; -the Domain Name System (DNS) +the Domain Name System (DNS) matches these numbers to symbolic names like example.com that are easier for human beings to remember.
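To see the four 8-bit numbers directly, we can use the standard library's `ipaddress` module. This sketch only unpacks the address from the text; it does not contact the network or perform a DNS lookup:

```python
import ipaddress

address = ipaddress.IPv4Address("93.184.216.34")

# The address is stored as four bytes, one per dotted number.
print(list(address.packed))

# The same four bytes read as a single 32-bit integer.
print(int(address))
```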

      A port is a number in the range 0-65535 @@ -394,8 +394,8 @@

      Section 21.1: Using TCP/IP

      custom applications should use the remaining ports (and should allow users to decide which port, since there’s always the chance that two different people will pick 1234 or 6789).

      -

      Most web applications consist of clients -and servers. +

      Most web applications consist of clients +and servers. A client program initiates communication by sending a message and waiting for a response; a server, on the other hand, @@ -525,7 +525,7 @@

      Section 21.2: Chunking

      with CHUNK_SIZE set to 1024. If the client sends more than a kilobyte of data, our server will ignore it. -This can result in deadlock because +This can result in deadlock because the server is trying to send its reply while the client is still trying to send the rest of the message. Increasing the size of the memory buffer @@ -735,7 +735,7 @@

      Section 21.3: Testing

      assert handler.request._sent == [bytes(f"{len(msg)}", "utf-8")]
      -

      The key to our approach is the notion of fidelity: +

      The key to our approach is the notion of fidelity: how close is what we test to what we use in production? In an ideal world they are exactly the same, but in cases like this it makes sense to sacrifice a little fidelity for testability’s sake.

      diff --git a/docs/ftp/slides/index.html b/docs/ftp/slides/index.html index 220a580c9..80d0ae4cd 100644 --- a/docs/ftp/slides/index.html +++ b/docs/ftp/slides/index.html @@ -4,7 +4,7 @@ - + @@ -56,11 +56,11 @@

      Transferring Files

      ## TCP/IP -- Most networked computers use Internet Protocol (IP) +- Most networked computers use Internet Protocol (IP) - Defines multiple layers on top of each other -- Transmission Control Protocol (TCP/IP) +- Transmission Control Protocol (TCP/IP) makes communication between computers look like reading and writing files @@ -68,13 +68,13 @@

      Transferring Files

      ## Sockets -- A socket is one end of a point-to-point communication channel +- A socket is one end of a point-to-point communication channel -- IP address identifies machine +- IP address identifies machine - Typically written as four 8-bit numbers like `93.184.216.34` -- port identifies a specific connection on that machine +- port identifies a specific connection on that machine - A number in the range 0–65535 @@ -92,17 +92,17 @@

      Transferring Files

      - And might actually identify a set of machines -- Domain Name System (DNS) translates names like `third-bit.com` +- Domain Name System (DNS) translates names like `third-bit.com` into numerical identifiers --- ## Clients and Servers -- A client sends requests and processes responses +- A client sends requests and processes responses (e.g., web browser) -- A server waits for requests and replies to them +- A server waits for requests and replies to them (e.g., a web server) --- @@ -197,7 +197,7 @@

      Transferring Files

      - What if client sends more data than that? -- Allocating a larger buffer just delays the problem +- Allocating a larger buffer just delays the problem - Better solution: keep reading until there is no more data @@ -284,7 +284,7 @@

      Transferring Files

      - Run the client - Shut down the server -- Better: use a mock object +- Better: use a mock object instead of a real network connection --- @@ -383,7 +383,7 @@

      Transferring Files

      ``` -- Trade fidelity for ease of use +- Trade fidelity for ease of use --- diff --git a/docs/func/index.html b/docs/func/index.html index b7c380403..ad4c5fa77 100644 --- a/docs/func/index.html +++ b/docs/func/index.html @@ -4,7 +4,7 @@ - + @@ -365,7 +365,7 @@

      Chapter 8: Functions and Closures

      One way to evaluate the design of a piece of software is -to ask how extensible it is, +to ask how extensible it is, i.e., how easily we can add or change things [Wilson2022a]. The answer for the interpreter of Chapter 7 is, “Pretty easily,” @@ -406,10 +406,10 @@

      Section 8.1: Definition and Storage

      Anonymity

      -

      A function without a name is called an anonymous function. +

      A function without a name is called an anonymous function. JavaScript makes heavy use of anonymous functions; Python supports a very limited version of them -using lambda expressions:

      +using lambda expressions:

      double = lambda x: 2 * x
       double(3)
      @@ -429,7 +429,7 @@ 

      Section 8.2: Calling Functions

      we need to implement scope so that parameters and variables used in a function don’t overwrite those defined outside it—in other words, -to prevent name collision. +to prevent name collision. When a function is called with one or more expressions as arguments, we will:

        @@ -460,8 +460,8 @@

        Section 8.2: Calling Functions

        Eager and Lazy

        We said above that we have to evaluate a function’s arguments when we call it, -which is called eager evaluation. -We could instead use lazy evaluation, +which is called eager evaluation. +We could instead use lazy evaluation, in which case we would pass the argument sub-lists into the function and let the function evaluate them when it needed their values. Python and most other languages use the former strategy, @@ -472,16 +472,16 @@

        Eager and Lazy

        To make this work, the environment must be a list of dictionaries instead of a single dictionary. -This list is the call stack of our program, -and each dictionary in it is usually called a stack frame. +This list is the call stack of our program, +and each dictionary in it is usually called a stack frame. When a function wants the value associated with a name, we look through the list from the most recent dictionary to the oldest.
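The lookup rule can be sketched in a few lines. This is an illustration under simplifying assumptions, not the interpreter's actual code; the function name `lookup` and the variables in the example are hypothetical:

```python
def lookup(stack, name):
    """Search frames from the most recent to the oldest for name."""
    for frame in reversed(stack):
        if name in frame:
            return frame[name]
    raise KeyError(f"unknown variable {name}")


stack = [{"x": 1, "y": 2}]   # outermost frame
stack.append({"x": 10})      # frame pushed when a function is called
inner_x = lookup(stack, "x") # found in the newest frame
inner_y = lookup(stack, "y") # not in the newest frame, so found in the older one
stack.pop()                  # frame discarded when the function returns
outer_x = lookup(stack, "x") # the original value is visible again
```

Pushing a fresh dictionary for each call is what keeps a function's parameters from overwriting variables of the same name in its caller.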

        Scoping Rules

      Searching through all active stack frames for a variable -is called dynamic scoping. +is called dynamic scoping. In contrast, -most programming languages use lexical scoping, +most programming languages use lexical scoping, which figures out what a variable name refers to based on the structure of the program text.

        The completed implementation of function definition is:

        @@ -543,7 +543,7 @@

        Scoping Rules

        Once again, Python and other languages work exactly as shown here. The interpreter -(or the CPU, if we’re running code compiled to machine instructions) +(or the CPU, if we’re running code compiled to machine instructions) reads an instruction, figures out what operation it corresponds to, and executes that operation.

        @@ -567,9 +567,9 @@

        Section 8.3: Closures

        hidden thing is example
         
      -

      The inner function captures +

      The inner function captures the variables in the enclosing function -to create a closure. +to create a closure. Doing this is a way to make data private: once make_hidden returns _inner and we assign it to m in the example above, nothing else in our program can access diff --git a/docs/func/slides/index.html b/docs/func/slides/index.html index b1473f0c9..d4cc459a8 100644 --- a/docs/func/slides/index.html +++ b/docs/func/slides/index.html @@ -4,7 +4,7 @@ - + @@ -94,12 +94,12 @@

      Functions and Closures

      ## Anonymous Functions -- An anonymous function +- An anonymous function is one that doesn't have a name - JavaScript and other languages use them frequently -- Python supports limited lambda expressions +- Python supports limited lambda expressions ```py double = lambda x: 2 * x @@ -134,10 +134,10 @@

      Functions and Closures

      ## Eager and Lazy -- Eager evaluation: +- Eager evaluation: arguments are evaluated *before* call -- Lazy evaluation: +- Lazy evaluation: pass expression sub-lists into the function to be evaluated on demand - Gives the called function a chance to inspect or modify expressions @@ -157,9 +157,9 @@

      Functions and Closures

      a variable with the same name in its caller - Use a list of dictionaries to implement a - call stack + call stack -- Each dictionary called a stack frame +- Each dictionary called a stack frame - Look down the stack to find the name @@ -241,7 +241,7 @@

      Functions and Closures

      ## Dynamic Scoping - Searching active stack for a variable is called - dynamic scoping + dynamic scoping - Have to trace execution to figure out what a variable might refer to @@ -267,7 +267,7 @@

      Functions and Closures

      ## Lexical Scoping -- Almost all languages use lexical scoping +- Almost all languages use lexical scoping - Decide what a name refers to based on the structure of the program @@ -299,7 +299,7 @@

      Functions and Closures

      ``` -- The inner function captures +- The inner function captures the variables in the enclosing function - A way to make data private diff --git a/docs/glob/index.html b/docs/glob/index.html index 365a369e0..2bafdc127 100644 --- a/docs/glob/index.html +++ b/docs/glob/index.html @@ -4,7 +4,7 @@ - + @@ -369,7 +369,7 @@

      Chapter 4: Matching Patterns

      Early versions of Unix had a tool called glob to do this. The name was short for “global”, and older programmers (like this author) -still use the word globbing +still use the word globbing to mean “matching filenames against a pattern”. The Python standard library includes a module called glob to match filenames in the same way. @@ -386,7 +386,7 @@

      Chapter 4: Matching Patterns

      Section 4.1: Simple Patterns

      Globbing patterns are simpler than -the regular expressions +the regular expressions used to scrape data from text files. Our matcher will handle only the cases shown in Table 4.1.

      @@ -468,7 +468,7 @@

      Section 4.1: Simple Patterns

      We will therefore create objects to do matching rather than using bare functions.

      Our design uses -the Chain of Responsibility pattern. +the Chain of Responsibility pattern. Each object matches if it can, then delegates the rest of the match to the next object in the chain (Figure 4.2).
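A simplified sketch of the idea, assuming only two kinds of matcher. The class names `Lit` and `Any`, the `rest` attribute, and the `match(text, start)` signature are assumptions made for illustration and may differ from the chapter's own classes:

```python
class Lit:
    """Match a fixed string, then hand the remaining text to the next matcher."""

    def __init__(self, chars, rest=None):
        self.chars = chars
        self.rest = rest

    def match(self, text, start=0):
        end = start + len(self.chars)
        if text[start:end] != self.chars:
            return False
        if self.rest is None:
            # Last link in the chain: succeed only if all the text was used.
            return end == len(text)
        return self.rest.match(text, end)


class Any:
    """Match zero or more characters, like '*' in a glob pattern."""

    def __init__(self, rest=None):
        self.rest = rest

    def match(self, text, start=0):
        if self.rest is None:
            return True
        # Try the shortest match first, delegating the remainder each time.
        for i in range(start, len(text) + 1):
            if self.rest.match(text, i):
                return True
        return False


pattern = Lit("a", Any(Lit(".txt")))  # roughly equivalent to a*.txt
print(pattern.match("abc.txt"))
print(pattern.match("abc.csv"))
```

Each object knows only how to do its own part of the match and who comes next, so adding a new kind of matcher does not require changing the existing ones.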

      @@ -540,7 +540,7 @@

      Section 4.1: Simple Patterns

      Test-Driven Development

      Some programmers write the tests for a piece of code before writing the code itself. -This practice is called test-driven development, +This practice is called test-driven development, and its advocates claim that it produces better code in less time because (a) writing tests helps people think about what the code should do before they’re committed to a particular implementation @@ -660,18 +660,18 @@

      Test-Driven Development

      Our Either matcher doesn’t handle rest properly. We can try to patch it using our current design, -but we’ve accumulated a bit of technical debt +but we’ve accumulated a bit of technical debt that we should clear up.

      Section 4.3: Rethinking

      We now have three matchers with the same interfaces. Before we do any further work, -we will refactor -using Extract Parent Class +we will refactor +using Extract Parent Class to eliminate duplicated code (Figure 4.3). Similarly, the test if self.rest is None appears several times. We can simplify this by creating a class to represent “nothing here”, -which is known as the Null Object pattern.

      +which is known as the Null Object pattern.

      Refactoring matchers
      Figure 4.3: Using the Extract Parent Class refactoring.
      @@ -695,7 +695,7 @@

      We Didn’t Invent This

      return result == len(text)
      -

      It assumes every child class has a _match method +

      It assumes every child class has a _match method that returns the location from which searching is to continue rather than just True or False. Match.match therefore checks that we’ve reached the end of the text.

      @@ -774,14 +774,14 @@

      We Didn’t Invent This

      Looping over the left and right alternative saves us from repeating code or introducing -a helper method. +a helper method. It also simplifies the handling of more than two options, which we explore in the exercises.

      Crucially, none of the existing tests change because none of the matching classes’ constructors changed -and the signature of the match method -(which they now inherit from the generic Match class) +and the signature of the match method +(which they now inherit from the generic Match class) stayed the same as well. We should (should) add a couple of tests for Null, but basically we have now met our original goal, @@ -847,11 +847,11 @@

      Returning Matches

      when *.txt matches name.txt, the library should return some indication that * matched the string "name".

      Alternative Matching

      -

      The tool we have built implements lazy matching, +

      The tool we have built implements lazy matching, i.e., the * character matches the shortest string it can that results in the overall pattern matching. -Modify the code to do greedy matching instead, +Modify the code to do greedy matching instead, and combine it with the solution to the previous exercise for testing.

      diff --git a/docs/glob/slides/index.html b/docs/glob/slides/index.html index 284258749..58edcd593 100644 --- a/docs/glob/slides/index.html +++ b/docs/glob/slides/index.html @@ -4,7 +4,7 @@ - + @@ -97,7 +97,7 @@

      Matching Patterns

      ## Design Patterns -- Use the Chain of Responsibility pattern +- Use the Chain of Responsibility pattern - Each object matches if it can… @@ -315,10 +315,10 @@

      Matching Patterns

      ## Rethinking - We now have three matchers with the same interfaces - - Refactor using - Extract Parent Class + - Refactor using + Extract Parent Class - The test `if self.rest is None` appears several times - - Use the Null Object pattern instead + - Use the Null Object pattern instead
      Refactoring matchers diff --git a/docs/glossary/index.html b/docs/glossary/index.html index 9a8afe73d..91bb946a5 100644 --- a/docs/glossary/index.html +++ b/docs/glossary/index.html @@ -4,7 +4,7 @@ - + @@ -349,7 +349,7 @@

      Appendix H: Glossary

      abstract base class
      An abstract class from which the class in question is derived.
      abstract class
      -
      A class that defines or requires methods it does not implement. An abstract class typically specifies the methods that child classes must have without providing default implementations.
      +
      A class that defines or requires methods it does not implement. An abstract class typically specifies the methods that child classes must have without providing default implementations.
      See also: concrete class.
      abstract method
      In object-oriented programming, a method that is defined but not implemented. Programmers will define an abstract method in a parent class to specify operations that child classes must provide.
      abstract syntax tree (AST)
      @@ -369,15 +369,15 @@

      Appendix H: Glossary

      ANSI character encoding
      An extension of ASCII that standardized the characters represented by the codes 128 to 255.
      Application Binary Interface (ABI)
      -
      The low-level layout that a piece of software must have to work on a particular kind of machine.
      +
      The low-level layout that a piece of software must have to work on a particular kind of machine.
      See also: Application Programming Interface.
      Application Programming Interface (API)
      -
      A set of functions provided by a software library or web service that other software can call.
      +
      A set of functions provided by a software library or web service that other software can call.
      See also: Application Binary Interface.
      argument
      -
      A value passed into a function or method call.
      +
      A value passed into a function or method call.
      See also: parameter.
      ASCII character encoding
      -
      A standard way to represent the characters commonly used in the Western European languages as 7-bit integers, now largely superseded by Unicode.
      +
      A standard way to represent the characters commonly used in the Western European languages as 7-bit integers, now largely superseded by Unicode.
      See also: ANSI character encoding.
      assembler
      -
      A compiler that translates software written in assembly code into machine instructions.
      +
      A compiler that translates software written in assembly code into machine instructions.
      See also: disassembler.
      assembly code
      A low-level programming language whose statements correspond closely to the actual instruction set of a particular kind of processor.
      assertion
      @@ -393,15 +393,15 @@

      Appendix H: Glossary

      backward-compatible
      A property of a system that enables interoperability with an older legacy system, or with input designed for such a system.
      base class
      -
      In object-oriented programming, a class from which other classes are derived.
      +
      In object-oriented programming, a class from which other classes are derived.
      See also: child class, derived class, parent class.
      batch processing
      Executing a set of non-interactive tasks on a computer, such as backing up files or copying data from one database to another overnight.
      benchmark
      A program or set of programs used to measure the performance of a computer system.
      big endian
      -
      A storage scheme in which the most significant part of a number is stored in the byte with the lowest address. For example, the 16-bit big-endian representation of 258 stores 0x01 in the lower byte and 0x02 in the higher byte.
      +
      A storage scheme in which the most significant part of a number is stored in the byte with the lowest address. For example, the 16-bit big-endian representation of 258 stores 0x01 in the lower byte and 0x02 in the higher byte.
      See also: little endian.
      big-oh notation
      -
      A way to express how the running time or memory requirements of an algorithm increase as the size of the problem increases.
      +
      A way to express how the running time or memory requirements of an algorithm increase as the size of the problem increases.
      See also: space complexity, time complexity.
      binary mode
      An option for reading or writing files in which each byte is transferred literally. The term is used in contrast with text mode.
      bit mask
      @@ -455,7 +455,7 @@

      Appendix H: Glossary

      class
      In object-oriented programming, a structure that combines data and operations (called methods). The program then uses a constructor to create an object with those properties and methods. Programmers generally put generic or reusable behavior in parent classes, and more detailed or specific behavior in child classes.
      class method
      -
      A function defined inside a class that takes the class object as an input rather than an instance of the class.
      +
      A function defined inside a class that takes the class object as an input rather than an instance of the class.
      See also: static method.
      clear (a breakpoint)
      To remove a breakpoint from a program.
      client
      @@ -471,7 +471,7 @@

      Appendix H: Glossary

      collision (in hashing)
      A situation in which two or more values have the same hash code.
      column-wise storage
      -
      To organize the memory of a two-dimensional table so that the values in each column are laid out in contiguous blocks.
      +
      To organize the memory of a two-dimensional table so that the values in each column are laid out in contiguous blocks.
      See also: row-wise storage.
      combinatorial explosion
      The exponential growth in the size of a problem or the time required to solve it that arises when all possible combinations of a set of items must be searched.
      Command pattern
      @@ -529,19 +529,19 @@

      Appendix H: Glossary

      defensive programming
      A set of programming practices that assumes mistakes will happen and either reports or corrects them, such as inserting assertions to report situations that are not ever supposed to occur.
      delayed construction
      -
      The practice of constructing an object after something that needs it has been constructed rather than before.
      +
      The practice of constructing an object after something that needs it has been constructed rather than before.
      See also: lazy evaluation.
      dependency (in build)
      Something that a build target depends on.
      derived class
      -
      In object-oriented programming, a class that is a direct or indirect extension of a base class.
      +
      In object-oriented programming, a class that is a direct or indirect extension of a base class.
      See also: child class.
      design by contract
      -
      A style of designing software in which functions specify the pre-conditions that must be true in order for them to run and the post-conditions they guarantee will be true when they return. A function can then be replaced by one with weaker pre-conditions (i.e., it accepts a wider set of input) and/or stronger post-conditions (i.e., it produces a smaller range of output) without breaking anything else.
      +
      A style of designing software in which functions specify the pre-conditions that must be true in order for them to run and the post-conditions they guarantee will be true when they return. A function can then be replaced by one with weaker pre-conditions (i.e., it accepts a wider set of input) and/or stronger post-conditions (i.e., it produces a smaller range of output) without breaking anything else.
      See also: Liskov Substitution Principle.
      design pattern
      -
      A recurring pattern in software design that is specific enough to be worth naming, but not so specific that a single best implementation can be provided by a library.
      +
      A recurring pattern in software design that is specific enough to be worth naming, but not so specific that a single best implementation can be provided by a library.
      See also: Iterator pattern, Singleton pattern, Template Method pattern, Visitor pattern.
      dictionary
      A data structure that allows items to be looked up by key rather than by position. Dictionaries are often implemented using hash tables.
      dictionary comprehension
      -
      A single expression that constructs a dictionary by looping over key-value pairs.
      +
      A single expression that constructs a dictionary by looping over key-value pairs.
      See also: list comprehension.
      directed acyclic graph (DAG)
      A directed graph which does not contain any cycles (i.e., it is not possible to reach a node from itself by following edges).
      directed graph
      @@ -549,7 +549,7 @@

      Appendix H: Glossary

      disassemble
      To convert machine instructions into assembly code or some higher-level language.
      disassembler
      -
      A program that translates machine instructions into assembly code or some higher-level language.
      +
      A program that translates machine instructions into assembly code or some higher-level language.
      See also: assembler.
      docstring
      A string at the start of a module, class, or function in Python that is not assigned to a variable, which is used to hold the documentation for that part of code.
      DOM (DOM)
      @@ -567,9 +567,9 @@

      Appendix H: Glossary

      dynamic scoping
      To find the value of a variable by looking at what is on the call stack at the moment the lookup is done. Almost all programming languages use lexical scoping instead, since it is more predictable.
      dynamic typing
      -
      A system in which types are checked as the program is running.
      +
      A system in which types are checked as the program is running.
      See also: static typing, type hint.
      eager evaluation
      -
      Evaluating expressions before they are used.
      +
      Evaluating expressions before they are used.
      See also: lazy evaluation.
      easy mode
      A term borrowed from gaming meaning to do something with obstacles or difficulties simplified or removed, often for practice purposes.
      edge
      @@ -593,9 +593,9 @@

      Appendix H: Glossary

      expected result (of test)
      The value that a piece of software is supposed to produce when tested in a certain way, or the state in which it is supposed to leave the system.
      exponent
      -
      The portion of a floating-point number that controls placement of the decimal point.
      +
      The portion of a floating-point number that controls placement of the decimal point.
      See also: mantissa.
      expression
      -
      A part of a program that produces a value, such as 1+2.
      +
      A part of a program that produces a value, such as 1+2.
      See also: statement.
      extensibility
      How easily new features can be added to a program or existing features can be changed.
      Extract Parent Class refactoring
      @@ -605,11 +605,11 @@

      Appendix H: Glossary

      failure (result of test)
      A test fails if the actual result does not match the expected result.
      false negative
      -
      A report that something is missing when it is actually present.
      +
      A report that something is missing when it is actually present.
      See also: false positive.
      false positive
      -
      A report that something is present when it is actually absent.
      +
      A report that something is present when it is actually absent.
      See also: false negative.
      falsy
      -
      Refers to a value that is treated as false in Boolean expressions. In Python, this includes empty strings and lists and the number zero.
      +
      Refers to a value that is treated as false in Boolean expressions. In Python, this includes empty strings and lists and the number zero.
      See also: truthy.
      field
      A component of a record containing a single value. Every record in a database table has the same fields.
      file locking
      @@ -621,11 +621,11 @@

      Appendix H: Glossary

      generic function
      A collection of functions with similar purpose, each operating on a different class of data.
      globbing
      -
      Matching filenames against patterns. The name comes from an early Unix utility called glob (short for “global”). Glob patterns are a subset of regular expressions.
      +
      Matching filenames against patterns. The name comes from an early Unix utility called glob (short for “global”). Glob patterns are a subset of regular expressions.
      See also: regular expression.
      graph (data structure)
      -
      A data structure in which nodes are connected to one another by edges.
      +
      A data structure in which nodes are connected to one another by edges.
      See also: tree.
      greedy matching
      -
      Matching as much as possible while still finding a valid match.
      +
      Matching as much as possible while still finding a valid match.
      See also: lazy matching.
      hash code
      A value generated by a hash function. Good hash codes have the same properties as random numbers in order to reduce the frequency of collisions.
      hash function
      @@ -643,15 +643,15 @@

      Appendix H: Glossary

      helper method
      A method designed to be used only by other methods in the same class. Helper methods are usually created to keep other methods short and readable.
      heterogeneous
      -
      Containing mixed data types. For example, an array in JavaScript can contain a mix of numbers, character strings, and values of other types.
      +
      Containing mixed data types. For example, an array in JavaScript can contain a mix of numbers, character strings, and values of other types.
      See also: homogeneous.
      hexadecimal
      A base-16 numerical representation that uses the letters A-F (or a-f) to represent the values from 10 to 15.
      homogeneous
      -
      Containing a single data type. For example, a vector must be homogeneous: its values must all be numeric, logical, etc.
      +
      Containing a single data type. For example, a vector must be homogeneous: its values must all be numeric, logical, etc.
      See also: heterogeneous.
      hostname
      The human-readable name for a networked computer, such as example.com.
      HTML (HyperText Markup Language)
      -
      The standard markup language used for web pages. HTML is represented in memory using DOM (Document Object Model).
      +
      The standard markup language used for web pages. HTML is represented in memory using DOM (Document Object Model).
      See also: XML.
      HTTP (HyperText Transfer Protocol)
      The protocol used to exchange information between browsers and websites, and more generally between other clients and servers. Communication consists of requests and responses.
      HTTP method
      @@ -669,9 +669,9 @@

      Appendix H: Glossary

      index (a database)
      An auxiliary data structure in a database used to speed up search for some entries. An index increases memory and disk requirements but reduces search time.
      infix notation
      -
      Writing expressions with operators between operands, as in 1 + 2 to add 1 and 2.
      +
      Writing expressions with operators between operands, as in 1 + 2 to add 1 and 2.
      See also: prefix notation, postfix notation.
      inheritance
      -
      The act of creating a new class from an existing class, typically by adding or changing its properties or methods.
      +
      The act of creating a new class from an existing class, typically by adding or changing its properties or methods.
      See also: multiple inheritance.
      instance
      An object of a particular class.
      instruction pointer
      @@ -695,7 +695,7 @@

      Appendix H: Glossary

      iterator
      A function or object that produces each value from a collection in turn for processing.
      Iterator pattern
      -
      A design pattern that uses iterators to hide the differences between different kinds of data structures so that everything can be processed using loops.
      +
      A design pattern that uses iterators to hide the differences between different kinds of data structures so that everything can be processed using loops.
      See also: Visitor pattern.
      join (tables)
      An operation that combines two tables, typically by matching keys from one with keys from another.
      JSON (JavaScript Object Notation)
      @@ -711,9 +711,9 @@

      Appendix H: Glossary

      layout engine
      A piece of software that decides where to place text, images, and other elements on a page.
      lazy evaluation
      -
      Evaluating expressions only when absolutely necessary.
      +
      Evaluating expressions only when absolutely necessary.
      See also: eager evaluation.
      lazy matching
      -
      Matching as little as possible while still finding a valid match.
      +
      Matching as little as possible while still finding a valid match.
      See also: greedy matching.
      lexical scoping
      To look up the value associated with a name according to the textual structure of a program. Most programming languages use lexical scoping instead of dynamic scoping because the latter is less predictable.
      library
      @@ -731,7 +731,7 @@

      Appendix H: Glossary

      literal (in parsing)
      A representation of a fixed value in a program, such as the digits 123 for the number 123 or the characters "abc" for the string containing those three letters.
      little endian
      -
      A storage scheme in which the most significant part of a number is stored in the byte with the highest address. For example, the 16-bit little-endian representation of 258 stores 0x02 in the lower byte and 0x01 in the higher byte.
      +
      A storage scheme in which the most significant part of a number is stored in the byte with the highest address. For example, the 16-bit little-endian representation of 258 stores 0x02 in the lower byte and 0x01 in the higher byte.
      See also: big endian.
      log file
      A file to which a program writes status or debugging information for later analysis.
      log-structured database
      @@ -739,11 +739,11 @@

      Appendix H: Glossary

      manifest
      A list of something’s parts or components.
      mantissa
      -
      The portion of a floating-point number that defines its specific value.
      +
      The portion of a floating-point number that defines its specific value.
      See also: exponent.
      Markdown
      A markup language with a simple syntax intended as a replacement for HTML.
      markup language
      -
      A set of rules for annotating text to define its meaning or how it should be displayed. The markup is usually not displayed, but instead controls how the underlying text is interpreted or shown. Markdown and HTML are widely-used markup languages for web pages.
      +
      A set of rules for annotating text to define its meaning or how it should be displayed. The markup is usually not displayed, but instead controls how the underlying text is interpreted or shown. Markdown and HTML are widely-used markup languages for web pages.
      See also: XML.
      metadata
      Data about data, such as the time a dataset was archived.
      method
      @@ -793,7 +793,7 @@

      Appendix H: Glossary

      page
      A fixed-size block of storage space. Most modern filesystems manage disks using 4K pages, and many other applications such as databases use the same page size to maximize efficiency.
      parameter
      -
      The name that a function gives to one of the values passed to it when it is called.
      +
      The name that a function gives to one of the values passed to it when it is called.
      See also: argument.
      parameter sweeping
      To execute a program multiple times with different parameters to find out how its behavior or performance depends on those parameters.
      parent (in a tree)
      @@ -819,19 +819,19 @@

      Appendix H: Glossary

      port
      A logical endpoint for communication, like a phone number in an office building. Only one program on a computer may use a particular port on that computer at any time.
      post-condition
      -
      Something that is guaranteed to be true after a function runs successfully. Post-conditions are often expressed as assertions that are guaranteed to be true of a function’s results.
      +
      Something that is guaranteed to be true after a function runs successfully. Post-conditions are often expressed as assertions that are guaranteed to be true of a function’s results.
      See also: design by contract, pre-condition.
      postfix notation
      -
      Writing expressions with the operator after the operand, as in 2 3 + to add 2 and 3.
      +
      Writing expressions with the operator after the operand, as in 2 3 + to add 2 and 3.
      See also: infix notation, prefix notation.
      pre-condition
      -
      Something that must be true before a function runs in order for it to work correctly. Pre-conditions are often expressed as assertions that must be true of a function’s inputs in order for it to run successfully.
      +
      Something that must be true before a function runs in order for it to work correctly. Pre-conditions are often expressed as assertions that must be true of a function’s inputs in order for it to run successfully.
      See also: design by contract, post-condition.
      prefix notation
      -
      Writing expressions with the operator in front of the operand, as in + 3 4 to add 3 and 4.
      +
      Writing expressions with the operator in front of the operand, as in + 3 4 to add 3 and 4.
      See also: infix notation, postfix notation.
      prerequisite
      -
      Something that a build target depends on.
      +
      Something that a build target depends on.
      See also: dependency (in build).
      profiler
      A tool that measures one or more aspects of a program’s performance.
      profiling
      -
      The act of measuring where a program spends its time, which operations consume memory or disk space, etc.
      +
      The act of measuring where a program spends its time, which operations consume memory or disk space, etc.
      See also: profiler.
      promise
      A placeholder representing the eventual result of an asynchronous computation.
      protocol
      @@ -843,7 +843,7 @@

      Appendix H: Glossary

      race condition
      A situation in which a result depends on the order in which two or more concurrent operations are carried out.
      raise (an exception)
      -
      To signal that something unexpected or unusual has happened in a program by creating an exception and handing it to the error-handling system, which then tries to find a point in the program that will catch it.
      +
      To signal that something unexpected or unusual has happened in a program by creating an exception and handing it to the error-handling system, which then tries to find a point in the program that will catch it.
      See also: throw exception.
      record
      A group of related values that are stored together. A record may be represented as a tuple or as a row in a table; in the latter case, every record in the table has the same fields.
      recursion
      @@ -859,7 +859,7 @@

      Appendix H: Glossary

      regular expression
      A pattern for matching text, written as text itself. Regular expressions are sometimes called “regexp”, “regex”, or “RE”, and are powerful tools for working with text.
      relational database
      -
      A database that organizes information into tables, each of which has a fixed set of named fields (shown as columns) and a variable number of records (shown as rows).
      +
      A database that organizes information into tables, each of which has a fixed set of named fields (shown as columns) and a variable number of records (shown as rows).
      See also: SQL.
      relative error
      The absolute value of the difference between the actual and correct value divided by the correct value. For example, if the actual value is 9 and the correct value is 10, the relative error is 0.1. Relative error is usually more useful than absolute error.
      reverse lookup
      @@ -867,9 +867,11 @@

      Appendix H: Glossary

      root (in a tree)
      The node in a tree of which all other nodes are direct or indirect children, or equivalently the only node in the tree that has no parent.
      row-wise storage
      -
      To organize the memory of a two-dimensional table so that the values in each row are laid out in contiguous blocks.
      +
      To organize the memory of a two-dimensional table so that the values in each row are laid out in contiguous blocks.
      See also: column-wise storage.
      runtime
      A program that implements the basic operations used in a programming language.
      +
      sandbox
      +
      A space where code can execute safely.
      schema
      A specification of the format of a dataset, including the name, format, and content of each table.
      scope
      @@ -887,19 +889,19 @@

      Appendix H: Glossary

      SHA-256 (hash function)
      A cryptographic hash function that produces a 256-bit output.
      sign and magnitude
      -
      A binary representation of integers in which one bit indicates whether the value is positive or negative and the remaining bits indicate its magnitude.
      +
      A binary representation of integers in which one bit indicates whether the value is positive or negative and the remaining bits indicate its magnitude.
      See also: two’s complement.
      signature
      The ordered list of parameters and return values that specifies how a function must be called and what it returns.
      single stepping
      To step through a program one line or instruction at a time.
      singleton
      -
      A set with only one element, or a class with only one instance.
      +
      A set with only one element, or a class with only one instance.
      See also: Singleton pattern.
      Singleton pattern
      A design pattern that creates a singleton object to manage some resource or service, such as a database or cache. In object-oriented programming, the pattern is usually implemented by hiding the constructor of the class in some way so that it can only be called once.
      socket
      A communication channel between two computers that provides an interface similar to reading and writing files.
      space complexity
      -
      The way the memory required by an algorithm grows as a function of the problem size, usually expressed using big-oh notation.
      +
      The way the memory required by an algorithm grows as a function of the problem size, usually expressed using big-oh notation.
      See also: time complexity.
      spread
      To automatically match the values from a list or dictionary supplied by the caller to the parameters of a function.
      SQL
      @@ -911,19 +913,19 @@

      Appendix H: Glossary

      stale (in build)
      To be out-of-date compared to a prerequisite. A build manager finds and updates things that are stale.
      standard error
      -
      A predefined communication channel typically used to report errors.
      +
      A predefined communication channel typically used to report errors.
      See also: standard input, standard output.
      standard input
      -
      A predefined communication channel typically used to read input from the keyboard or from the previous process in a pipe.
      +
      A predefined communication channel typically used to read input from the keyboard or from the previous process in a pipe.
      See also: standard error, standard output.
      standard output
      -
      A predefined communication channel typically used to send output to the screen or to the next process in a pipe.
      +
      A predefined communication channel typically used to send output to the screen or to the next process in a pipe.
      See also: standard error, standard input.
      statement
      -
      A part of a program that doesn’t produce a value. for loops and if statements are statements in Python.
      +
      A part of a program that doesn’t produce a value. for loops and if statements are statements in Python.
      See also: expression.
      static method
      -
      A function that is defined within a class but does not require either the class itself or an instance of the class as a parameter.
      +
      A function that is defined within a class but does not require either the class itself or an instance of the class as a parameter.
      See also: class method.
      static site generator
      A software tool that creates HTML pages from templates and content.
      static typing
      -
      A system in which the types of values are checked as code is being compiled.
      +
      A system in which the types of values are checked as code is being compiled.
      See also: dynamic typing, type hint.
      stream
      A sequence of bytes or other data of variable length that can only be processed in sequential order.
      streaming API
      @@ -955,7 +957,7 @@

      Appendix H: Glossary

      throw low, catch high
      A widely-used pattern for managing exceptions whereby they are raised in many places at low levels of a program but caught in a few high-level places where corrective action can be taken.
      time complexity
      -
      The way the running time of an algorithm grows as a function of the problem size, usually expressed using big-oh notation.
      +
      The way the running time of an algorithm grows as a function of the problem size, usually expressed using big-oh notation.
      See also: space complexity.
      time of check - time of use
      A race condition in which a process checks the state of something and then operates on it, but some other process might alter that state between the check and the operation.
      timestamp
      @@ -963,7 +965,7 @@

      Appendix H: Glossary

      token
      An indivisible unit of text for a parser, such as a variable name or a number. Exactly what constitutes a token depends on the language.
      top-down design
      -
      In software design, the practice of writing the more abstract or higher-level parts of the program first, then filling in the details layer by layer. In practice, programmers almost always modify the upper levels as they work on the lower levels, but high-level changes become less common as more of the details are filled in.
      +
      In software design, the practice of writing the more abstract or higher-level parts of the program first, then filling in the details layer by layer. In practice, programmers almost always modify the upper levels as they work on the lower levels, but high-level changes become less common as more of the details are filled in.
      See also: successive refinement.
      topological order
      Any ordering of the nodes in a graph that respects the direction of its edges, i.e., if there is an edge from node A to node B, A comes before B in the ordering. There may be many topological orderings of a particular graph.
      Transmission Control Protocol (TCP/IP)
      @@ -971,13 +973,13 @@

      Appendix H: Glossary

      tree
      A graph in which every node except the root has exactly one parent.
      truthy
      -
      Refers to a value that is treated as true in Boolean expressions. In Python, this includes non-empty strings and lists and numbers other than zero.
      +
      Refers to a value that is treated as true in Boolean expressions. In Python, this includes non-empty strings and lists and numbers other than zero.
      See also: falsy.
      tuple
      A value that has a fixed number of parts, such as the three color components of a red-green-blue color specification.
      two hard problems in computer science
      Refers to a quote by Phil Karlton: “There are only two hard problems in computer science—cache invalidation and naming things.” Many variations add a third problem as a joke, such as off-by-one errors.
      two’s complement
      -
      A binary representation of integers that “rolls over” like an odometer to represent negative values.
      +
      A binary representation of integers that “rolls over” like an odometer to represent negative values.
      See also: sign and magnitude.
      type hint
      Extra information added to a program to indicate what data type or types a variable is supposed to have. Type hints are a compromise between static typing and dynamic typing.
      Unicode
      @@ -1005,7 +1007,7 @@

      Appendix H: Glossary

      virtual machine
      A program that pretends to be a computer. This may seem a bit redundant, but VMs are quick to create and start up, and changes made inside the virtual machine are contained within that VM so we can install new packages or run a completely different operating system without affecting the underlying computer.
      Visitor pattern
      -
      A design pattern in which the operation to be done is taken to each element of a data structure in turn. It is usually implemented by having a generator “visitor” that knows how to reach the structure’s elements, which is given a function or method to call for each in turn, and that carries out the specific operation.
      +
      A design pattern in which the operation to be done is taken to each element of a data structure in turn. It is usually implemented by having a generator “visitor” that knows how to reach the structure’s elements, which is given a function or method to call for each in turn, and that carries out the specific operation.
      See also: Iterator pattern.
      watchpoint
      A location or variable being monitored by a debugger. If the value at that location or in that variable changes, the debugger halts and gives the user a chance to inspect the program.
      word (of memory)
      diff --git a/docs/http/index.html b/docs/http/index.html
      index ac20a341f..eab8d8f7c 100644
      --- a/docs/http/index.html
      +++ b/docs/http/index.html
      @@ -4,7 +4,7 @@
      -
      +
      @@ -367,12 +367,12 @@

      Chapter 22: Serving Web Pages

      Instead, we would like to use a single standardized protocol in a variety of ways.

      The Hypertext Transfer Protocol,
      -more commonly called HTTP,
      +more commonly called HTTP,
      specifies one way programs can exchange data over IP.
      HTTP is deliberately simple:
      -the client sends a request
      +the client sends a request
      specifying what it wants over a socket connection,
      -and the server sends a response containing some data.
      +and the server sends a response containing some data.
      A server can construct responses however it wants;
      it can copy a file from disk,
      generate HTML dynamically,
      @@ -380,9 +380,9 @@

      Chapter 22: Serving Web Pages

      An HTTP request is just text:
      any program that wants to can create one or parse one.
      An absolutely minimal HTTP request has just
      -a method,
      -a URL,
      -and a protocol version
      +a method,
      +a URL,
      +and a protocol version
      on a single line separated by spaces:

      GET /index.html HTTP/1.1
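
As a rough illustration of how little structure that single line has, the sketch below splits it into its three parts. The function name `parse_request_line` is hypothetical and is not taken from the chapter's server code.

```python
def parse_request_line(line):
    """Split a request line such as 'GET /index.html HTTP/1.1' into its parts."""
    # The method, URL, and protocol version are separated by single spaces.
    method, url, version = line.strip().split(" ")
    return {"method": method, "url": url, "version": version}


print(parse_request_line("GET /index.html HTTP/1.1"))
```

A line that does not have exactly three space-separated fields makes the unpacking fail with `ValueError`, which is a reasonable signal that the request is malformed.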
      @@ -398,7 +398,7 @@ 

      Chapter 22: Serving Web Pages

      The HTTP version is usually “HTTP/1.0” or “HTTP/1.1”; the differences between the two don’t matter to us.

      Most real requests have a few extra lines called
      -headers,
      +headers,
      which are key-value pairs like the three shown below:

      GET /index.html HTTP/1.1
      @@ -412,7 +412,7 @@ 

      Chapter 22: Serving Web Pages

      so that (for example) a request can specify that it’s willing to accept several types of content.
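
One way to picture this is a parser that keeps a list of values for each key instead of a single value. This is only a sketch: it assumes, as the sentence above suggests, that a key may appear more than once, and both the helper name `parse_headers` and the example header values are made up for illustration.

```python
def parse_headers(lines):
    """Collect 'Key: value' header lines, keeping every value for repeated keys."""
    headers = {}
    for line in lines:
        key, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them before storing.
        headers.setdefault(key.strip().lower(), []).append(value.strip())
    return headers


# Hypothetical header lines, with one key repeated.
example = [
    "Accept: text/html",
    "Accept: text/plain",
    "Accept-Language: en",
]
print(parse_headers(example))
```

Storing lists rather than single strings means a repeated key does not overwrite the earlier value, which a plain dictionary assignment would do.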

      Finally,
      -the body of the request is any extra data associated with it,
      +the body of the request is any extra data associated with it,
      such as form data or uploaded files.
      There must be a blank line between the last header and the start of the body
      to signal the end of the headers,
      @@ -421,7 +421,7 @@

      Chapter 22: Serving Web Pages

      that tells the server how many bytes to read.
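
The relationship between the blank line, the body, and the byte count can be sketched as follows. The function `build_request` is a hypothetical helper rather than anything from the chapter, and a real HTTP/1.1 request would carry more headers (such as `Host`) than this minimal version does.

```python
def build_request(method, url, body):
    """Assemble a request as bytes, with Content-Length describing the body."""
    payload = body.encode("utf-8")
    lines = [
        f"{method} {url} HTTP/1.1",
        f"Content-Length: {len(payload)}",  # length in bytes, not characters
        "",  # blank line that marks the end of the headers
    ]
    # HTTP separates lines with a carriage return followed by a newline.
    head = "\r\n".join(lines) + "\r\n"
    return head.encode("utf-8") + payload


request = build_request("POST", "/form", "name=café")
print(request)
```

The length is computed after encoding because a character such as `é` occupies more than one byte in UTF-8, and the receiver counts bytes.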

      An HTTP response is formatted like an HTTP request.
      Its first line has the protocol,
      -a status code
      +a status code
      like 200 for “OK” or 404 for “Not Found”,
      and a status phrase (e.g., the word “OK”).
      There are then some headers,
      @@ -589,7 +589,7 @@

      Section 22.2: Serving Files

      We first turn the path in the URL into a local file path.
      -(We assume that all paths are resolved
      +(We assume that all paths are resolved
      relative to the directory that the server is running in.)
      If that path corresponds to a file, we send it back to the client.
      If nothing is there,
      @@ -599,7 +599,7 @@

      Section 22.2: Serving Files

      but the latter has an advantage:
      we can handle errors that occur inside methods we’re calling (like handle_file)
      in the same place and in the same way as we handle errors that occur here.
      -This approach is sometimes called throw low, catch high,
      +This approach is sometimes called throw low, catch high,
      which means that errors should be flagged in many places
      but handled in a few places high up in the code.
      The method that handles files is an example of this:
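
The chapter's own listing is not part of this excerpt. Separately from it, the general shape of throw low, catch high might look like the sketch below, in which every name (`ServerException`, `read_file`, `handle_request`) is hypothetical and chosen only for illustration.

```python
from pathlib import Path


class ServerException(Exception):
    """Hypothetical exception that carries an HTTP status code."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


def read_file(path):
    # Low level: flag the problem, but do not decide how to report it.
    if not path.is_file():
        raise ServerException(404, f"{path} not found")
    return path.read_bytes()


def handle_request(url_path, root=Path(".")):
    # High level: one place turns any flagged error into a response.
    # This sketch does not guard against paths that escape the root directory.
    try:
        content = read_file(root / url_path.lstrip("/"))
        return 200, content
    except ServerException as error:
        return error.status, str(error).encode("utf-8")
```

Because `read_file` only raises, other low-level helpers can report their own failures the same way, and `handle_request` remains the single place that decides what the client sees.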

      @@ -810,7 +810,7 @@

      Parsing HTTP Requests

      protocol version, and headers.

      Query Parameters

      -

      A URL can contain query parameters. +

      A URL can contain query parameters. Read the documentation for the urlparse module and then modify the file server example so that a URL containing a query parameter bytes=N diff --git a/docs/http/slides/index.html b/docs/http/slides/index.html index fc740b2c9..4f0ad3e01 100644 --- a/docs/http/slides/index.html +++ b/docs/http/slides/index.html @@ -4,7 +4,7 @@ - + @@ -49,7 +49,7 @@

      Serving Web Pages

      - Uploading and downloading files (Chapter 21) is useful, but we want to do more -- Don't want to create a new protocol for every interaction +- Don't want to create a new protocol for every interaction - Use a standard protocol in a variety of ways @@ -57,13 +57,13 @@

      Serving Web Pages

      ## HTTP -- Hypertext Transfer Protocol (HTTP) specifies +- Hypertext Transfer Protocol (HTTP) specifies what kinds of messages clients and servers can exchange and how those messages are formatted -- Client sends a request as text over a socket connection +- Client sends a request as text over a socket connection -- Server replies with a response (also text) +- Server replies with a response (also text) - Requests and responses may carry (non-textual) data with them @@ -73,11 +73,11 @@

      Serving Web Pages

      ## HTTP Requests -- A method such as `GET` or `POST` +- A method such as `GET` or `POST` -- A URL +- A URL -- A protocol version +- A protocol version ```txt GET /index.html HTTP/1.1 @@ -88,7 +88,7 @@

      Serving Web Pages

      ## Headers -- Requests may also have headers +- Requests may also have headers ```txt GET /index.html HTTP/1.1 @@ -106,7 +106,7 @@

      Serving Web Pages

      - Protocol and version -- A status code and phrase +- A status code and phrase - Headers, possibly including `Content-Length` (in bytes) @@ -203,7 +203,7 @@

      Hello, World!

      - Browser shows page -- Shell shows log messages +- Shell shows log messages ``` 127.0.0.1 - - [16/Sep/2022 06:34:59] "GET / HTTP/1.1" 200 - @@ -238,7 +238,7 @@

      Hello, World!

      - Translate path in URL into path to local file -- Resolve paths relative to server's directory +- Resolve paths relative to server's directory ```py def handle_file(self, given_path, full_path): @@ -276,20 +276,20 @@

      Error accessing {path}: {msg}

      - Use `try`/`except` to handle errors in called methods -- Throw low, catch high +- Throw low, catch high --- ## Problems -- Client can escape from our sandbox +- Client can escape from our sandbox by asking for `http://localhost:8080/../../passwords.txt` - `send_content` always says it is returning HTML with `Content-Type` - Should use things like `image/png` for images -- But we got character encoding right +- But we got character encoding right --- @@ -317,7 +317,7 @@

      Error accessing {path}: {msg}

      ## Combining Code -- Use multiple inheritance +- Use multiple inheritance
      Testing class hierarchy diff --git a/docs/index.html b/docs/index.html index 55ba4cc76..4c1f63a46 100644 --- a/docs/index.html +++ b/docs/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/interp/index.html b/docs/interp/index.html index e95007cf0..1c08252f5 100644 --- a/docs/interp/index.html +++ b/docs/interp/index.html @@ -4,7 +4,7 @@ - + @@ -365,14 +365,14 @@

      Chapter 7: An Interpreter

      Chapter 2 introduced the idea that functions, objects, and classes are just data, while Chapter 6 showed how Python itself manages them. Similarly, -the compilers and interpreters +the compilers and interpreters that make programs run are just programs themselves. Instead of changing the characters in a block of memory like text editors, or calculating sums and averages like spreadsheets, compilers turn text into instructions for interpreters or hardware to run.

      Most real programming languages have two parts: a parser that translates the source code into a data structure in memory, -and a runtime that executes the instructions in that data structure. +and a runtime that executes the instructions in that data structure. Chapter 5 explored parsing; this chapter will build a runtime for a very simple interpreter, while Chapter 25 will look at compiling code for more efficient execution.

      @@ -405,9 +405,9 @@

      Section 7.2: Expressions

      Notation

      -

      We use infix notation like 1+2 for historical reasons +

      We use infix notation like 1+2 for historical reasons in everyday life, -but our programs use prefix notation—i.e., +but our programs use prefix notation—i.e., they always put the operations’ names first—to make the operations easier to find. Similarly, we have special symbols for addition, subtraction, and so on for historical reasons, @@ -447,8 +447,8 @@

      Notation

      Arguments vs. Parameters

      -

      Many programmers use the words argument -and parameter interchangeably, +

      Many programmers use the words argument +and parameter interchangeably, but to make our meaning clear, we call the values passed into a function its arguments and the names the function uses to refer to them as its parameters. @@ -514,7 +514,7 @@

      Arguments vs. Parameters

      Our program is a list of lists (of lists…) -so we can read it as JSON using json.load +so we can read it as JSON using json.load rather than writing our own parser. If our program file contains:

      @@ -545,7 +545,7 @@

      Section 7.3: Variables

      that let us give names to values. We can add them to our interpreter by passing around a dictionary containing all the variables seen so far. -Such a dictionary is sometimes called an environment +Such a dictionary is sometimes called an environment because it is the setting in which expressions are evaluated; the dictionaries returned by the globals and locals functions introduced in Chapter 6 are both environments.
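A minimal sketch of how an environment might be passed through evaluation, in the spirit of the `do_` functions used in this chapter. The operation names `"add"`, `"get"`, and `"set"` and the exact signatures are assumptions for illustration; the chapter's own code may differ.

```python
def do_add(env, args):
    # Evaluate both operands in the same environment, then add them.
    assert len(args) == 2
    return do(env, args[0]) + do(env, args[1])


def do_get(env, args):
    # Look up a variable's value in the environment.
    assert len(args) == 1
    assert args[0] in env, f"Unknown variable {args[0]}"
    return env[args[0]]


def do_set(env, args):
    # Evaluate the right-hand side and store it under the given name.
    assert len(args) == 2
    value = do(env, args[1])
    env[args[0]] = value
    return value


# Table of (assumed) operation names and the functions that implement them.
OPERATIONS = {"add": do_add, "get": do_get, "set": do_set}


def do(env, expr):
    # Integers evaluate to themselves; lists are [operation, arg, ...].
    if isinstance(expr, int):
        return expr
    assert isinstance(expr, list) and expr[0] in OPERATIONS
    return OPERATIONS[expr[0]](env, expr[1:])


env = {}
do(env, ["set", "x", 3])
result = do(env, ["add", ["get", "x"], 4])
```

Because every function receives the same dictionary, a value stored by one expression is visible to every expression evaluated after it.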

      @@ -587,7 +587,7 @@

      Section 7.3: Variables

      and then use them in calculations. To handle this, we add a function do_seq that runs a sequence of expressions one by one. -This function is our first piece of control flow: +This function is our first piece of control flow: rather than calculating a value itself, it controls when and how other expressions are evaluated. Its implementation is:

      @@ -616,8 +616,8 @@

      Section 7.3: Variables

      Everything Is An Expression

      -

      Python distinguishes expressions that produce values -from statements that don’t. +

      Python distinguishes expressions that produce values +from statements that don’t. But it doesn’t have to, and many languages don’t. For example, Python could have been designed to allow this:

      @@ -672,7 +672,7 @@

      Section 7.4: Introspection Again

      Line by line:

      1. -

        We use a dictionary comprehension +

        We use a dictionary comprehension to create a dictionary in a single statement.

      2. @@ -798,7 +798,7 @@

        For Loops

        Your implementation should allow users to specify a loop variable so that they know which iteration of the loop they’re in.

        Internal Checks

        -

        Defensive programming is an approach to software development +

        Defensive programming is an approach to software development that starts from the assumption that people make mistakes and should therefore put checks in their code to catch “impossible” situations. These checks are typically implemented as assert statements @@ -809,7 +809,7 @@

        Internal Checks

        What other assertions could we add to this code?

      3. -

        How many of these checks can be implemented as type hints instead?

        +

        How many of these checks can be implemented as type hints instead?

      diff --git a/docs/interp/slides/index.html b/docs/interp/slides/index.html index 89d9f189d..2886f3d5a 100644 --- a/docs/interp/slides/index.html +++ b/docs/interp/slides/index.html @@ -4,7 +4,7 @@ - + @@ -51,7 +51,7 @@

      An Interpreter

      - Compiler: generate instructions once in advance - Interpreter: generate instructions on the fly - Differences are increasingly blurry in practice -- Most have a parser and a runtime +- Most have a parser and a runtime - Look at the latter in this lesson to see how programs actually run --- @@ -69,7 +69,7 @@

      An Interpreter

      -- -- We use special infix notation like `1+2` for historical reasons +- We use special infix notation like `1+2` for historical reasons - Always putting the operation first makes processing easier --- @@ -108,7 +108,7 @@

      An Interpreter

      ## Dispatching Operations -- Write a function that dispatches to actual operations +- Write a function that dispatches to actual operations ```py def do(expr): @@ -160,7 +160,7 @@

      An Interpreter

      - Store variables in a dictionary that's passed to every `do_` function - Like the dictionary returned by the `globals` function - - An environment + - An environment ```py def do_abs(env, args): @@ -219,8 +219,8 @@

      An Interpreter

      ## Everything Is An Expression -- Python distinguishes expressions that produce values - from statements that don't +- Python distinguishes expressions that produce values + from statements that don't - But it doesn't have to, and many languages don't ```python @@ -319,7 +319,7 @@

      An Interpreter

      ## How Good Is Our Design? -- One way to evaluate a design is to ask how extensible it is +- One way to evaluate a design is to ask how extensible it is - The answer for the interpreter is "pretty easily" - The answer for our little language is "not at all" - We need a way to define and call functions of our own diff --git a/docs/intro/index.html b/docs/intro/index.html index 961e1eb2f..1b60d6cb7 100644 --- a/docs/intro/index.html +++ b/docs/intro/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/intro/slides/index.html b/docs/intro/slides/index.html index ae263326b..5485e424b 100644 --- a/docs/intro/slides/index.html +++ b/docs/intro/slides/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/intro/syllabus_linear.svg b/docs/intro/syllabus_linear.svg index ff852a0ce..cdb11b1ff 100644 --- a/docs/intro/syllabus_linear.svg +++ b/docs/intro/syllabus_linear.svg @@ -76,7 +76,7 @@ Archiver - + dup->archive @@ -99,7 +99,7 @@ - + glob->archive @@ -226,7 +226,7 @@ A Database - + mock->db @@ -280,7 +280,7 @@ Page Layout - + check->layout @@ -291,7 +291,7 @@ - + template->layout @@ -336,7 +336,7 @@ - + persist->db diff --git a/docs/intro/syllabus_regular.svg b/docs/intro/syllabus_regular.svg index 8439ec938..1db86b96d 100644 --- a/docs/intro/syllabus_regular.svg +++ b/docs/intro/syllabus_regular.svg @@ -59,7 +59,7 @@ Archiver - + dup->archive @@ -84,7 +84,7 @@ - + glob->archive @@ -186,7 +186,7 @@ A Database - + mock->db @@ -230,13 +230,13 @@ Page Layout - + check->layout - + template->layout @@ -261,7 +261,7 @@ - + persist->db diff --git a/docs/layout/index.html b/docs/layout/index.html index 9dfa42cf6..0f05cdb8c 100644 --- a/docs/layout/index.html +++ b/docs/layout/index.html @@ -4,7 +4,7 @@ - + @@ -364,7 +364,7 @@

      Chapter 14: Page Layout

      as an e-book (which is basically the same thing), or on the printed page. In all three cases -a layout engine took some text and some layout instructions +a layout engine took some text and some layout instructions and decided where to put each character and image. To explore how they work, we will build a small layout engine @@ -393,10 +393,10 @@

      Upside Down

      Section 14.2: Sizing

      -

      Let’s start on easy mode +

      Let’s start on easy mode without margins, padding, line-wrapping, or other complications. Everything we can put on the screen is represented as a rectangular cell, -and every cell is either a row, a column, or a block. +and every cell is either a row, a column, or a block. A block has a fixed width and height:

      class Block:
      @@ -683,7 +683,7 @@ 

      Section 14.4: Rendering

      child blocks will overwrite the markings made by their parents, which will automatically produce the right appearance (Figure 14.4). -(A more sophisticated version of this called z-buffering +(A more sophisticated version of this called z-buffering keeps track of the visual depth of each pixel in order to draw things in three dimensions.)

      @@ -722,7 +722,7 @@

      Section 14.4: Rendering

      and give the new class a render method with the same signature. Since Python supports multiple inheritance, -we can do this with a mixin class +we can do this with a mixin class (Figure 14.5):

      from placed import PlacedBlock, PlacedCol, PlacedRow
      @@ -789,7 +789,7 @@ 

      (Not) The Right Way to Do It

      is a sign that we should do more testing. It would be very easy for us to get a wrong result and convince ourselves that it was actually correct; -confirmation bias of this kind +confirmation bias of this kind is very common in software development.

      Section 14.5: Wrapping

      One of the biggest differences between a browser and a printed page @@ -922,10 +922,10 @@

      Section 14.5: Wrapping

      Note that we could have had columns handle resizing rather than rows, but we (probably) don’t need to make both resizeable. -This is an example of intrinsic complexity: +This is an example of intrinsic complexity: the problem really is this hard, so something, somewhere, has to deal with it. -(Programs often contain accidental complexity +(Programs often contain accidental complexity as well, which can be fixed if people are willing to accept that it is unnecessary and are willing to change. @@ -934,7 +934,7 @@

      Section 14.5: Wrapping

      The Liskov Substitution Principle

      We are able to re-use tests as our code evolved -because of the Liskov Substitution Principle, +because of the Liskov Substitution Principle, which states that it should be possible to replace objects in a program with objects of derived classes diff --git a/docs/layout/slides/index.html b/docs/layout/slides/index.html index d5f2d7dcd..4fc7deacd 100644 --- a/docs/layout/slides/index.html +++ b/docs/layout/slides/index.html @@ -4,7 +4,7 @@ - + @@ -50,7 +50,7 @@

      Page Layout

      - How can we put the right things in the right places? -- Create a simple version of a layout engine for a browser +- Create a simple version of a layout engine for a browser - But the same ideas apply to print @@ -220,7 +220,7 @@

      Page Layout

      - Draw parents before children so that children over-draw -- A simple form of z-buffering +- A simple form of z-buffering
      Children drawing over their parents diff --git a/docs/license/index.html b/docs/license/index.html index a7c4efd0b..16b4690d9 100644 --- a/docs/license/index.html +++ b/docs/license/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/lint/index.html b/docs/lint/index.html index b61dbb15e..599a8c068 100644 --- a/docs/lint/index.html +++ b/docs/lint/index.html @@ -4,7 +4,7 @@ - + @@ -367,7 +367,7 @@

      Chapter 13: A Code Linter

      that classes and functions have consistent names, that modules are imported in a consistent order, and dozens of other things.

      -

      Checking tools are often called linters, +

      Checking tools are often called linters, because an early tool like this that found fluff in C programs was called lint, and the name stuck. Many projects insist that code pass checks like these @@ -619,7 +619,7 @@

      As Far as We Can Go

      We could try adding logic to handle this, but one of the fundamental theorems of computer science is that it’s impossible to create a program that can predict the output of arbitrary other programs. -Our linter can therefore produce false negatives, +Our linter can therefore produce false negatives, i.e., tell us there aren’t problems when there actually are.

      @@ -742,7 +742,7 @@

      Section 13.4: Extension

      if several people each create their own NodeVisitor with a visit_Name method, we’d have to inherit from all those classes and then have the new class’s visit_Name call up to all of its parents’ equivalent methods.

      -

      One way around this is to inject methods into classes +

      One way around this is to inject methods into classes after they have been defined. The code fragment below creates a new class called BlankNodeVisitor that doesn’t add anything to NodeVisitor, diff --git a/docs/lint/slides/index.html b/docs/lint/slides/index.html index 8284b1ae5..ffb3f9aab 100644 --- a/docs/lint/slides/index.html +++ b/docs/lint/slides/index.html @@ -4,7 +4,7 @@ - + @@ -50,7 +50,7 @@

      A Code Linter

      - And doesn't do things that are likely to be bugs -- Build a linter +- Build a linter - Checks for "fluff" in code @@ -58,9 +58,9 @@

      A Code Linter

      ## Programs as Trees -- Chapter 11 represented HTML as a DOM tree +- Chapter 11 represented HTML as a DOM tree -- We can represent code as an abstract syntax tree (AST) +- We can represent code as an abstract syntax tree (AST) - Each node represents a syntactic element in the program @@ -121,7 +121,7 @@

      A Code Linter

      so we would have to write a recursive function for each type of node - [`ast`][py_ast] module's `ast.NodeVisitor` implements - the Visitor pattern + the Visitor pattern - Each time it reaches a node of type `Thing`, it looks for a method `visit_Thing` @@ -271,7 +271,7 @@

      A Code Linter

      - Not wrong, but clutter makes code harder to read -- Have to take scope into account +- Have to take scope into account - So keep a stack of scopes @@ -354,7 +354,7 @@

      A Code Linter

      - What if several people write `visit_FuncDef`? -- Use method injection +- Use method injection to add methods to classes after they have been defined --- @@ -387,7 +387,7 @@

      A Code Linter

      ``` -- But still have the problem of name collision +- But still have the problem of name collision --- diff --git a/docs/mock/index.html b/docs/mock/index.html index b849fb6e2..77da19e8d 100644 --- a/docs/mock/index.html +++ b/docs/mock/index.html @@ -4,7 +4,7 @@ - + @@ -363,7 +363,7 @@

      Chapter 9: Mocks, Protocols, and Decorators

      but we have reached a point where we have to learn a little more about Python in order to proceed. The same thing happened in [Wilson2022b], -which had to explain how promises work +which had to explain how promises work to make later examples approachable; as in that book, the explanations below refer back to what we’ve already seen @@ -391,7 +391,7 @@

      Section 9.1: Mock Objects

      assert elapsed(50) == 150
      -

      Temporary replacements like this are called mock objects +

      Temporary replacements like this are called mock objects because we usually use objects even if the thing we’re replacing is a function. We can do this because Python lets us create objects that can be “called” just like functions. @@ -516,7 +516,7 @@

      Section 9.2: Protocols

      but people are forgetful. It would be better if Python did this automatically; luckily for us, -it provides a protocol for exactly this purpose. +it provides a protocol for exactly this purpose. A protocol is a rule that specifies how programs can tell Python to do specific things at specific moments. Giving a class a __call__ method is an example of this: @@ -526,7 +526,7 @@

      Section 9.2: Protocols

      if a class has a method with that name, Python calls it automatically when constructing a new instance of that class.
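To make the idea of a protocol concrete, here is a small sketch of a class that takes part in two of them: `__init__`, which Python calls when an instance is constructed, and `__call__`, which Python calls when an instance is used like a function. The class name `Adder` is hypothetical and is not taken from the chapter.

```python
class Adder:
    def __init__(self, amount):
        # Python calls __init__ automatically when constructing an instance.
        self.amount = amount

    def __call__(self, value):
        # Python calls __call__ when the instance is "called" like a function.
        return value + self.amount


add_three = Adder(3)
result = add_three(10)  # equivalent to add_three.__call__(10)
```

The object remembers `amount` between calls, which is what makes callable objects a convenient stand-in for functions in tests.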

      What we want for managing mock objects is -a context manager +a context manager that replaces the real function with our mock at the start of a block of code and then puts the original back at the end. The protocol for this relies on two methods called __enter__ and __exit__. @@ -586,7 +586,7 @@

      Section 9.3: Decorators

      Python programs rely on several other protocols, each of which gives user-level code a way to interact with some aspect of the Python interpreter. -One of the most widely used is called a decorator, +One of the most widely used is called a decorator, which allows us to wrap one function with another.
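As a rough preview of what wrapping one function with another looks like, the sketch below defines a decorator that records each call before passing it on to the original function. The names `logged` and `double` are made up for this example; the chapter builds up its own version step by step.

```python
import functools


def logged(func):
    """Wrap func so that every call is recorded before the original runs."""
    calls = []

    @functools.wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        calls.append((args, kwargs))
        return func(*args, **kwargs)

    # Expose the record so that callers can inspect it later.
    wrapper.calls = calls
    return wrapper


@logged
def double(x):
    return 2 * x


result = double(21)
```

Writing `@logged` above the definition is shorthand for `double = logged(double)`: the name `double` ends up referring to the wrapper rather than to the original function.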

      In order to understand how decorators work, we must take another look at closures, @@ -754,7 +754,7 @@

      Section 9.4: Iterators

      A statement like for thing in collection assigns each item in collection to the variable thing one at a time, possibly in a predetermined order. -Python implements this using a two-part iterator protocol:

      +Python implements this using a two-part iterator protocol:

      1. If an object has an __iter__ method, diff --git a/docs/mock/slides/index.html b/docs/mock/slides/index.html index 3dee47b48..1ac40ca94 100644 --- a/docs/mock/slides/index.html +++ b/docs/mock/slides/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/oop/index.html b/docs/oop/index.html index f0de09436..0016fcd5b 100644 --- a/docs/oop/index.html +++ b/docs/oop/index.html @@ -4,7 +4,7 @@ - + @@ -388,7 +388,7 @@

        Section 2.1: Objects

        A specification like this is sometimes called -a contract +a contract because any particular shape must conform to it:

        class Square(Shape):
        @@ -427,7 +427,7 @@ 

        Section 2.1: Objects

        print(f"{n} is a {c} {p:.2f} {p:.2f}")
        -

        This is called polymorphism. +

        This is called polymorphism. It reduces cognitive load by allowing the people using a set of related things (in this case, objects) @@ -546,7 +546,7 @@

        Section 2.2: Classes

        Variable Arguments

        Like most modern programming languages, Python allows us to define functions that take a variable number of arguments, -and to call functions by spreading a list or dictionary:

        +and to call functions by spreading a list or dictionary:

        def show_args(title, *args, **kwargs):
             print(f"{title} args '{args}' and kwargs '{kwargs}'")
        @@ -662,7 +662,7 @@ 

        Section 2.3: Inheritance

        We do have one task left, though: we need to make sure that when a square or circle is made, it is made correctly. -In short, we need to implement constructors. +In short, we need to implement constructors. We do this by giving the dictionaries that implement classes a special key _new whose value is the function that builds something of that type:

        @@ -723,10 +723,10 @@

        Section 2.3: Inheritance

        Section 2.4: Summary

        We have only scratched the surface of what Python’s object system provides. -Multiple inheritance, -class methods, -static methods, -and monkey patching are all useful, +Multiple inheritance, +class methods, +static methods, +and monkey patching are all useful, but all can be understood in terms of dictionaries that contain references to properties, functions, and other dictionaries.

        @@ -756,7 +756,7 @@

        Reporting Type

        Method Caching

        Our implementation searches for the implementation of a method every time that method is called. -An alternative is to add a cache to each object +An alternative is to add a cache to each object to save the methods that have been looked up before. For example, each object could have a special key called _cache whose value is a dictionary. diff --git a/docs/oop/slides/index.html b/docs/oop/slides/index.html index 1875f1405..c0101469d 100644 --- a/docs/oop/slides/index.html +++ b/docs/oop/slides/index.html @@ -4,7 +4,7 @@ - + @@ -56,7 +56,7 @@

        Objects and Classes

        ## Representing Shapes -- Start with the contract for shapes +- Start with the contract for shapes ```py class Shape: @@ -418,13 +418,13 @@

        Objects and Classes

        ## The Other 90% -- Multiple inheritance +- Multiple inheritance -- Class methods +- Class methods -- Static methods +- Static methods -- Monkey patching +- Monkey patching [academic_prototyping]: https://www.fuzzingbook.org/html/AcademicPrototyping.html diff --git a/docs/pack/index.html b/docs/pack/index.html index cd14275bf..6fc76cd65 100644 --- a/docs/pack/index.html +++ b/docs/pack/index.html @@ -4,7 +4,7 @@ - + @@ -391,17 +391,17 @@

        Chapter 20: A Package Manager

        Michael Reim’s history of Unix packaging.

        Section 20.1: Semantic Versioning

        Most software projects use -semantic versioning +semantic versioning for software releases. Each version is three integers X.Y.Z, where X is the major version, Y is the minor version, -and Z is the patch. +and Z is the patch. (The full specification allows for more fields, but we will ignore them in this tutorial.)
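One possible way to work with three-part versions in code is to turn each one into a tuple of integers, since Python compares tuples element by element. The helper name `parse_version` and the sample versions are assumptions for this sketch, and the extra fields allowed by the full specification are deliberately not handled.

```python
def parse_version(text):
    """Convert 'X.Y.Z' into a tuple of integers (major, minor, patch)."""
    parts = text.split(".")
    assert len(parts) == 3, f"Expected X.Y.Z, got {text!r}"
    return tuple(int(p) for p in parts)


# Hypothetical versions: as plain strings, "1.10.0" would sort before "1.2.3".
versions = ["1.10.0", "1.2.3", "2.0.0"]
ordered = sorted(versions, key=parse_version)
```

Comparing tuples rather than strings is what puts `1.10.0` after `1.2.3`, because 10 is greater than 2 even though `"10"` sorts before `"2"` as text.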

        A package’s authors increment its major version number when a change to the package breaks -backward compatibility. +backward compatibility. For example, if the new version adds a required parameter to a function, then code built for the old version will fail or behave unpredictably with the new one. @@ -464,11 +464,11 @@

        Comments

        How much work is it to check all of these possibilities? Our example has \( 3×3×2=18 \) combinations. If we were to add another package to the mix with 2 versions, -the search space would double; +the search space would double; add another, and it would double again. This behavior is called -combinatorial explosion, +combinatorial explosion, and it makes brute force solutions impractical even for small problems. We will implement it as a starting point (and to give us something to test more complicated solutions against), @@ -505,7 +505,7 @@

        Reproducibility

        To generate the possibilities, we create a list of the available versions of each package, then use Python’s itertools module -to generate the cross product +to generate the cross product that contains all possible combinations of items (Figure 20.2):

        @@ -593,7 +593,7 @@

        Section 20.3: Generating Possibilities Manually

        The first half creates the same list of lists as before, where each sub-list is the available versions of a single package. -It then creates an empty accumulator +It then creates an empty accumulator to collect all the combinations and calls a recursive function called _make_possible to fill it in.

        Each call to _make_possible handles one package’s worth of work @@ -635,7 +635,7 @@

        Section 20.3: Generating Possibilities Manually

        but if there were four, we would need a quadruply-nested loop, and so on. -This Recursive Enumeration design pattern +This Recursive Enumeration design pattern uses one recursive function call per loop so that we automatically get exactly as many loops as we need.
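The sketch below shows the general shape of this pattern and checks it against `itertools.product`. The function names `enumerate_recursively` and `_fill` and the version labels are hypothetical; they stand in for the chapter's own code rather than reproduce it.

```python
import itertools


def enumerate_recursively(choices):
    """Return every combination that takes one item from each list in choices."""
    accumulator = []
    _fill(choices, [], accumulator)
    return accumulator


def _fill(remaining, current, accumulator):
    # No lists left to choose from: current is a complete combination.
    if not remaining:
        accumulator.append(tuple(current))
        return
    # Each level of recursion plays the role of one nested loop.
    for item in remaining[0]:
        _fill(remaining[1:], current + [item], accumulator)


# Three packages with 3, 3, and 2 versions, giving 3*3*2 = 18 combinations.
choices = [["A1", "A2", "A3"], ["B1", "B2", "B3"], ["C1", "C2"]]
manual = enumerate_recursively(choices)
library = list(itertools.product(*choices))
```

Both approaches produce the combinations in the same order here, so the library version can serve as a check on the hand-written one.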

        Section 20.4: Incremental Search

        @@ -770,7 +770,7 @@

        Section 20.5: Using a Theorem Prover

        and that’s exactly what we need. To start, let’s import a few things from the z3 module -and then create three Boolean variables:

        +and then create three Boolean variables:

        from z3 import Bool, Implies, Not, Solver, sat, unsat
         
        @@ -789,7 +789,7 @@ 

        Section 20.5: Using a Theorem Prover

        Instead of assigning values to A, B, and C, we can specify constraints on them, then ask z3 whether it’s possible to find a set of values, -or model, +or model, that satisfies all those constraints at once. In the example below, we’re asking whether it’s possible for A to equal B @@ -1027,7 +1027,7 @@

        Parsing Semantic Versions

        Using Scoring Functions

        Many different combinations of package versions can be mutually compatible. One way to decide which actual combination to install -is to create a scoring function +is to create a scoring function that measures how good or bad a particular combination is. For example, a function could measure the “distance” between two versions as:

        diff --git a/docs/pack/slides/index.html b/docs/pack/slides/index.html index 91580c919..55d31acbf 100644 --- a/docs/pack/slides/index.html +++ b/docs/pack/slides/index.html @@ -4,7 +4,7 @@ - + @@ -57,13 +57,13 @@

        A Package Manager

        ## Identifying Versions -- Semantic versioning uses three integers `X.Y.Z` +- Semantic versioning uses three integers `X.Y.Z` - `X` is the major version (breaking changes) - `Y` is the minor version (new features) - - `Z` is the patch (bug fixes) + - `Z` is the patch (bug fixes) - Notation @@ -119,9 +119,9 @@

        A Package Manager

        - Our example has 3×3×2=18 combinations - Adding one more package with two versions doubles - the search space + the search space -- A combinatorial explosion +- A combinatorial explosion - Brute force solutions are impractical even for small problems @@ -153,7 +153,7 @@

        A Package Manager

        - Create a list of the available versions of each package -- Generate their cross product +- Generate their cross product ```py def make_possibilities(manifest): @@ -277,7 +277,7 @@

        A Package Manager

        - Use recursion instead of nested loops because we don't know how many loops to write -- The Recursive Enumeration design pattern +- The Recursive Enumeration design pattern --- @@ -381,7 +381,7 @@

        A Package Manager

        ## Using a Theorem Prover -- An automated theorem prover can do much better +- An automated theorem prover can do much better - But the algorithms quickly become very complex @@ -422,7 +422,7 @@

        A Package Manager

        ``` -- Then ask Z3 to find a model that satisfies those constraints +- Then ask Z3 to find a model that satisfies those constraints ``` A == B & B == C: sat diff --git a/docs/parse/index.html b/docs/parse/index.html index 27d5d7dad..39c80a2cb 100644 --- a/docs/parse/index.html +++ b/docs/parse/index.html @@ -4,7 +4,7 @@ - + @@ -359,12 +359,12 @@

        Chapter 5: Parsing Text

        is a lot easier to read and write than Lit("2023-", Any(Either("pdf", "txt"))). If we want to use the former, -we need a parser +we need a parser to convert those strings to objects.

        Most parsers are written in two parts (Figure 5.1). -The first groups characters into atoms of text called “tokens”. +The first groups characters into atoms of text called “tokens”. The second assembles those tokens to create -an abstract syntax tree (AST).

        +an abstract syntax tree (AST).

        Parsing pipeline
        Figure 5.1: Stages in parsing pipeline.
        @@ -408,7 +408,7 @@

        Please Don’t Write Parsers

        so we need parsers to translate the former into the latter. However, the world doesn’t need more file formats: -please use CSV, JSON, YAML, +please use CSV, JSON, YAML, or something else that already has an acronym rather than inventing something of your own.

        @@ -420,7 +420,7 @@

        Section 5.2: Tokenizing

        This classification guides the design of our parser:

        1. -

          If it is a literal then +

          If it is a literal then combine it with the current literal (if there is one) or start a new literal (if there isn’t).

        2. @@ -544,10 +544,10 @@

          Section 5.3: Parsing

          Introspection and Dispatch

          Having a program look up a function or method inside itself while it is running -is an example of introspection. +is an example of introspection. Using this to decide what to do next rather than having a long chain of if statements -is often called dynamic dispatch, +is often called dynamic dispatch, since the code doing the lookup (in this case, the _parse method) decides who to give work to on the fly. @@ -634,7 +634,7 @@

          Introspection and Dispatch

        They’re Just Methods

        -

        Operator overloading +

        Operator overloading relies on the fact that when Python sees a == b it calls a.__eq__(b). Similarly, a + b is “just” a call to a.__add__(b), and so on, @@ -647,7 +647,7 @@

        They’re Just Methods

        The parent Match class performs the checks that all classes need to perform (in this case, that the objects being compared have the same -concrete class). +concrete class). If the child class needs to do any more checking (for example, that the characters in two Lit objects are the same) it calls up to the parent method first, diff --git a/docs/parse/slides/index.html b/docs/parse/slides/index.html index 1a6b40636..9b14537ac 100644 --- a/docs/parse/slides/index.html +++ b/docs/parse/slides/index.html @@ -4,7 +4,7 @@ - + @@ -53,9 +53,9 @@

        Parsing Text

        -- -1. Group characters into tokens +1. Group characters into tokens -2. Use tokens to create an abstract syntax tree +2. Use tokens to create an abstract syntax tree
        Parsing pipeline @@ -203,11 +203,11 @@

        Parsing Text

        ## Introspection and Dispatch -- Introspection: +- Introspection: having a program look up a function or method inside itself while it is running -- Dynamic dispatch: +- Dynamic dispatch: using introspection to decide what to do next rather than a long chain of `if` statements @@ -296,12 +296,12 @@

        Parsing Text

        ## They're Just Methods - `a == b` is "just" `a.__eq__(b)` -- Operator overloading: +- Operator overloading: if our class has methods with the right names, Python calls them to perform "built-in" operations - Parent `Match` class does shared work - E.g., make sure objects have - the same concrete class + the same concrete class - Child method (if any) does details - E.g., make sure two `Lit` objects are checking for the same text diff --git a/docs/perf/index.html b/docs/perf/index.html index 727d6c4e9..20e2968dd 100644 --- a/docs/perf/index.html +++ b/docs/perf/index.html @@ -4,7 +4,7 @@ - + @@ -364,7 +364,7 @@
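A minimal sketch of the parent/child split described in these bullets. The constructor arguments and the `chars` attribute are assumptions for illustration and the chapter's real classes may differ; only the `Match` and `Lit` names and the order of the checks come from the text.

```py
class Match:
    def __eq__(self, other):
        # Shared check: both objects must have the same concrete class.
        return type(self) is type(other)


class Any(Match):
    pass


class Lit(Match):
    def __init__(self, chars):
        # Hypothetical attribute holding the literal text to match.
        self.chars = chars

    def __eq__(self, other):
        # Upcall to the parent first, then do the child-specific check.
        return super().__eq__(other) and self.chars == other.chars


print(Lit("abc") == Lit("abc"))
print(Lit("abc") == Lit("xyz"))
print(Lit("abc") == Any())
```

Because `and` short-circuits, `other.chars` is only read once the parent has confirmed that `other` is also a `Lit`.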

        Chapter 15: Performance Profiling

        Whether we use Excel, SQL, R, or Python, we will almost certainly be using tables that have named columns and multiple rows. -Tables of this kind are called dataframes, +Tables of this kind are called dataframes, and to explore how we should implement them, this chapter builds them two different ways and then compares their performance.

        @@ -413,13 +413,13 @@

        Docstrings Are Enough

        Every method in Python needs a body, so many programmers will write pass (Python’s “do nothing” statement). However, -a docstring also counts as a body, +a docstring also counts as a body, so if we write those (which we should) there’s no need to write pass.

        For our first usable implementation, we will derive a class DfRow that uses -row-wise storage +row-wise storage (Figure 15.1). The dataframe is stored as a list of dictionaries, each of which represents a row. @@ -433,7 +433,7 @@

        Docstrings Are Enough

        Our second implementation, DfCol, -will use column-wise storage +will use column-wise storage (Figure 15.2). Each column is stored as a list of values, all of which are of the same type. @@ -597,7 +597,7 @@
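The two layouts can be compared with plain Python containers. This is a sketch only: it uses bare lists and dictionaries rather than the `DfRow` and `DfCol` classes, storing columns in a dictionary of lists is an assumption, and the helper function names are hypothetical.

```py
# Row-wise: a list of dictionaries, one per row.
rows = [
    {"a": 1, "b": 10},
    {"a": 2, "b": 20},
    {"a": 3, "b": 30},
]

# Column-wise (assumed layout): a dictionary of lists, one per column.
cols = {
    "a": [1, 2, 3],
    "b": [10, 20, 30],
}


def select_rows(rows, *names):
    # Selecting columns from row-wise storage has to touch every row.
    return [{n: r[n] for n in names} for r in rows]


def select_cols(cols, *names):
    # Selecting columns from column-wise storage just picks whole lists.
    return {n: cols[n] for n in names}


def filter_rows(rows, pred):
    # Filtering row-wise storage spreads each row into the predicate.
    return [r for r in rows if pred(**r)]


def filter_cols(cols, pred):
    # Filtering column-wise storage must rebuild each row temporarily.
    n = len(next(iter(cols.values())))
    keep = [i for i in range(n) if pred(**{name: cols[name][i] for name in cols})]
    return {name: [cols[name][i] for i in keep] for name in cols}


print(select_cols(cols, "b"))
print(filter_rows(rows, lambda a, b: a > 1))
```

Selection favors the column-wise layout and filtering favors the row-wise one, which is the trade-off the chapter goes on to measure.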

        Section 15.2: Row-Wise Storage

        Notice that the dataframe created by filter re-uses the rows of the original dataframe (Figure 15.4). -This is safe and efficient so long as columns are immutable, +This is safe and efficient so long as columns are immutable, i.e., so long as their contents are never changed in place.

        @@ -788,17 +788,17 @@

        Section 15.4: Performance

        Transactions vs. Analysis

        Regardless of data volumes, different storage schemes are better (or worse) for different kinds of work. -Online transaction processing (OLTP) +Online transaction processing (OLTP) refers to adding or querying individual records, such as online sales. -Online analytical processing (OLAP), +Online analytical processing (OLAP), on the other hand, processes selected columns of a table in bulk to do things like find averages over time. Row-wise storage is usually best for OLTP, but column-wise storage is better suited for OLAP. If data volumes are large, -data engineers will sometimes run two databases in parallel, -using batch processing jobs +data engineers will sometimes run two databases in parallel, +using batch processing jobs to copy new or updated records from the OLTP database over to the OLAP database.

        To compare the speed of these classes, @@ -807,7 +807,7 @@

        Transactions vs. Analysis

        To keep things simple we will create dataframes whose columns are called label_1, label_2, and so on, and whose values are all integers in the range 0–9. -A thorough set of benchmarks +A thorough set of benchmarks would create columns of other kinds as well, but this is enough to illustrate the technique.

        @@ -890,7 +890,7 @@

        Transactions vs. Analysis

        This function is called sweep because executing code multiple times with different parameters to measure performance -is called parameter sweeping.

        +is called parameter sweeping.
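A parameter sweep can be sketched in a few lines. This is an illustration under stated assumptions, not the chapter's `sweep` function: `make_table` and `time_select` are hypothetical helpers, the table is a plain dictionary of lists, and `time.perf_counter` is used for timing.

```py
import time


def make_table(size):
    # Hypothetical stand-in for a dataframe: columns label_1, label_2, ...
    # holding integers in the range 0-9.
    return {
        f"label_{c + 1}": [(r + c) % 10 for r in range(size)]
        for c in range(size)
    }


def time_select(table):
    # Time one operation: picking the first two columns.
    start = time.perf_counter()
    _ = {name: table[name] for name in list(table)[:2]}
    return time.perf_counter() - start


def sweep(sizes):
    # Run the same measurement once for each parameter value.
    results = []
    for size in sizes:
        table = make_table(size)
        results.append((size, time_select(table)))
    return results


for size, elapsed in sweep([10, 50, 100]):
    print(f"{size:>4} {elapsed:.6f}")
```

Measured times vary from run to run, so a real benchmark would repeat each measurement and report a summary.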

        The results are shown in Table 15.1 and Figure 15.7. For a 1000 by 1000 dataframe selection is over 250 times faster with column-wise storage than with row-wise, @@ -947,7 +947,7 @@

        Transactions vs. Analysis

        Figure 15.7: Relative performance of row-wise and column-wise storage
        -

        We can get much more insight by profiling our code +

        We can get much more insight by profiling our code using Python’s cProfile module. This tool runs a program for us, collects detailed information on how long functions ran, @@ -1160,9 +1160,9 @@

        Join Performance

        when joining two tables based on matching numeric keys. Does the answer depend on the fraction of keys that match?

        Join Optimization

        -

        The simplest way to join two tables is +

        The simplest way to join two tables is to look for matching keys using a double loop. -An alternative is to build an index for each table +An alternative is to build an index for each table and then use it to construct matches. For example, suppose the tables are:

    diff --git a/docs/perf/slides/index.html b/docs/perf/slides/index.html index f59750ac7..5a77486cc 100644 --- a/docs/perf/slides/index.html +++ b/docs/perf/slides/index.html @@ -4,7 +4,7 @@ - + @@ -211,7 +211,7 @@

    Performance Profiling

    ## How to Call? -- Use `**` to spread the row +- Use `**` to spread the row ```py def example(left, right): @@ -442,7 +442,7 @@

    Performance Profiling

    ## How Can We Improve Performance? -- A profiler measures where a program spends its time +- A profiler measures where a program spends its time - More accurate than our `time.time()` calls diff --git a/docs/persist/index.html b/docs/persist/index.html index e1bac77ac..51c8f7ca5 100644 --- a/docs/persist/index.html +++ b/docs/persist/index.html @@ -4,7 +4,7 @@ - + @@ -370,11 +370,11 @@

    Chapter 16: Object Persistence

    rather than flattening it into rows and columns. Python’s pickle module does this in a Python-specific way, while the json module saves some kinds of objects as text -formatted as JSON, +formatted as JSON, which programs written in other languages can read.

    The phrase “some kinds of objects” is the most important part of the preceding paragraph. Since programs can define new classes, -a persistence framework +a persistence framework has to choose one of the following:

    1. @@ -415,7 +415,7 @@

      Chapter 16: Object Persistence

      we will look at non-text options in Chapter 17.

      Section 16.1: Built-in Types

      The first thing we need to do is specify our data format. -We will store each atomic value on a line of its own +We will store each atomic value on a line of its own with the type’s name first and the value second like this:

      bool:True
      @@ -491,7 +491,7 @@ 

      Section 16.1: Built-in Types

      Saving a list is almost as easy: we save the number of items in the list, -and then save each item with a recursive call to save. +and then save each item with a recursive call to save. For example, the list [55, True, 2.71] is saved as shown in Figure 16.1.
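The format can be sketched as follows. The `type:value` line layout matches the `bool:True` example in the chapter, but writing a list's header as `list:` followed by its length is an assumption, as is the `save(writer, thing)` signature.

```py
import io


def save(writer, thing):
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(thing, bool):
        print(f"bool:{thing}", file=writer)
    elif isinstance(thing, int):
        print(f"int:{thing}", file=writer)
    elif isinstance(thing, float):
        print(f"float:{thing}", file=writer)
    elif isinstance(thing, list):
        # Assumed header: the type name followed by the number of items.
        print(f"list:{len(thing)}", file=writer)
        for item in thing:
            save(writer, item)  # recursive call for each item
    else:
        raise ValueError(f"unknown type of thing {type(thing)}")


buffer = io.StringIO()
save(buffer, [55, True, 2.71])
print(buffer.getvalue())
```

Putting the `bool` test first matters: `isinstance(True, int)` is true, so the reverse order would save `True` as an integer.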

      @@ -596,7 +596,7 @@

      Section 16.2: Converting to Classes

      but as we were extending them we had to modify their internals every time we wanted to do something new.

      -

      The Open-Closed Principle states that +

      The Open-Closed Principle states that software should be open for extension but closed for modification, i.e., that it should be possible to extend functionality without having to rewrite existing code. @@ -689,7 +689,7 @@

      Section 16.3: Aliasing

      Figure 16.2: Saving aliased data without respecting aliases.
      -

      The problem is that the list shared is aliased, +

      The problem is that the list shared is aliased, i.e., there are two or more references to it. To reconstruct the original data correctly we need to:

      @@ -899,7 +899,7 @@

      Section 16.3: Aliasing

      and then loads list items recursively. We have to pass it the ID of the list to use as the key in seen, -and we have to use a loop rather than a list comprehension, +and we have to use a loop rather than a list comprehension, but the changes to _set and _dict follow exactly the same pattern.

      word = "word"
      @@ -969,18 +969,18 @@ 

      Section 16.4: User-Defined Classes

      Require it to inherit from a base class that we provide so that we can use isinstance to check if an object is persistable. This approach is used in strictly-typed languages like Java, - but method #2 below is considered more Pythonic.

      + but method #2 below is considered more Pythonic.

    2. Require it to implement a method with a specific name and signature without deriving from a particular base class. - This approach is called duck typing: + This approach is called duck typing: if it walks like a duck and quacks like a duck, it’s a duck. Since option #1 would require users to write this method anyway, it’s the one we’ll choose.

    3. -

      Require users to register a helper class +

      Require users to register a helper class that knows how to save and load objects of the class we’re interested in. This approach is also commonly used in strictly-typed languages as a way of adding persistence after the fact diff --git a/docs/persist/slides/index.html b/docs/persist/slides/index.html index 8fb9cc8bf..2935fce3f 100644 --- a/docs/persist/slides/index.html +++ b/docs/persist/slides/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/slides/index.html b/docs/slides/index.html index a9d691d61..b1c24d912 100644 --- a/docs/slides/index.html +++ b/docs/slides/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/syllabus/index.html b/docs/syllabus/index.html index 6cffaaea3..802293e3a 100644 --- a/docs/syllabus/index.html +++ b/docs/syllabus/index.html @@ -4,7 +4,7 @@ - + diff --git a/docs/template/index.html b/docs/template/index.html index 474e50644..006eec262 100644 --- a/docs/template/index.html +++ b/docs/template/index.html @@ -4,7 +4,7 @@ - + @@ -365,7 +365,7 @@

      Chapter 12: A Template Expander

      Writing and updating pages by hand is time-consuming and error-prone, particularly when many parts are the same, so most documentation sites use some kind of -static site generator +static site generator to create web pages from templates.

      At the heart of every static site generator is a page templating system. Thousands of these have been written in the last thirty years @@ -460,7 +460,7 @@

      Human-Readable vs. Machine-Readable

      but by the time we’re finished processing our templates, there shouldn’t be any z-* attributes left to confuse a browser.

      -

      The next step is to define the Application Programming Interface (API) +

      The next step is to define the Application Programming Interface (API) for filling in templates, which is just a fancy way of saying that we need to specify what function or functions a program calls @@ -546,9 +546,9 @@

      Section 12.3: Visiting Nodes

      raise NotImplementedError("close")
      -

      Visitor defines two abstract methods open and close +

      Visitor defines two abstract methods open and close that are called when we first arrive at a node and when we are finished with it. -We cannot use Visitor itself—it is an abstract class. +We cannot use Visitor itself—it is an abstract class. Instead, we must derive a class from Visitor that defines these two methods. This approach is different from that of the visitor in Chapter 11, @@ -847,9 +847,9 @@

      Section 12.4: Implementing Handlers

      Section 12.5: Control Flow

      Our tool supports conditional expressions and loops. -Since it doesn’t handle Boolean expressions like and and or, +Since it doesn’t handle Boolean expressions like and and or, implementing a conditional is as simple as looking up a variable -and then expanding the node if Python thinks the value is truthy:

      +and then expanding the node if Python thinks the value is truthy:

      """Conditionalize part of a template."""
       
      diff --git a/docs/template/slides/index.html b/docs/template/slides/index.html
      index 6b01313fa..92e93cbcc 100644
      --- a/docs/template/slides/index.html
      +++ b/docs/template/slides/index.html
      @@ -4,7 +4,7 @@
         
         
         
      -  
      +  
         
         
         
      @@ -119,7 +119,7 @@ 

      A Template Expander

      ## How Do We Call This? -- Design the API of our library first +- Design the API of our library first ```py data = {"names": ["Johnson", "Vaughan", "Jackson"]} @@ -162,7 +162,7 @@

      A Template Expander

      ## Visiting Nodes -- Use the Visitor design pattern +- Use the Visitor design pattern ```py class Visitor: diff --git a/docs/test/index.html b/docs/test/index.html index b8e1e296d..4a661b0eb 100644 --- a/docs/test/index.html +++ b/docs/test/index.html @@ -4,7 +4,7 @@ - + @@ -368,7 +368,7 @@

      Chapter 6: Running Tests

      is there to make sure the other 2% always does the right thing.

      We’re going to write a lot of programs in this book. To make sure they work correctly, -we’re also going to write a lot of unit tests +we’re also going to write a lot of unit tests [Meszaros2007, Aniche2022]. To make those tests easier to write (so that we actually write them) we use a unit testing framework that finds and runs tests automatically. @@ -412,7 +412,7 @@

      Section 6.1: Storing and Running Tests

      the connection between the function and the original name.

      Checking Types

      -

      Python is a dynamically typed language, +

      Python is a dynamically typed language, which means that it checks the types of values as the program is running. We can do this ourselves using its built-in type function, which will tell us that 3 is an integer:

      @@ -556,23 +556,23 @@

      Checking Types

      ]
      -

      Each test does something to a fixture +

      Each test does something to a fixture (such as the number -3) -and uses assertions -to compare the actual result -against the expected result. +and uses assertions +to compare the actual result +against the expected result. The outcome of each test is exactly one of:

      • -

        Pass: +

        Pass: the test subject works as expected.

      • -

        Fail: +

        Fail: something is wrong with the test subject.

      • -

        Error: +

        Error: something is wrong in the test itself, which means we don’t know if the thing we’re testing is working properly or not.

        @@ -581,7 +581,7 @@

        Checking Types

        To implement this classification scheme we need to distinguish failing tests from broken ones. Our rule is that -if a test throws an AssertionError +if a test throws an AssertionError then one of our checks is reporting a failure, while any other kind of exception indicates that the test contains an error.

        Translating that rule into code gives us the function run_tests @@ -685,7 +685,7 @@

        Section 6.2: Finding Functions

        Local Variables

        Another function called locals returns -all the variables defined in the current (local) scope.

        +all the variables defined in the current (local) scope.

        If function names are just variables and a program’s variables are stored in a dictionary, @@ -756,7 +756,7 @@

        Section 6.3: Origins

        and tried to make sense of the key ideas.

        The problem is that “making sense” depends on who we are. When we use a low-level language, -we incur the cognitive load +we incur the cognitive load of assembling micro-steps into something more meaningful. When we use a high-level language, on the other hand, diff --git a/docs/test/slides/index.html b/docs/test/slides/index.html index 054044689..56f8ff377 100644 --- a/docs/test/slides/index.html +++ b/docs/test/slides/index.html @@ -4,7 +4,7 @@ - + @@ -188,7 +188,7 @@

        Running Tests

        ## Signatures - We have to know how to call the functions - - They must have the same signature + - They must have the same signature ```py def zero(): @@ -215,13 +215,13 @@

        Running Tests

        ## Testing Terminology -- Apply the function we want to test to a fixture -- Compare the actual result - to the expected result +- Apply the function we want to test to a fixture +- Compare the actual result + to the expected result - Possible outcomes are: - - pass: the target function worked - - fail: the target function didn't do what we expected - - error: something went wrong with the test itself + - pass: the target function worked + - fail: the target function didn't do what we expected + - error: something went wrong with the test itself - Typically use `assert` to check results - If condition is `True`, does nothing - Otherwise, raises an `AssertionError` diff --git a/docs/undo/index.html b/docs/undo/index.html index 9e45da2d1..758a39c22 100644 --- a/docs/undo/index.html +++ b/docs/undo/index.html @@ -4,7 +4,7 @@ - + @@ -414,7 +414,7 @@

        Section 24.1: Getting Started

        GUI applications that don’t display anything -are often called headless applications. +are often called headless applications. Giving our simulated keystrokes to the screen seems odd—it would make more sense for App to have a method that gets keystrokes—but it’s the simplest way to fit everything in beside @@ -625,13 +625,13 @@

        Section 24.3: Going Backward

        And how are we going to interpret these log records? Will we need a second dispatch method with its own handlers?

        The common solution to these problems is to use -the Command design pattern. +the Command design pattern. This pattern turns verbs into nouns, i.e., each action is represented as an object with methods to go forward and backward.

        Actions are all derived from -an abstract base class +an abstract base class so that they can be used interchangeably. Our base class is:

        diff --git a/docs/undo/slides/index.html b/docs/undo/slides/index.html index a1f57a067..0c812c4cf 100644 --- a/docs/undo/slides/index.html +++ b/docs/undo/slides/index.html @@ -4,7 +4,7 @@ - + @@ -77,7 +77,7 @@

        Undo and Redo

        ## A Headless Screen -- Create a headless screen for testing +- Create a headless screen for testing - Store current state of display in rectangular grid @@ -336,7 +336,7 @@

        Undo and Redo

        ## Verbs as Nouns -- Use the Command design pattern +- Use the Command design pattern - Each action (verb) is an object (noun) with methods to go forward and backward diff --git a/docs/viewer/index.html b/docs/viewer/index.html index 628ca7f26..064fce61e 100644 --- a/docs/viewer/index.html +++ b/docs/viewer/index.html @@ -4,7 +4,7 @@ - + @@ -406,7 +406,7 @@

        Section 23.1: Curses

        print statements won’t be of use. Running this program inside a single-stepping debugger is challenging for the same reason, -so for the moment we will cheat and create a log file +so for the moment we will cheat and create a log file for the program to write to:

        LOG = None
        @@ -525,7 +525,7 @@ 

        Section 23.1: Curses

        e0123
        -

        These lines are a very (very) simple example of synthetic data, +

        These lines are a very (very) simple example of synthetic data, i.e., data that is made up for testing purposes.

        Section 23.2: Windowing

        @@ -566,7 +566,7 @@

        Section 23.2: Windowing

        We can’t create it earlier and pass it in as we do with lines because the constructor for Window needs the screen object, which doesn’t exist until curses.wrapper calls main. -This is an example of delayed construction, +This is an example of delayed construction, and is going to constrain the rest of our design.

        Nothing says we have to make our window exactly the same size as the terminal that is displaying it. @@ -623,7 +623,7 @@

        Section 23.2: Windowing

        self._screen.addstr(y, 0, line[:self._size[COL]])
        -

        We should really create an enumeration, +

        We should really create an enumeration, but a pair of constants is good enough for now.

        Section 23.3: Moving

        Our program no longer crashes when given large input to display, @@ -840,7 +840,7 @@

        Inheritance

        DispatchApp is derived from our first MainApp so that we can recycle the initialization code we wrote for the latter. To make this happen, -DispatchApp.__init__ upcalls to MainApp.__init__ +DispatchApp.__init__ upcalls to MainApp.__init__ using super().__init__. We probably wouldn’t create multiple classes in a real program, but doing this simplifies exposition when teaching. @@ -870,7 +870,7 @@

        Inheritance

        return self._lines -

        This text buffer class doesn’t do much yet, +

        This text buffer class doesn’t do much yet, but will later keep track of the viewable region. Again, we make a copy of lines rather than using the list the caller gives us @@ -906,7 +906,7 @@

        Factory Methods

        we will have to rewrite the entire method each time we change which classes we want to use for any of those three things. -Putting constructor calls in factory methods +Putting constructor calls in factory methods makes the code longer, but allows us to override them one by one. We didn’t do this when we were first writing these examples; @@ -1016,7 +1016,7 @@

        Section 23.6: Viewport

        no matter how small the window is. (We will leave horizontal scrolling as an exercise.) A full-featured editor would introduce another class, -often called a viewport, +often called a viewport, to track the currently-visible portion of the buffer. To keep things simple, we will add two member variables to the buffer diff --git a/docs/viewer/slides/index.html b/docs/viewer/slides/index.html index 1dc32999a..cda225b60 100644 --- a/docs/viewer/slides/index.html +++ b/docs/viewer/slides/index.html @@ -4,7 +4,7 @@ - + @@ -223,7 +223,7 @@

        A File Viewer

        - Can't create `Window` before calling `curses.wrapper` because it needs the screen -- Delayed construction +- Delayed construction --- @@ -269,7 +269,7 @@

        A File Viewer

        - Use `ROW` and `COL` instead of 0 and 1 (or `R` and `C`) -- Should really create an enumeration +- Should really create an enumeration ```py ROW = 0 @@ -331,7 +331,7 @@

        A File Viewer

        ``` -- spread position into `stdscr.move` +- Spread position into `stdscr.move` - Screen's `getkey` method returns names of cursor keys @@ -344,7 +344,7 @@

        A File Viewer

        - And blow up when moving off left of screen or off top with `_curses.error: wmove() returned ERR` -- Do some refactoring before fixing this problem +- Do some refactoring before fixing this problem --- @@ -401,7 +401,7 @@

        A File Viewer

        ## Dispatching Keystrokes -- Find and call key handlers via dynamic dispatch +- Find and call key handlers via dynamic dispatch ```py TRANSLATE = { @@ -456,7 +456,7 @@

        A File Viewer

        - `DispatchApp` is a child of `MainApp` so that we can recycle initialization -- `DispatchApp.__init__` upcalls to `MainApp.__init__` +- `DispatchApp.__init__` upcalls to `MainApp.__init__` - Probably wouldn't create multiple classes in a real program, but simplifies exposition when teaching @@ -521,7 +521,7 @@

        A File Viewer

        - If `setup` calls constructors to create window, buffer, and cursor, we have to rewrite it each time -- Putting constructor calls in factory methods +- Putting constructor calls in factory methods allows us to override them one by one - Could pass classes into `BufferApp` constructor… @@ -617,7 +617,7 @@

        A File Viewer

        - What if the buffer is bigger than the window? -- Need a viewport to track +- Need a viewport to track the currently-visible portion of the buffer - Do vertical here and leave horizontal for exercises diff --git a/docs/vm/index.html b/docs/vm/index.html index 0d692faed..46d145796 100644 --- a/docs/vm/index.html +++ b/docs/vm/index.html @@ -4,7 +4,7 @@ - + @@ -363,25 +363,25 @@

        Chapter 25: A Virtual Machine

        If you want to dive deeper, have a look at the game Human Resource Machine.

        Section 25.1: Architecture

        -

        Our virtual machine +

        Our virtual machine simulates a computer with three parts (Figure 25.1):

        1. -

          The instruction pointer (IP) +

          The instruction pointer (IP) holds the memory address of the next instruction to execute. It is automatically initialized to point at address 0, so that is where every program must start. (This requirement is part of our VM’s - Application Binary Interface, or ABI.)

          + Application Binary Interface, or ABI.)

        2. -

          Four registers named R0 to R3 +

          Four registers named R0 to R3 that instructions can access directly. There are no memory-to-memory operations in our VM: everything happens in or through registers.

        3. -

          256 words of memory, each of which can store a single value. +

          256 words of memory, each of which can store a single value. Both the program and its data live in this single block of memory; we chose the size 256 so that the address of each word will fit in a single byte.

        4. @@ -391,16 +391,16 @@

          Section 25.1: Architecture

          Figure 25.1: Architecture of the virtual machine.
          -

          Our processor’s instruction set +

          Our processor’s instruction set defines what it can do. Instructions are just numbers, but we will write them in a simple text format called -assembly code +assembly code that gives those numbers human-readable names.

          The instructions for our VM are 3 bytes long. -The op code fits in one byte, +The op code fits in one byte, and each instruction may include zero, one, or two single-byte operands. -(Instructions are often called bytecode, +(Instructions are often called bytecode, since they’re packed into bytes, but so is everything else in a computer…)
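Packing an op code and two operands into one 3-byte number can be sketched with bitwise operations. The byte order used here (op code in the lowest byte, then the first and second operands) is an assumption for illustration, as are the function names `pack` and `unpack`.

```py
def pack(op, arg0=0, arg1=0):
    # Each field must fit in a single byte.
    for value in (op, arg0, arg1):
        assert 0 <= value < 256, f"{value} does not fit in one byte"
    # Assumed layout: arg1 | arg0 | op, with op in the lowest byte.
    return (arg1 << 16) | (arg0 << 8) | op


def unpack(instruction):
    # Mask and shift to recover each byte.
    op = instruction & 0xFF
    arg0 = (instruction >> 8) & 0xFF
    arg1 = (instruction >> 16) & 0xFF
    return op, arg0, arg1


instruction = pack(2, 1, 3)
print(f"{instruction:06x}")
print(unpack(instruction))
```

Whatever layout is chosen, the assembler and the virtual machine have to agree on it.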

          Each operand is a register identifier, @@ -634,7 +634,7 @@

          Section 25.2: Execution

          as is jumping to a fixed address if the value in a register is zero. -This conditional jump instruction is how we implement if:

          +This conditional jump instruction is how we implement if:

          elif op == OPS["beq"]["code"]:
               if self.reg[arg0] == 0:
          @@ -644,7 +644,7 @@ 

          Section 25.2: Execution

          Section 25.3: Assembly Code

          We could write out numerical op codes by hand just as early programmers did. However, -it is much easier to use an assembler, +it is much easier to use an assembler, which is just a small compiler for a language that very closely represents actual machine instructions.

          Each command in our assembly language matches an instruction in the VM. @@ -662,7 +662,7 @@

          Section 25.3: Assembly Code

          One thing the assembly language has that the instruction set doesn’t -is labels on addresses. +is labels on addresses. A label like loop doesn’t take up any space; instead, it tells the assembler to give the address of the next instruction a name @@ -1017,7 +1017,7 @@

          Call and Return

        Disassembling Instructions

        -

        A disassembler turns machine instructions into assembly code. +

        A disassembler turns machine instructions into assembly code. Write a disassembler for the instruction set used by our virtual machine. (Since the labels for addresses are not stored in machine instructions, disassemblers typically generate labels like @L001 and @L002.)

        diff --git a/docs/vm/slides/index.html b/docs/vm/slides/index.html index b563176e1..7093b5b0c 100644 --- a/docs/vm/slides/index.html +++ b/docs/vm/slides/index.html @@ -4,7 +4,7 @@ - + @@ -58,7 +58,7 @@

        A Virtual Machine

        ## Architecture -- Our virtual machine (VM) +- Our virtual machine (VM)
        Virtual machine architecture @@ -69,17 +69,17 @@

        A Virtual Machine

        ## Architecture -1. Instruction pointer (IP) +1. Instruction pointer (IP) holds the address of the next instruction - Automatically initialized to 0, so every program must start there -1. Instructions can access registers R0 to R3 directly +1. Instructions can access registers R0 to R3 directly - No memory-to-memory operations -1. 256 words of memory +1. 256 words of memory - Addresses fit in a single byte @@ -87,11 +87,11 @@

        A Virtual Machine

        ## What It Can Do -- Instruction set defines what the processor can do +- Instruction set defines what the processor can do - Each instruction is 3 bytes -- op code fits in one byte +- op code fits in one byte - May have zero, one, or two single-byte operands @@ -106,7 +106,7 @@

        A Virtual Machine

        - Instructions are numbers, -- Write them in assembly code for readability +- Write them in assembly code for readability
    @@ -193,7 +193,7 @@

    A Virtual Machine

    ``` -- Use bitwise operations to get bytes out of integer +- Use bitwise operations to get bytes out of integer
    Unpacking instructions @@ -246,7 +246,7 @@

    A Virtual Machine

    ``` -- Conditional jump +- Conditional jump ```py elif op == OPS["beq"]["code"]: @@ -276,7 +276,7 @@

    A Virtual Machine

    ``` -- Write an assembler to turn the latter into the former +- Write an assembler to turn the latter into the former --- @@ -284,7 +284,7 @@

    A Virtual Machine

    - Instruction set doesn't have names for addresses -- But we want labels for readability +- But we want labels for readability ```as # Count up to 3. diff --git a/info/glossary.yml b/info/glossary.yml index 28b088bb2..9785074d5 100644 --- a/info/glossary.yml +++ b/info/glossary.yml @@ -2110,6 +2110,12 @@ # S +- key: sandbox + en: + term: sandbox + def: > + A space where code can execute safely. + - key: schema en: term: schema diff --git a/lib/mccole/extensions/glossary.py b/lib/mccole/extensions/glossary.py index 9eb15b8b6..aa6866daa 100644 --- a/lib/mccole/extensions/glossary.py +++ b/lib/mccole/extensions/glossary.py @@ -1,11 +1,18 @@ """Handle glossary references and glossary.""" +import re import ark import regex import shortcodes import util +UNMARKDOWN = [ + (regex.MULTISPACE, " "), + (re.compile(r"\[(.+?)\]\(.+?\)"), lambda match: match.group(1)), +] + + @ark.events.register(ark.events.Event.INIT) def collect(): """Collect information from pages.""" @@ -34,14 +41,14 @@ def _parse(pargs, kwargs, data): def _cleanup(collected): - """Translate glossary definitions into required form.""" - glossary, lang = util.read_glossary() - glossary = {item["key"]: item[lang]["term"] for item in glossary} - result = util.make_config("definitions") + """Save glossary terms to show definitions per chapter.""" + _, _ = util.read_glossary() # to ensure load + glossary = util.get_config("glossary_by_key") + used = util.make_config("glossary_in_chapter") for slug, seen in collected.items(): - terms = [(key, glossary[key]) for key in glossary if key in seen] + terms = [(key, glossary[key]["term"]) for key in glossary if key in seen] terms.sort(key=lambda item: item[1].lower()) - result.append((slug, terms)) + used.append((slug, terms)) # ---------------------------------------------------------------------- @@ -53,12 +60,10 @@ def glossary_ref(pargs, kwargs, node): util.require( (len(pargs) == 2) and (not kwargs), f"Bad 'g' shortcode {pargs} and {kwargs}" ) - slug = pargs[0] 
-    text = pargs[1]
-
-    used = util.make_config("glossary")
-    used.add(slug)
-    return _format_ref(slug, text)
+    key, text = pargs
+    used = util.make_config("glossary_keys_used")
+    used.add(key)
+    return _format_ref(key, text)


 @shortcodes.register("glossary")
@@ -73,7 +78,7 @@ def glossary(pargs, kwargs, node):
     except KeyError as exc:
         util.fail(f"Glossary entries missing key, term, or {lang}: {exc}.")

-    markdown = [_as_markdown(glossary, lang, entry) for entry in glossary]
+    markdown = [_as_markdown(entry, lang) for entry in glossary]
     entries = "\n\n".join(markdown)
     return f'
    \n{entries}\n
'
@@ -84,7 +89,7 @@ def check():
     glossary, lang = util.read_glossary()
     defined = {entry["key"] for entry in glossary}

-    if (used := util.get_config("glossary")) is None:
+    if (used := util.get_config("glossary_keys_used")) is None:
         return
     used |= _internal_references(glossary, lang)
     used |= _cross_references(glossary, lang)
@@ -93,7 +98,7 @@ def check():
     util.warn("unused glossary entries", defined - used)


-def _as_markdown(glossary, lang, entry):
+def _as_markdown(entry, lang):
     """Convert a single glossary entry to Markdown."""
     cls = 'class="gl-key"'
     first = f'{entry[lang]["term"]}'
@@ -105,10 +110,11 @@ def _as_markdown(glossary, lang, entry):

     body = regex.MULTISPACE.sub(entry[lang]["def"], " ").rstrip()

-    if "ref" in entry[lang]:
+    if "ref" in entry:
+        glossary = util.get_config("glossary_by_key")
         seealso = util.TRANSLATIONS[lang]["seealso"]
         try:
-            refs = [f"[{glossary[r]}](#{r})" for r in entry[lang]["ref"]]
+            refs = [f"[{glossary[r]['term']}](#{r})" for r in entry["ref"]]
         except KeyError as exc:
             util.fail(f"Unknown glossary cross-ref in {entry['key']}: {exc}")
         body += f"
{seealso}: {', '.join(refs)}."
@@ -125,10 +131,12 @@ def _cross_references(glossary, lang):
     return result


-def _format_ref(slug, text):
+def _format_ref(key, text):
     """Format a glossary reference."""
     cls = 'class="gl-ref"'
-    return f'{text}'
+    href = f'href="@root/glossary/#{key}"'
+    tooltip = f'title="{_make_tooltip(key)}"'
+    return f'{text}'


 def _internal_references(glossary, lang):
@@ -138,3 +146,14 @@ def _internal_references(glossary, lang):
         for match in regex.GLOSSARY_INTERNAL_REF.finditer(entry[lang]["def"]):
             result.add(match.group(1))
     return result
+
+
+def _make_tooltip(key):
+    """Make tooltip for glossary display."""
+    glossary = util.get_config("glossary_by_key")
+    util.require(key in glossary, f"Unknown glossary key {key}")
+    entry = glossary[key]
+    text = entry["def"].strip()
+    for (pat, sub) in UNMARKDOWN:
+        text = pat.sub(sub, text)
+    return text
diff --git a/lib/mccole/extensions/util.py b/lib/mccole/extensions/util.py
index 5feaf7d98..592163ec1 100644
--- a/lib/mccole/extensions/util.py
+++ b/lib/mccole/extensions/util.py
@@ -17,9 +17,10 @@
 # i.e., `"figures"` becomes `ark.site.config["mccole"]["figures"]`.
 CONFIGURATIONS = {
     "bibliography": set(), # citations
-    "definitions": [], # glossary definitions
     "figures": {}, # numbered figures
-    "glossary": set(), # glossary keys
+    "glossary_by_key": {}, # glossary definitions by key for current language
+    "glossary_in_chapter": [], # glossary definitions used per chapter
+    "glossary_keys_used": set(), # glossary keys seen overall
     "headings": {}, # number chapter, section, and appendix headings
     "inclusions": {}, # included files
     "index": {}, # index entries
@@ -51,6 +52,7 @@
 # Cached values.
 CACHE = {
     "glossary": None,
+    "glossary_by_key": None,
     "links": None,
     "links_table": None,
     "major": None,
@@ -212,6 +214,7 @@ def read_glossary():
             assert lang in entry, f"Bad glossary entry {entry}"
             assert "def" in entry[lang], f"Bad glossary entry {entry}"
         CACHE["glossary"] = glossary
+        make_config("glossary_by_key", {entry["key"]:entry[lang] for entry in glossary})
     return CACHE["glossary"], lang
diff --git a/lib/mccole/templates/definitions.html b/lib/mccole/templates/definitions.html
index 0d4350311..cf69909d8 100644
--- a/lib/mccole/templates/definitions.html
+++ b/lib/mccole/templates/definitions.html
@@ -1,4 +1,4 @@
-{% for node_slug, terms in site.mccole.definitions %}{% if node_slug == node.slug and terms %}
+{% for node_slug, terms in site.mccole.glossary_in_chapter %}{% if node_slug == node.slug and terms %}

    Terms defined: {% for gl_slug, term in terms %}{{term}}{% if not loop.is_last %}, {% endif %}{% endfor %}

    diff --git a/src/binary/slides.html b/src/binary/slides.html index 5da560131..eaf17ba80 100644 --- a/src/binary/slides.html +++ b/src/binary/slides.html @@ -28,7 +28,7 @@ - More common to use [%g hexadecimal "hexadecimal" %] (base 16) - Digits are 0123456789ABCDEF -- Each digit represents 4 bits (a [%g nybble "nybble" %]) +- Each digit represents 4 bits (half a byte) [% inc pat="hex_notation.*" fill="py out" %] @@ -36,7 +36,7 @@ ## Negative Numbers -- Could use [%g sign_and_magnitude "sign and magnitude" %] +- Could use [%g sign_magnitude "sign and magnitude" %] - `0100` is 4 - `1100` is -4 - But: diff --git a/src/build/slides.html b/src/build/slides.html index 377e4ce45..5461062cf 100644 --- a/src/build/slides.html +++ b/src/build/slides.html @@ -24,7 +24,7 @@ - Many others now exist (e.g., [Snakemake][snakemake]) -- If a [%g target "target" %] is [%g build_stale "stale" %] +- If a [%g build_target "target" %] is [%g build_stale "stale" %] with respect to any of its [%g dependency "dependencies" %], run a [%g build_recipe "recipe" %] to refresh it diff --git a/src/db/slides.html b/src/db/slides.html index ea19603b4..5d550abed 100644 --- a/src/db/slides.html +++ b/src/db/slides.html @@ -10,7 +10,7 @@ - And interoperability across languages -- Create a simple [%g log_structured_database "log-structured database" %] +- Create a simple [%g log_structured_db "log-structured database" %] --- @@ -140,7 +140,7 @@ ## Saving Blocks -- Save *N* records per [%g block "block" %] +- Save *N* records per [%g block_memory "block" %] - Keep the [%g index_database "index" %] in memory diff --git a/src/dup/slides.html b/src/dup/slides.html index ed29c5147..e9183289a 100644 --- a/src/dup/slides.html +++ b/src/dup/slides.html @@ -203,7 +203,7 @@ - There's a 50% chance of two people sharing a birthday in a group of 23 people and a 99.9% chance with 70 people -- How many files do we need to hash before there's a 50% chance of a [%g collision "collision" %] +- How many files do we 
need to hash before there's a 50% chance of a [%g hash_collision "collision" %] with a 256-bit hash? - Answer is "approximately \\( 4{\times}10^{38} \\) files" diff --git a/src/ftp/slides.html b/src/ftp/slides.html index 5e9f5821e..8dd433c1c 100644 --- a/src/ftp/slides.html +++ b/src/ftp/slides.html @@ -100,7 +100,7 @@ - What if client sends more data than that? -- Allocating a larger [%g buffer "buffer" %] just delays the problem +- Allocating a larger [%g buffer_memory "buffer" %] just delays the problem - Better solution: keep reading until there is no more data diff --git a/src/http/slides.html b/src/http/slides.html index 4a7e99918..17416348b 100644 --- a/src/http/slides.html +++ b/src/http/slides.html @@ -96,7 +96,7 @@ - Browser shows page -- Shell shows [%g log_message "log messages" %] +- Shell shows log messages [% inc file="basic_server.out" %] diff --git a/src/interp/slides.html b/src/interp/slides.html index 2097caec7..71e0ed0dd 100644 --- a/src/interp/slides.html +++ b/src/interp/slides.html @@ -48,7 +48,7 @@ ## Dispatching Operations -- Write a function that [%g dispatch "dispatches" %] to actual operations +- Write a function that [%g dynamic_dispatch "dispatches" %] to actual operations [% inc file="expr.py" keep="do" %] @@ -155,7 +155,7 @@ ## How Good Is Our Design? 
-- One way to evaluate a design is to ask how [%g extensible "extensible" %] it is +- One way to evaluate a design is to ask how [%g extensibility "extensible" %] it is - The answer for the interpreter is "pretty easily" - The answer for our little language is "not at all" - We need a way to define and call functions of our own diff --git a/src/intro/syllabus_linear.svg b/src/intro/syllabus_linear.svg index ff852a0ce..cdb11b1ff 100644 --- a/src/intro/syllabus_linear.svg +++ b/src/intro/syllabus_linear.svg @@ -76,7 +76,7 @@ Archiver - + dup->archive @@ -99,7 +99,7 @@ - + glob->archive @@ -226,7 +226,7 @@ A Database - + mock->db @@ -280,7 +280,7 @@ Page Layout - + check->layout @@ -291,7 +291,7 @@ - + template->layout @@ -336,7 +336,7 @@ - + persist->db diff --git a/src/intro/syllabus_regular.svg b/src/intro/syllabus_regular.svg index 8439ec938..1db86b96d 100644 --- a/src/intro/syllabus_regular.svg +++ b/src/intro/syllabus_regular.svg @@ -59,7 +59,7 @@ Archiver - + dup->archive @@ -84,7 +84,7 @@ - + glob->archive @@ -186,7 +186,7 @@ A Database - + mock->db @@ -230,13 +230,13 @@ Page Layout - + check->layout - + template->layout @@ -261,7 +261,7 @@ - + persist->db diff --git a/src/pack/slides.html b/src/pack/slides.html index 751c2c8f4..66cfec136 100644 --- a/src/pack/slides.html +++ b/src/pack/slides.html @@ -222,7 +222,7 @@ ## Using a Theorem Prover -- An [%g theorem_prover "automated theorem prover" %] can do much better +- An automated theorem prover can do much better - But the algorithms quickly become very complex diff --git a/src/viewer/slides.html b/src/viewer/slides.html index f78340d81..b3075d0b2 100644 --- a/src/viewer/slides.html +++ b/src/viewer/slides.html @@ -144,7 +144,7 @@ [% inc file="move_cursor.py" keep="main" %] -- [%g Spread "spread" %] position into `stdscr.move` +- [%g spread "Spread" %] position into `stdscr.move` - Screen's `getkey` method returns names of cursor keys