diff --git a/01projects/index.md b/01projects/index.md
index f8a554645..d0585c13b 100644
--- a/01projects/index.md
+++ b/01projects/index.md
@@ -1,19 +1,14 @@
 ---
-title: "Week 1: C++ projects"
+title: "Introduction to C++"
 author:
 - Krishnakumar Gopalakrishnan
 ---
 
 ## Contents
 
-In Week 1, we cover the following topics:
+This week we will introduce you to the basics of writing, compiling, and running a program using C++. We will go through most things for this class with live demonstrations, but you may find it helpful to read over the notes ahead of class anyway, and you can refer back to them throughout the course.
 
-- [Version control with Git](./sec01Git.html)
-- [Building research software](./sec02SoftwareBuilds.html) (**mandatory pre-reading** material before in-class session)
-- [CMake background](./sec03CMakeBackground.html) (**mandatory pre-reading** material before in-class session)
-- [HelloWorld with CMake](./sec04CMakeHelloWorld.html)
-- [Building 'HelloWorld'](./sec05BuildHelloWorld.html)
-
-## Programming environment
+In this section we also provide notes on how to use **git** if you have not done so before. Git is a version control system that we will use throughout the course and for your coursework assignments, so be sure to read this through.
 
-Please note that those who chose not to use the provided devcontainer in VSCode or GitHub codespaces shall have to install a text editor, a suitable compiler toolchain e.g. gcc/g++ (> 9.4.0), CMake (> 3.21), and git 2.x.
+- [Version control with Git](./sec01Git.html)
+- [Introduction to C++](./sec02IntroToCpp.html)
diff --git a/01projects/sec01Git.md b/01projects/sec01Git.md
index 4ee341dad..f8105216a 100644
--- a/01projects/sec01Git.md
+++ b/01projects/sec01Git.md
@@ -132,7 +132,7 @@ Changes to be committed:
 	new file:   hello.cpp
 ```
 
-Now, all three files are ready to be committed (i.e. made a permanently referencable entity of the project), and we employ the `git commit` command for this.
+Now, all three files are ready to be committed (i.e. made a permanently referenceable entity of the project), and we employ the `git commit` command for this.
 
 ### `git commit`
 
@@ -276,6 +276,6 @@
 
 ## Further resources
 
-We have only covered a very basic overview of the Git version control system that shall enable us to get started with the in-class exercises and course projects. An excellent resource that provides an expanded introduction is the Software Carpentry's [lessons on Git](https://swcarpentry.github.io/git-novice/) which covers some additional topics such as ignoring certain kind of files from being tracked, referencing previous commits in git commands etc. The sofware carpentry lesson material has been taught as a video playlist with live coding/demonstrator by your course instructor and is available [here](https://www.youtube.com/playlist?list=PLn8I4rGvUPf6qxv2KRN_wK7inXHJH6AIJ).
+We have only covered a very basic overview of the Git version control system that shall enable us to get started with the in-class exercises and course projects. An excellent resource that provides an expanded introduction is Software Carpentry's [lessons on Git](https://swcarpentry.github.io/git-novice/), which covers some additional topics such as ignoring certain kinds of files from being tracked, referencing previous commits in git commands, and so on. The Software Carpentry lesson material has been taught as a video playlist with live coding demonstrations by your course instructor and is available [here](https://www.youtube.com/playlist?list=PLn8I4rGvUPf6qxv2KRN_wK7inXHJH6AIJ). In professional software development, one usually encounters further advanced topics such as branching, rebasing, and cherry-picking commits, for which specialised git resources exist both online and in print. All UCL students have free access to content from LinkedIn Learning, and it is worthwhile to look into some of the [top rated Git courses](https://www.linkedin.com/learning/search?keywords=git&upsellOrderOrigin=default_guest_learning&sortBy=RELEVANCE&entityType=COURSE&softwareNames=Git) there.
diff --git a/01projects/sec02IntroToCpp.md b/01projects/sec02IntroToCpp.md
new file mode 100644
index 000000000..2851e556e
--- /dev/null
+++ b/01projects/sec02IntroToCpp.md
@@ -0,0 +1,560 @@
+---
+title: Introduction to C++
+---
+
+# Week 1: Introduction to C++
+
+## What is C++?
+
+C++ is a relatively low-level*, relatively old, general-purpose programming language, commonly used for programs that require high performance. First released in 1985 by Bjarne Stroustrup, it was intended as a successor to C, primarily to add object-oriented features. Now, C++ is multi-paradigm, meaning that it includes features that enable object-oriented, functional, and procedural programming. It inherits many characteristics from C: it is compiled, has a static type system, and allows manual memory management. All these features ultimately enable C++ to perform extremely well, but these same features can be tricky to use effectively and can be a source of bugs. It is this balance between programmer productivity and performance that allows C++ to be used to build large, complex, fast applications and libraries.
+
+Since 1998, C++'s development has been governed by an ISO working group that collates and develops new C++ features into a *standard*. Since 2011, a new standard has been released every 3 years, with compiler support for new features lagging behind by up to 3 years (compiler developers need time to implement each new standard and check that their compilers work properly!).
+
+This course will assume the use of the `C++17` standard, which is presently widely supported by compilers.
+
+> \* The exact meaning of "low level" depends on whom you ask, but in general _lower level_ languages more directly represent the way that the machine works, or give you control over more aspects of the machine, while _higher level_ languages abstract more of that away and focus on declaring the behaviour that you want. Traditionally, a "low level" language refers to _assembly code_, in which the instructions that you write are the instruction set of the machine itself! This code is by its nature specific to a given kind of machine, and is therefore not portable between systems, and doesn't express things in a way that is intuitive to most people. High level languages were introduced to make code easier to understand and more independent of the hardware; the highest level languages, like Haskell, are highly mathematical in their structure and give hardly any indication of how the computer works at all! C++ falls somewhere in the middle, with plenty of high level abstractions and portability, but it still gives us some features associated with low level programming, like direct addressing of memory. This extra degree of control is very valuable when you need to get the best out of systems that require high performance or have limited resources.
+
+## Why are we using C++?
+
+The most common language for students to learn at present is probably Python, and many of you may have taken the Python-based software engineering course last term. So why are we now changing to C++?
+
+1. C++ is the standard language for high performance computing, both in research and industry.
+2. C++ code runs much faster than native Python code. Those fast-running Python libraries are written in C! As scientific programmers, we sometimes have to implement our own novel methods, which need to run efficiently.
+3. C++ is a great tool for starting to understand memory management better.
+    - Most code that we write will not need us to allocate and free resources manually, but C++ gives us a clear understanding of when resources are allocated and freed, and this is important for writing effective and safe programs.
+    - Many structures in C++ have easy to understand and well defined layouts in memory. The way that data is laid out in memory can have a major impact on performance, as we shall see later, and interacting with some high performance libraries requires directly referencing contiguous blocks of memory.
+4. C++ has strong support for object-oriented programming, including a number of features not present in Python. These features allow us to create programs that are safer and more correct by letting us define objects that are guaranteed to maintain particular properties (called _invariants_). For example, defining a kind of list that is always sorted, and can't be changed into an un-sorted state, means that we can use faster algorithms that rely on sorted data _without having to check that the data is sorted_.
+5. C++ is multi-paradigm and gives us a lot of freedom in how we write our programs, making it a great language to explore different styles and programming patterns.
+
+### Why is C++ fast?
+
+Because a C++ program is compiled before the program runs, it can be much faster than interpreted languages. Not only is the program compiled to native machine code (the lowest-level representation of a program possible with today's CPUs), but compilers are also capable of performing clever optimisations to vastly improve runtimes. With C++, C, Rust, Fortran, and other natively compiled languages, there is no virtual machine (like in Java) or interpreter (like in Python) that could introduce overheads that affect performance.
+
+Many languages use a process called _garbage collection_ to free memory resources, which adds run-time overheads and is less predictable than C++'s memory management system. In C++ we know when resources will be allocated and freed, and we can run with less computational overhead, at the cost of having to be careful to free any resources that we manually allocate. (Manually allocating memory is relatively rare in modern C++ practice! It is more common in legacy code or C code, with which you will sometimes need to interact.)
+
+Static type checking also helps to improve performance, because it means that the types of variables do not need to be checked during run-time, and that extra type data doesn't need to be stored.
+
+### Should I just use C++ for everything?
+
+Probably not!
+
+Choosing a programming language is a mixture of picking the right tool for the job, and the right tool for you (or your team). Consider some of the pros and cons for a given project you are working on.
+
+C++ Pros:
+- Can produce code which is both fast and memory efficient.
+- Very flexible multi-paradigm language supports a lot of different approaches.
+- Gives you direct control of memory if you need it.
+- Large ecosystem of libraries.
+- Can write code which runs on exciting and powerful hardware like supercomputing clusters, GPUs, FPGAs, and more!
+- Can program for "bare metal", i.e. architectures with no operating system, making it appropriate for extremely high performance or restrictive environments such as embedded systems.
+- Static typing makes programs safer and easier to reason about.
+
+C++ Cons:
+- Code can be more verbose than a language like Python.
+- C++ is a very large language, so there can be a lot to learn.
+- More control also means more responsibility: it's very possible to cause memory leaks or undefined behaviour if you misuse C++.
+- Compilation and program structure mean there's a bit of overhead to starting a C++ project, and you can't run it interactively. This makes it harder to jump into experimenting and plotting things the way you can in the Python terminal.
+
+For larger scale scientific projects where performance and correctness are critical, C++ can be a great choice: it is excellent for numerical simulations, low-level libraries, machine learning backends, system utilities, browsers, high-performance games, operating systems, embedded system software, renderers, audio workstations, and the like. It is a poor choice for simple scripts, small data analyses, or frontend development. If you want to do some scripting, or a bit of basic data processing and plotting, it's probably not the best way to go (this is where Python shines). For interactive applications with GUIs, other languages, like C# or Java, are often more desirable (although C++ has some options for this too).
+
+I'd also like to emphasise that while we _use_ C++, the goal of this course is not to simply teach you how to write C++. **This is a course on software engineering in science, and the lessons should be transferable to other languages.** Languages will differ in the features and control that they offer you, but understanding how to write well structured, efficient, and safe programs should inform all the programming that you do.
+
+
+# Writing in C++: Hello World!
+
+Here's a little snippet of C++:
+
+```cpp
+#include <iostream>
+
+using namespace std;
+
+int main()
+{
+    cout << "Hello World!\n";
+
+    return 0;
+}
+```
+
+We'll go through exactly what each line and symbol mean in a moment, but let's first run this program. Since C++ is a *compiled* language, we must use a compiler to turn this code into an executable that the computer can run. In this course we'll nearly always be using the GNU C++ compiler, `g++`. We can compile the above code, pasted into a file called `main.cpp`, into an executable called `hello_world` by running in the terminal:
+
+```bash
+g++ main.cpp -o hello_world
+```
+
+> `.cpp` is the standard suffix for C++ files. You will also encounter `.h` and `.hpp` suffixes, which describe *header files* (we'll come to those later), and sometimes `.c`, describing a plain C file.
+
+Now we can actually *run* the program with `./hello_world`. You should see output similar to:
+
+```
+[my_username@my_hostname my_current_folder]$ ./hello_world
+Hello World!
+```
+
+> The `./` syntax before the executable name is required because when a command (or executable name) is entered in Linux, it does not automatically search the current directory for the entered command. We need to tell the Linux shell that the command is inside the current directory, which is labelled `.`, hence the full command `./hello_world` is telling the Linux shell to "run the executable `hello_world` in the current directory".
+
+Congratulations, you have just run your first C++ program!
+
+## Deconstructing Hello World
+
+At the highest level, this C++ code has three parts: an include statement, a using statement, and a single function called `main`. Let's go through each piece individually.
+
+**The include statement**
+
+The first line we see in Hello World is an *include statement*:
+
+```cpp
+#include <iostream>
+```
+
+We'll go into what *exactly* this line is doing later, but all you need to know at this stage is that include statements are how we import external code into the current file so we can use the included functions, classes, etc. in our code. This is similar to `import` in Python, Java and Matlab. In the specific line above we're including the code from `iostream`, the piece of the C++ standard library that provides input-output functionality. We require this to use `cout` later. The *hash* or *pound* symbol `#` and the angle brackets `<...>` will be explained later when we get into the *preprocessor*.
+
+**The using statement**
+
+The next line is a *using statement*:
+
+```cpp
+using namespace std;
+```
+
+Again, we'll explore what this whole line actually does later, but the short version is that it allows us to access functions and classes inside the standard library without typing `std::` everywhere. I recommend you use it but **only in .cpp files**.
+
+There's also an interesting bit of punctuation here; you have probably noticed the line ends in a semicolon `;`. C++, and many other languages, require this because the language doesn't care about (most) *whitespace*. Technically we can write almost the whole program on one line, ignoring all indentation and newlines (only the `#include` must stay on its own line, because preprocessor directives end at the end of a line):
+
+```cpp
+#include <iostream>
+using namespace std; int main() {cout << "Hello World!\n"; return 0;}
+```
+
+But this looks horrible so we use newlines and indentation in appropriate places to make our code more *readable*.
+
+This is very different to languages where whitespace has meaning, like Python, where a *newline* usually denotes the end of a _statement_, and indentation denotes a new code block or scope. Because of this, C++ requires all *statements* to end with a semicolon so that the compiler knows exactly where a line is meant to end. In practice this means nearly every line will end with a semicolon, except lines starting with `#` (we'll come back to that) and lines that open or close *scopes* with the curly braces `{}`.
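+For comparison, here is a version of Hello World without the using statement; this sketch is just for illustration, and shows that we must then use the qualified name `std::cout`:
+
+```cpp
+#include <iostream>
+
+int main()
+{
+    // Without `using namespace std;` we qualify standard library names with std::
+    std::cout << "Hello World!\n";
+
+    return 0;
+}
+```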
+
+**Functions**
+
+Here's our `main` function again:
+
+```cpp
+int main()
+{
+    ...
+    return 0;
+}
+```
+
+This is made up of three main parts:
+
+1. The *function signature*, `int main()`
+2. The *function body*, the code inside the curly braces `{}`, and
+3. The *return statement* inside the function body.
+
+We'll go through each of these pieces.
+
+**The function signature**
+
+The line `int main() {` begins a *function definition* and includes the *function signature* (the piece without the curly brace). Here we're describing a function called `main` that returns a single value of type `int`, and takes no parameters or arguments, which we know because the brackets `()` are empty. A C++ program that is intended to be compiled into an executable (as opposed to a library; more on that later) must contain a function called `main`: this is the entry-point to our program. **`main` is the function which is executed when the program starts.**
+
+The signatures of other functions might look like:
+
+A function called `to_string` that takes an `int` called `x` and returns a `string`:
+
+```cpp
+string to_string(int x)
+```
+
+A function called `is_larger` that takes two `float`s and returns a `bool`:
+
+```cpp
+bool is_larger(float a, float b)
+```
+
+A function called `save_to_file` that takes a `vector` (a kind of array) of `int`s and has *no return value*, which we denote with `void`:
+
+```cpp
+void save_to_file(vector<int> my_data)
+```
+
+**The function body**
+
+The function body is simply all code within the curly braces. This is the code which defines what the function actually does, and what value(s) (if any) it returns. Let's look at the statement that prints "Hello World!":
+
+```cpp
+cout << "Hello World!\n";
+```
+
+First, `cout` is an example of what is called an *output stream*; `cout` will normally write to your terminal. (Other kinds of streams exist, including for reading and writing files.) In essence, `cout << string` is equivalent to `print(string)` in Python, although there are some nuances that we will return to when we discuss *operators* like `<<` in more detail. For now, all you need to know is that `cout << ...` is how we print to the standard output in C++.
+
+You'll notice the string that we want to print is surrounded by double-quotes `"`. These are used to write *string literals* in code, while single-quotes `'` are used to write *character literals* like `'a'`, `'/'` or `'@'`. Inside the string literal, we use the special *newline character* `'\n'` to make the program print a newline. Although we type this as two characters, a `\` and an `n`, C++ interprets this as a character with special meaning. See [Escape Characters](https://en.wikipedia.org/wiki/Escape_character) for more information.
+
+**The return statement**
+
+In our main function, the return statement is:
+```cpp
+return 0;
+```
+This statement is required in every function, except functions with a `void` return type (though even then a function can use `return;` without a value to end the function early). If we reach a `return` statement our function will terminate and our program will continue executing from the place where we called our function, even if there is more code after the `return` statement.
+
+This particular return statement is inside our `main` function, which expects to return an `int` to its calling code. Since this is the return value from the special function `main`, this value is interpreted by the operating system as a *status code*, a number between 0 and 255 that gives some information about the success or failure of the application (by convention, `0` means success).
+
+As in most common languages, functions are called like this:
+```cpp
+int main()
+{
+    string x_as_string = to_string(3);
+}
+```
+where the *return value* is, here, assigned to the variable `x_as_string`. We can call functions without using the return value by just not assigning the function call:
+```cpp
+int main()
+{
+    to_string(3);
+}
+```
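+Putting these pieces together, here is a short sketch you can compile and run yourself. (The body given for `is_larger` is illustrative; only its signature appeared above.)
+
+```cpp
+#include <iostream>
+
+using namespace std;
+
+// A definition matching the is_larger signature from above.
+bool is_larger(float a, float b)
+{
+    return a > b;
+}
+
+int main()
+{
+    // Call the function and store its return value.
+    bool result = is_larger(2.5f, 1.0f);
+
+    // A bool prints as 1 (true) or 0 (false) by default.
+    cout << "is_larger returned " << result << "\n";
+
+    return 0;
+}
+```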
+
+**Types**
+
+We've already come across `int`, `float`, `bool` and other keywords that define the *type* of variables and parameters. C++ is a *strongly-typed* and *statically-typed* language, so the *types* of every single variable or function parameter must be known at *compile time*, and those types cannot change during runtime.
+
+> **Compile time** refers to the time at which the code is compiled. It's used in contrast to **runtime** or **run time**, which refers to the time at which the program is run. For example, if a program requires a number to do some computation and you can write that number in the source code itself, that number is known at *compile time*. If, say, the user needs to input that number, the value is only known at *run time*.
+
+This is in contrast to *dynamically-typed* Python, where we can define a variable `x` and assign it the integer value `2`, then reassign a string `"2"` to the same variable!
+
+```python
+x = 2
+x = "2"
+```
+
+In C++ this kind of code will produce an error because, to reiterate:
+
+1. We must give all variables a type, and
+2. We cannot change the type of a variable
+
+Again, we'll explore later what types are available in C++ (and how we can create our own) but a useful initial list is:
+
+- `bool`: a boolean value, i.e. `true` or `false`
+- `int`: an integer value, e.g. `-4, 0, 100`
+- `float`: a 32-bit floating-point value, e.g. `-0.2, 0.0, 1.222, 2e-3`
+- `double`: a 64-bit floating-point value (same as `float` but can represent a greater range and precision of real numbers)
+- `char`: a single character, e.g. `'a', 'l', ';'`
+- `string`: a kind of list of characters, used to represent text.
+    - Not to be confused with a *character array*, which can be difficult to deal with.
+- `std::vector<T>`: a kind of array of elements of type `T`, e.g. `std::vector<int> {1, 100, -1}` declares a vector of integers.
+    - Not to be confused with a *mathematical* vector; this is similar to a Python `list`.
+    - `std::` means that the vector type is part of the **C++ Standard Library** namespace. If you have the line `using namespace std;` in your code then you don't need to use the `std::` prefix after that. Just bear in mind that this could cause name clashes if you declare another type or function which has the same name as something in the standard library!
+    - **N.B.** you need to add `#include <vector>` to the top of your program to use the vector type.
+
+## Basic control structures
+
+C++ contains many of the same control structures as other programming languages that you have used previously.
+
+### Conditional logic using `if`/`else` statements
+
+A true/false value is called a Boolean value, or `bool`. Conditional statements test the value of a Boolean value or expression and execute the following code block if it is `true`. (Remember that a code block is contained within curly braces `{}`, and can be as large as you like.)
+
+```cpp=
+// if statement with a Boolean variable
+if(condition)
+{
+    std::cout << "Condition was true!" << std::endl;
+}
+
+// if statement with a Boolean expression
+if(x < 10)
+{
+    std::cout << "x is too small." << std::endl;
+}
+
+// can also be a function which returns a bool!
+if(f(x))
+{
+    std::cout << "f(x) was true!" << std::endl;
+}
+```
+In the examples above, nothing will happen if the statement inside the brackets is not true.
+
+If you want something to happen when the statement is false, you can also use `else` and/or `else if` statements.
+
+```cpp=
+if(x < 10)
+{
+    std::cout << "x is small" << std::endl;
+}
+else if(x > 50)
+{
+    std::cout << "x is large" << std::endl;
+}
+else
+{
+    std::cout << "x is neither large nor small." << std::endl;
+}
+```
+
+### Loops (`for` and `while`)
+
+```cpp=
+for(unsigned int i = 0; i < 100; ++i)
+{
+    // loop code goes here
+}
+```
+- The brackets after the `for` set up three things:
+    - first we declare a variable, if any, that we want to use for the loop.
+    - next we have the loop condition; the loop continues while this is still true.
+    - finally we have a statement which should execute at the end of each loop iteration.
+    - In this case, we execute the loop 100 times, with `i` taking the values `0` to `99`.
+    - The variable `i` is available inside the loop.
+- `unsigned int` is a type for _unsigned integers_, which are integers that cannot be negative. (You may also see the shorthand `uint` in some code, but it is not standard C++ on every platform.) It's a good idea to use these for counting and other values which shouldn't be less than 0.
+- `++i` increments the value of `i` by 1.
+
+If we have a `vector` or similar container, we can loop over its elements without writing our own loop conditions:
+```cpp=
+#include <iostream>
+#include <vector>
+
+int main()
+{
+
+    std::vector<int> v(10); // declare a vector of ints with 10 elements.
+
+    for(int &x : v)
+    {
+        std::cout << x << std::endl;
+    }
+
+}
+```
+- The `for` loop iterates over the elements of `v`.
+- At each iteration, the variable `x` refers to the current element.
+- This is a good way to iterate over containers when we don't need to refer to indices explicitly, as it avoids possible programmer errors!
+
+`while` loops have simpler syntax than `for` loops; they depend only on a condition, and the code block executes over and over for as long as the condition remains true. This is useful for situations where the number of iterations is not clear from the outset, for example running an iterative method until some convergence criterion is met.
+
+```cpp=
+while( (x_new - x_old) > 0.1) // convergence criterion
+{
+    x_old = x_new;
+    x_new = f(x_old); //iteratively call function f on x until it converges
+}
+```
+
+## Programs With Multiple Files
+
+Like other programming languages, it is possible (and good practice!) to break up C++ programs into multiple files. C and C++ however have a slightly unusual approach to this compared to some other languages. Let's consider that we have two C++ files: our `main.cpp`, which contains our `main` function (the entry point for execution of our program), and another which defines some function that we want to use in `main`.
+
+**main.cpp:**
+```cpp=
+#include <iostream>
+
+int main()
+{
+    // Call function f
+    int x = f(5, 2);
+
+    std::cout << "x = " << x << std::endl; // endl stands for end line
+
+    return 0;
+}
+```
+
+**function.cpp:**
+```cpp=
+int f(int a, int b)
+{
+    return (a+2) * (b-3);
+}
+```
+
+We could try to just compile these files together:
+```
+g++ -o ftest main.cpp function.cpp
+```
+but this won't work!
+In order to compile `main.cpp`, we need to know some information about the function `f` that we are calling.
+
+Let's take a moment to understand C++'s compilation process a little better.
+- C++ is designed so that different source files can be compiled **independently, in parallel**.
+    - This helps to keep compilation times down.
+    - It also means that if we only change one part of the program, we only need to re-compile that part. The rest of the program doesn't need to be compiled again!
+- Remember though that C++ is _statically typed_, which means **the compiler needs to know the types of all variables and functions used in the code it is compiling, at compile time**. Otherwise it cannot check that the code you have written is correctly typed!
+    - Take for example the statement `int x = f(5, 2);` in `main.cpp`. In order for this to be correctly typed, we need to know that `f` can accept two numbers as its arguments, and it must return an integer, because `x` is declared to be an `int`. If we don't know the type of `f`, we can't be sure that this is true!
+
+Let's use this simple example program to explore how the compiler deals with our code, and what information it needs to do its job.
+
+### Code Order and Declarations
+
+We'll start with a single file and work towards a multiple file version. Consider the following two versions of the same program:
+
+**Version 1**
+```cpp=
+#include <iostream>
+
+int f(int a, int b)
+{
+    return (a + 2) * (b - 3);
+}
+
+int main()
+{
+    int x = f(5, 2);
+
+    std::cout << "x = " << x << std::endl;
+
+    return 0;
+}
+
+```
+
+**Version 2**
+```cpp=
+#include <iostream>
+
+int main()
+{
+    int x = f(5, 2);
+
+    std::cout << "x = " << x << std::endl;
+
+    return 0;
+}
+
+int f(int a, int b)
+{
+    return (a + 2) * (b - 3);
+}
+```
+
+Only the first of these two programs will compile!
+- **C++ will parse your file in order**, and so in the second version it comes across the function `f` _before_ it has been defined. The compiler doesn't know what to do! It can't know what `f` is supposed to be, or whether this is a valid, type-safe statement.
+- C++ does **not** need to know everything about `f` ahead of time though; it just needs to know _what_ it is and what its type is. This is the job of a **forward declaration**: something that tells us what the type of the symbol is without telling us exactly what it does. In this case, this would be a **function declaration**, but we can also have declarations for other things in C++, as we shall see later on in the course.
+
+**With a function declaration:**
+```cpp=
+#include <iostream>
+
+// Function declaration for f
+int f(int a, int b);
+
+int main()
+{
+    int x = f(5, 2);
+
+    std::cout << "x = " << x << std::endl;
+
+    return 0;
+}
+
+int f(int a, int b)
+{
+    return (a + 2) * (b - 3);
+}
+```
+
+- Line 4 is the **function declaration**.
+    - This defines the name `f` as a function that will be used in this program.
+    - It tells us that `f` takes two `int` arguments, and returns an `int`.
+    - It does not define _what_ `f` will do or how its output is calculated. That can happen later!
+    - It's worth knowing that the names `a` and `b` aren't required in a declaration; since we're not defining the behaviour here, we don't need to be able to refer to the arguments individually. `int f(int, int);` is an equally valid function declaration. Nevertheless, we usually include argument names in declarations because it makes them easier to understand and use, especially if the arguments have informative names!
+- Lines 15-18 are the **function definition**.
+    - This contains the actual code which is executed when the function is called. The compiler doesn't need to know how the function `f` works in order to compile `main`, because it knows that the types are correct, but in order to finish building the program it will need a definition for `f`!
+
+This program _will_ compile, because the compiler knows when it reaches `main` that `f` is a symbol which stands for a function which takes two `int`s and returns an `int`. This means that it can deduce that `int x = f(5, 2);` is a valid statement. When it reaches the definition of `f` at line 15, it is then able to create the code for that function.
+
+This might seem like a rather pointless thing to do in a program as trivial as this, but it's a very important step towards writing programs in multiple files.
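+As an aside, it is worth seeing what happens when the compiler hits an undeclared function. If you try to compile Version 2, g++ will stop at the first use of `f`; the exact wording, file name, and line numbers depend on your compiler and setup, but the error will look something like:
+
+```
+main.cpp: In function 'int main()':
+main.cpp:5:13: error: 'f' was not declared in this scope
+```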
+
+Now that we know that we can write function declarations, we can move the function definition to a different file, and compile both files separately.
+
+**main.cpp**:
+```cpp=
+#include <iostream>
+
+int f(int a, int b);
+
+int main()
+{
+    int x = f(5, 2);
+
+    std::cout << "x = " << x << std::endl;
+
+    return 0;
+}
+```
+
+**function.cpp**:
+```cpp=
+int f(int a, int b)
+{
+    return (a + 2) * (b - 3);
+}
+```
+
+To compile these files separately we can use the `-c` flag:
+```
+g++ -c main.cpp
+```
+will compile the code for `main` into a `.o` file (`main.o`), known as an **object file**. We can compile `function.cpp` into an object file, `function.o`, in the same way.
+```
+g++ -c function.cpp
+```
+- **Object files** are code which has been compiled but which only forms a partial program. We can't execute `main.o` because the definition of `f` is missing from `main.cpp`! The program wouldn't know what to do when it reaches `f`.
+- In order to create an executable which we can run, we need to **link** the object files, using the **linker**. This gives the compiler the definitions of all the functions it needs, so it can create the machine commands to jump to the executable code for `f` whenever it is called, and jump back when it has finished.
+
+```
+g++ -o test_f main.o function.o
+```
+This command will produce an executable, `test_f`, by linking the two object files `main.o` and `function.o`. In the final compiled executable, the code from `function.o` is run when `f` is called in `main`.
+
+For a simple project like this, we can compile an executable in one step by providing both source files to the compiler at the same time:
+```
+g++ -o test_f main.cpp function.cpp
+```
+
+### Header Files
+
+Forward declarations for functions are helpful, but they can still clutter up our code if we are making use of large numbers of functions! Instead, we put these declarations in **header files**, which usually end in `.h` or `.hpp`. We use `#include` to add header files to a `.cpp` file: this allows the file to get the declaration from the header file. The definitions are not kept in the header file; they are in a separate `.cpp` file.
+
+In this case the files look as follows:
+
+**function.h**:
+```cpp=
+int f(int a, int b); // function declaration
+```
+
+**function.cpp**:
+```cpp=
+int f(int a, int b)
+{
+    return (a + 2) * (b - 3);
+}
+```
+
+**main.cpp**:
+```cpp=
+#include <iostream>
+#include "function.h" // include our header file with the declaration
+
+int main()
+{
+    int x = f(5, 2);
+
+    std::cout << "x = " << x << std::endl;
+
+    return 0;
+}
+```
+
+You can compile as before, if your include file is in the same folder:
+```
+g++ -o test_f main.cpp function.cpp
+```
+
+If your include file is in a different folder, you need to tell the compiler where to find it using the `-I` option:
+```
+g++ -o test_f main.cpp function.cpp -Iinclude_folder/
+```
+
+
+
+## Useful References
+
+- I'd highly recommend Bjarne Stroustrup's _A Tour of C++_. This comes in many different editions, covering different standards of the language, so try to use one from `C++17` onwards! This is available online from the UCL library services.
+- The [C++ core guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines) are a fantastic resource for learning more about writing good quality C++ code, and some of the nitty gritty details of the language that can easily trip people up. It's rather dense, so it's best to use it to search for answers to questions you already have, rather than trying to read it through from cover to cover!
+- The [Google C++ style guide](https://google.github.io/styleguide/cppguide.html) is also interesting, and offers good explanations for their choices. If you do use this as a resource, **don't simply adopt their practices, but read their reasons first!** Some of their reasons will absolutely not apply to you and the projects that you work on, so make sure that you're always making informed choices.
diff --git a/02cpp1/index.md b/02cpp1/index.md
index 3182804c6..95ec89a05 100644
--- a/02cpp1/index.md
+++ b/02cpp1/index.md
@@ -1,26 +1,28 @@
 ---
-title: "Week 2: Modern C++ (1)"
+title: "Week 2: Custom Data Types and (a glimpse of) the Standard Library"
 ---
 
 ## Week 2: Overview
 
 ### The Story So Far
 
-* Git
-* CMake
+* Intro to C++
+* Compiling with g++ and CMake
+* Variables
+* Functions
+* Loops and Conditionals
+* Console Output
 
-### Todays Lesson
+### This Week
 
-Today we'll be covering come core features of the C++ language, in particular aspects of the standard library. We won't be covering the very basics of the language, so if you are unfamiliar with things like variables, loops, and functions in C++ you should look at some introductory C++ material first (for example the LinkedIn Learning course recommended in the Moodle).
+This week we'll be furthering our knowledge of core features of the C++ language and how these relate to clear and efficient code design.
 
 This week will cover:
 
-[Common Standard Library Features](./sec01StandardLibrary.html)
+[Type Systems](./sec01Types.html)
 
-- Input / Output
-- Containers & Iterators
-- Algorithms & Anonymous Functions
-- Understanding C++ Documentation
+- A quick introduction to what types are, and how they are used in C++ in particular.
+- Options for user-defined types.
 
 [Passing by Value and by Reference](./sec02PassByValueOrReference.html)
 
@@ -28,9 +30,21 @@ This week will cover:
 
 - Exracting values from functions by return statements or references
 - Usage of the `const` keyword
 
-[Pointers](./sec03Pointers.html)
+[Understanding Classes and Objects](./sec03ObjectOrientedProgramming.html)
+
+- Classes as user-defined types.
+- Objects as instances of classes.
+- Access Specifiers
+- Inheritance and Introductory Polymorphism
+
+[Common Standard Library Features](./sec04StandardLibrary.html)
+
+- Containers & Iterators
+- Algorithms & Anonymous Functions
+- Understanding C++ Documentation
+
+[Pointers](./sec05Pointers.html)
 
-- Memory in C++
-- Data ownership models
-- Smart pointers for data ownership
-- Raw pointers
+- Unique, shared, weak, and raw pointers
+- Introductory memory management
+- Concepts of data ownership
\ No newline at end of file
diff --git a/02cpp1/sec01Types.md b/02cpp1/sec01Types.md
new file mode 100644
index 000000000..8674cffe8
--- /dev/null
+++ b/02cpp1/sec01Types.md
@@ -0,0 +1,80 @@
+---
+title: Types
+---
+
+# Types in C++
+
+In this section we will cover some core ideas that you will need to understand to program effectively in C++ and similar languages. We will assume that you have some programming experience, although not necessarily in C++, with the expectation that Python is the most commonly known language. As a result, a few things may need to be explained before proceeding with writing C++ code.
+
+## Type Systems
+
+Type systems are an enormous topic, but we should understand a little bit about types in order to know how to program in different languages. Almost every high-level programming language is typed in some way.
+
+- Types denote what kind of data a variable represents. All data in a computer is just `1`s and `0`s, so we need to know how to _interpret_ data in order to use it. Types can make this process easier, since we don't have to manually remember what each memory location is supposed to be representing and enforce that it is treated appropriately.
+    - Consider for example how to print some data to the screen. The same sequence of bits at a memory location has a very different meaning if it's an integer or a series of characters.
+- An important property in many languages is **type safety**. This is a property of programs that essentially tells us that we are never using data meant to represent one type in a place where data of another type is expected. For example, we are never passing a string to a function which is meant to manipulate integers. This eliminates a large class of bugs!
+- Programming languages may be statically or dynamically typed.
+    - In a **statically typed** language the type of a variable is decided at the declaration and it cannot be changed. To change a variable from one type to another requires another variable to be declared and an appropriate conversion to be defined. This is how typing works in C++, C, Rust, and many others.
+    - In a **dynamically typed** language the type of a variable can change throughout the program run-time. (Remember that a _variable_ is a handle, generally for some _data_ which resides in memory.) Instead of applying a type to the variable, the data itself is tagged with a type. A variable can have its data changed, including its type, for example from an `int` to a `string`. This is how typing works in a language like Python, Javascript, or Julia.
+
+### C++ Types
+
+Types in C++ are static, and type correctness is checked at _compile time_. This means that if your program compiles correctly, you will not encounter type errors at runtime.*
+
+Although dynamically typed programming languages like Python will prevent poorly typed statements from executing at runtime, there is no way to know whether your program contains type errors until you crash into one. Part of the problem is that functions can return different types depending on the input or program state, meaning that you can't necessarily be sure that the thing that you think is calculating an integer is definitely going to give you an integer every time. Variables may also have their type changed by side effects after being passed to a function in a dynamically typed language.
+- These problems are not uncommon: conversion of a variable from an integer type to a floating point type under some circumstances is easy to do in Python. Floating point and integer types are interchangeable in most Python code but behave differently (integer arithmetic is exact while floating point is approximate, for example), and you may not discover the conversion has happened until you try to use the variable somewhere that a float cannot be used, such as indexing an array. Because the conversion is silent and valid in a dynamically typed language, it can be extremely hard to find _where_ the conversion happened in a large program, as the problem could originate a long way away from where the type error gets raised!
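+To make the contrast concrete, here is a small sketch (illustrative code, not from the course materials; `square` is a made-up helper) of the kind of mismatch that C++ rejects before the program ever runs:
+
+```cpp
+#include <string>
+
+int square(int x)
+{
+    return x * x;
+}
+
+int main()
+{
+    std::string s = "2";
+    int y = square(2); // fine: the types match
+    // square(s);      // compile-time error if uncommented: no conversion from std::string to int
+    return 0;
+}
+```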
+Type systems can be leveraged to ensure many kinds of safety properties in programs because information can be built into custom types. Examples of this might be ensuring at compile time that matrices in matrix multiplications are compatible (recall that an $X \times Y$ matrix can be multiplied by a $Y \times Z$ matrix), or that physical dimensions are consistent (e.g. a velocity has units $\text{Length} \times \text{Time}^{-1}$, so $v = \frac{d}{t}$ is a valid expression but $v = \frac{d^2}{t^2}$ is not).
+
+> \* Technically C++, like a number of other languages, has some features which are not type-safe. It is possible in C++ to subvert the type system by using some low level memory operations, but there is almost never a reason to do this, so you're unlikely to see this in practice and you shouldn't do it in your own code. Attempting to do this kind of manipulation usually results in **undefined behaviour**, so you won't necessarily even be able to predict what your program will do unless you know exactly how your compiler turns your source into machine code!
+
+## C++ Types and Declaring Variables
+
+In C++ when declaring variables we do so by first declaring the type, then the name of the variable, and then its value. For example:
+
+``` cpp
+int x = 5;
+```
+
+- This declares a variable `x` of type `int` with value `5`.
+- Some types have default initial values, which would mean that you don't have to supply the value explicitly.
+- Some types can be declared _uninitialised_, which means that the memory for that variable is reserved but not initialised. It will contain whatever bits were already there! It's a good idea to initialise variables explicitly.
+
+Types in C++ can sometimes be verbose or complicated, and it is sometimes easier to read and write code which makes use of the `auto` keyword. This keyword tells the C++ compiler to deduce the type for us. This is called _type inference_ and was made a feature of C++ in C++11, so will be absent in older code.
+
+``` cpp
+auto x = 5;
+auto y = std::abs(-13.4);
+```
+
+- `auto` can usually deduce the type from the result of an expression by looking at the types of all the variables and functions within it.
+    - For example here it interprets `5` as an integer and therefore deduces that `x` is an int.
+    - It will deduce that `y` is a `double`, since `-13.4` must be a floating point type (`double` by default) and `std::abs` returns a `double` when given a `double`.
+    - Be especially careful when values can be ambiguous. Here `5` is being assigned as an int, but `5` is also a valid `float` or `double`. **If you want a specific type in cases like this you should always specify it explicitly**.
+- `auto` doesn't always work: the compiler must be able to deduce the type from contextual information.
+    - You cannot declare an uninitialised variable with `auto`, e.g. `auto z;` will lead to a compiler error as it won't know what type is intended for `z`, even if `z` is later assigned.
+- You should not use `auto` when declaring the return types or parameter types of ordinary functions; declare these types explicitly. (Newer standards do permit some deduction here, but explicit types keep your interfaces much clearer.)
+    - It's generally a good idea therefore to know what the types of variables in your code are, even if you choose to use the `auto` keyword! This will make writing your own functions, and knowing what functions you can pass your data to, much easier.
+- In an IDE like VSCode you can inspect the type of a variable by hovering the mouse over that variable. If you've used `auto` it will be able to tell you what type has been deduced for it.
+- Bear in mind that `auto` can make your code more concise, but can also make your code harder to understand. Sometimes it's better to write your type explicitly so that people reading your code can immediately understand what the types of your variables are.
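+As a quick sketch you can experiment with (illustrative, not from the notes), the deduced types are noted in the comments:
+
+```cpp
+#include <cmath>
+#include <string>
+
+int main()
+{
+    auto i = 5;                    // deduced as int
+    auto d = std::abs(-13.4);      // deduced as double
+    auto f = 5.0f;                 // the f suffix makes this a float
+    auto s = std::string("hello"); // deduced as std::string
+    // auto z;                     // error: nothing to deduce the type from
+    return 0;
+}
+```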
+
+## Defining Custom Types
+
+Custom types are an important feature in typed languages in order to be able to represent and manipulate more complex data in a type-safe way. In C++ the most common way to define a new type is to declare a `class` (or equivalently a `struct`). Classes are a common feature of **Object Oriented Programming**, which is a popular approach to programming in C++. (Some examples of other languages with classes for object oriented programming are C#, Java, and Python.) We'll discuss the design and use of classes in the next section, so for now let the following suffice:
+- A class is a custom data type which is defined by the programmer. It can contain any number of variables and functions.
+- Once it is defined it can be used like any other type, e.g. it can be accepted as an argument in, or returned from, a function. Type safety rules still apply.
+- Classes give us a way of defining sub-types which are _substitutable_. For example we can define a `Shape` type, and then have `Circle` and `Square` sub-types which are accepted by the type system anywhere where a `Shape` type is accepted. This makes our type system more flexible and expressive. We discuss classes in detail in a later section of this week's notes.
+
+We will focus overwhelmingly on classes as our means of defining custom types, but for those who are interested there are two further ways of declaring custom types in C++ (a short `enum class` sketch follows after this list):
+- `enum`: This stands for _enumeration_. An `enum` is a type which can take one of a finite set of values (i.e. the values are _enumerable_). Each of these values must have a name, for example let's say we want a `Colour` enum which can take the values `red`, `green`, and `blue`. We can declare a new `enum` called `Colour` in two ways:
+    - `enum Colour {red, green, blue};`. This kind of enum implicitly converts the values `red`, `green`, and `blue`, to `0`, `1`, and `2` respectively, and the `Colour` type can be used interchangeably with `int`.
+        - Because this type of `enum` is interchangeable with `int`, it can be used to e.g. index an array. This can be useful when you want to efficiently store data based on categorisations. For example, say you have data about some population, split up by gender and age group. By turning your gender categories and age groups into enums, you can then store your data as a matrix which is indexed like `data[gender][age_group]`.
+        - For this kind of enum we can just reference these values using the names `red`, `green`, and `blue`.
+    - `enum class Colour {red, green, blue};`. This kind of enum (called an `enum class`) cannot be used interchangeably with `int`, and therefore `Colour` can only be used in places that are explicitly expecting a `Colour` type. **We usually want to use an `enum class` so that we don't accidentally mix it up with integer types!**
+        - This cannot be used to index arrays (because it is not an int), but it can be used as a key in `map` types. `map` and `unordered_map` provide C++ equivalents to Python's dictionary type.
+        - In order to use these values we have to also include the class name, so we have to write `Colour::red`, `Colour::green`, or `Colour::blue`.
+- `union`: Union types are types which represent a value which is one of a finite set of types. A `union` is declared with a list of members of different types, for example `union IntOrString { int i; string s; };` can store an `int` or a `string`. When a variable of type `IntOrString` is declared, it is only allocated enough memory to store _one_ of its members at a time, so it cannot store both `i` and `s` at the same time. The programmer needs to manually keep track of which type is present, often using an auxiliary variable, in order to safely use union types. Given this additional difficulty, **I wouldn't recommend using union types without a very strong reason.**
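+Here is the promised `enum class` sketch (an illustration to try yourself, not code from the course materials):
+
+```cpp
+#include <iostream>
+
+// A scoped enum: Colour is not interchangeable with int.
+enum class Colour {red, green, blue};
+
+int main()
+{
+    Colour c = Colour::green;
+
+    // We must compare against named values; `c == 1` would not compile.
+    if(c == Colour::green)
+    {
+        std::cout << "c is green" << std::endl;
+    }
+
+    // int i = c;               // error: no implicit conversion to int
+    int i = static_cast<int>(c); // explicit conversion is allowed; this gives 1
+    std::cout << "underlying value: " << i << std::endl;
+
+    return 0;
+}
+```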
+
+Microsoft has excellent, and accessible, resources on [`enum`](https://learn.microsoft.com/en-us/cpp/cpp/enumerations-cpp?view=msvc-170) and [`union`](https://learn.microsoft.com/en-us/cpp/cpp/unions?view=msvc-170) types if you are interested in learning more about them.
+
+**N.B.** C++17 onwards also has a special class called `std::variant` which is designed to replace `union` types in a more type-safe way, because the `variant` can be checked to see which type it is currently holding. (That said, checking which type the variant holds is still rather clunky: you have to check for each type manually, so if there are many cases it can be easy to miss one and the compiler will not warn you!) Ultimately, union / variant types are not terribly common in C++ code in practice, although some languages (especially functional languages like ML and Haskell) handle these concepts much more naturally. If you're interested in this kind of approach to types, I recommend reading up on [algebraic datatypes](https://en.wikipedia.org/wiki/Algebraic_data_type).
diff --git a/02cpp1/sec02PassByValueOrReference.md b/02cpp1/sec02PassByValueOrReference.md
index 704f66720..1da826948 100644
--- a/02cpp1/sec02PassByValueOrReference.md
+++ b/02cpp1/sec02PassByValueOrReference.md
@@ -5,12 +5,50 @@ title: Pass by Value and Pass by Reference
 Estimated Reading Time: 15 minutes
 
 # Pass by Value and Pass by Reference
+
+Variables are handles for data, which needs to be stored in memory somewhere so that it can be read and modified.
 
-Variables can be passed to functions in two ways in C++: "pass by value" and "pass by reference". This choice significantly alters both the behaviour and the performance of a function.
+In C++ we can find out the memory address of a given variable using the `&` operator, as in the following code snippet:
+
+```cpp
+#include <iostream>
+
+int main()
+{
+    int x = 15;
+    std::cout << "x is stored at address " << &x << " and has value " << x << std::endl;
+    x += 1; // Increment the value of x by 1.
+    std::cout << "x is stored at address " << &x << " and has value " << x << std::endl;
+    return 0;
+}
+```
+- Declaring `int x = 15` tells the compiler that we will need to store some data which is to be interpreted as an integer with an initial value of 15. The compiler then knows:
+    - How much memory to allocate for the variable (for an `int` this is normally 4 bytes).
+    - How the variable can be used in the remainder of the program (see the notes on types for more information).
+    - To set the data at that memory location to be equal to `15` at this point in the program.
+    - From now on, in this scope, `x` will refer to the value stored at this memory location, and changes to `x` will lead to changes to the data stored there.
+- `&x` gives us the _address_ in memory where the value for `x` is stored. An address is a unique (usually 8-byte) numeric value for each memory location where we can store data. Changes to the value of `x` do not change its address: **the address of a variable cannot be changed**.
+- The compiler will select an unoccupied memory address to store the variable in for you.
+
+The output of this program is
+```
+x is stored at address 0x7ffe13999ed4 and has value 15
+x is stored at address 0x7ffe13999ed4 and has value 16
+```
+- The value of the address is given in [hexadecimal](https://en.wikipedia.org/wiki/Hexadecimal), which is prefixed by `0x` when printed.
+- The exact address will be system dependent; note that between the two print statements the address does not change, but the value does.
+
+Given this, we might wonder what it means to "pass a variable to a function". Does the function need the _value_ that the variable represents, or the _location_ where the variable is stored?
+
+There are broadly two approaches to giving functions access to variables from the calling scope:
+1. The function has a _copy_ of the variable's _value_ in a new variable, independent from the original. This means read/write operations to this variable in the function operate on a separate memory location which does not affect the original. Once we return from the function scope, the original variable is unchanged because the function did not have access to the memory location where it is stored.
+2. The function has a _reference_ to the variable, which is to say it is given the memory location where the variable is stored. Any read/write operations are made to this same location, and therefore when we leave the function scope any changes made to the variable inside the function will still remain.
+
+Programming languages vary in their approaches to passing variables, but C++ gives us the choice when we define a function to pass each variable in either way: the first is called **pass by value** and the second is **pass by reference**. This choice significantly alters both the behaviour and the performance of a function in ways that we shall detail in the following sections.
 
 ## Pass by Value
 
-Pass by Value means that we copy the value of the variable we want to pass into the function, and the function works on this copy and leaves the original alone. Any changes that the function makes to the variable will not affect the value of that variable once you leave the function's scope. To pass by value you simply write the type and name of the variable in the function parameters in the usual way.
+Pass by Value means that we copy the value of the variable we want to pass into the function, and the function works on this copy and leaves the original alone. Any changes that the function makes to the variable will not affect the value of that variable once you leave the function's scope. In C++, passing by value is the default, so to pass by value you simply write the type and name of the variable in the function parameters in the usual way.
 
 ```cpp
 int add(int a, int b)
@@ -18,16 +56,75 @@ int add(int a, int b)
 {
     return (a + b);
 }
 ```
-This function can be safer, but is not time or memory efficient if variables are complex or large in size. You should only use pass by value for large pieces of data if you need an explicit copy made to work on locally in the function body.
+This function can be safer, but is not time or memory efficient if variables are complex or large in size, as the values need to be copied to new memory locations. You should only use pass by value for large pieces of data when you need an explicit copy to work on, and change, locally in the function body, without allowing the function to change the original.
+
+We can see pass by value in action explicitly in the following code example by checking the addresses and values of the variables:
+```cpp
+#include <iostream>
+
+using namespace std;
+
+int add(int a, int b)
+{
+    cout << "In add function, before adding." << endl;
+    cout << "a is stored at address " << &a << " with value " << a << endl;
+    cout << "b is stored at address " << &b << " with value " << b << endl;
+
+    a = (a + b);
+
+    cout << "In add function, after adding." << endl;
+    cout << "a is stored at address " << &a << " with value " << a << endl;
+    cout << "b is stored at address " << &b << " with value " << b << endl;
+
+    return a;
+}
+
+int main()
+{
+    int x = 12;
+    int y = 9;
+
+    cout << "Before add function." << endl;
+    cout << "x is stored at address " << &x << " with value " << x << endl;
+    cout << "y is stored at address " << &y << " with value " << y << endl;
+
+    int z = add(x, y);
+
+    cout << "After add function." << endl;
+    cout << "x is stored at address " << &x << " with value " << x << endl;
+    cout << "y is stored at address " << &y << " with value " << y << endl;
+    cout << "z is stored at address " << &z << " with value " << z << endl;
+
+    return 0;
+}
+```
+Outputs:
+```
+Before add function.
+x is stored at address 0x7ffee8b8b32c with value 12
+y is stored at address 0x7ffee8b8b330 with value 9
+In add function, before adding.
+a is stored at address 0x7ffee8b8b30c with value 12
+b is stored at address 0x7ffee8b8b308 with value 9
+In add function, after adding.
+a is stored at address 0x7ffee8b8b30c with value 21
+b is stored at address 0x7ffee8b8b308 with value 9
+After add function.
+x is stored at address 0x7ffee8b8b32c with value 12
+y is stored at address 0x7ffee8b8b330 with value 9
+z is stored at address 0x7ffee8b8b334 with value 21
+```
+- Note that the variables `x` and `y` are passed to the `add` function to serve as arguments `a` and `b` respectively.
+- In the function we can see that `a` and `b` are variables which are stored at separate memory locations from `x` and `y`. When we modify `a = a + b` the value stored for `a` changes, but `x` and `y` do not, because their memory locations were untouched.
 
 ## Pass by Reference
 
 Pass by Reference means that we tell the function where the original variable has been stored in memory, and we allow the function to work directly with that original variable. This has two major consequences:
 
-- We only pass a memory address -- 4 bytes, the same size as an integer -- to the function, so there is no additional memory allocated to copy the object.
+- We only pass a memory address -- usually 8 bytes in a 64-bit system -- to the function, so there is no additional memory allocated to copy the object.
 - The original variable can be changed by the function, and so any changes that happen within the function are retained after we leave the function's scope.
 
-We indicate that we want a reference to a variable using the `&` operator. The function below will take a reference to an integer and increment that integer by one. Because we have changed the value stored at that memory location, once we leave this function the variable that we passed to it will retain this increased value.
+We indicate that we want a reference to a variable using the `&` symbol after the type of the argument in the function signature. The function below will take a reference to an integer and increment that integer by one. Because we have changed the value stored at that memory location, once we leave this function the variable that we passed to it will retain this increased value.
The function below will take a reference to an integer and increment that integer by one. Because we have changed the value stored at that memory location, once we leave this function the variable that we passed to it will retain this increased value. ```cpp void increment(int &x) @@ -35,12 +132,77 @@ void increment(int &x) x = x + 1; } ``` +- Even though we have passed `x` by reference, `x` is just an ordinary `int` variable; this is only telling the compiler to give the function access to the original memory address instead of copying the value to a new one. So inside the function we can just use `x` as normal. + - A reference just means an alias (i.e. a new name) for an existing variable, and therefore has the same type as the existing variable. +- It can be a bit confusing that the notation for a reference is the same as the address operator (`&`). These two uses of `&` come up in different contexts though: + - When it is used in a type context, i.e. follows a typename, it means a _reference to a variable of that type_. + - When it is used in an expression which evaluates to a value, e.g. `cout << &var << endl`, then it is the address operator and it evaluates to the memory address of the variable to which it is affixed. + + +**Passing by reference can save significant time and memory by avoiding making needless copies of variables**, but at the cost of making variables potentially vulnerable to being changed by a function. This can make it harder for someone using the function to reason about the program, and what the value of the variables they pass in will be once the function has finished. + +We can once again illustrate this explicitly with addresses by re-using our example above, but changing the `add` function to take `a` and `b` by reference (`int &a` and `int &b`): +```cpp +#include <iostream> + +using namespace std; + +int add(int &a, int &b) +{ + cout << "In add function, before adding." << endl; + cout << "a is stored at address " << &a << " with value " << a << endl; + cout << "b is stored at address " << &b << " with value " << b << endl; -Passing by reference can save significant time and memory by avoiding making needless copies of variables, but at the cost of making variables potentially vulnerable to being changed by a function. This makes it harder for someone using the function to reason about the program, and what the value of the variables they pass in will be once the function has finished. + a = (a + b); + + cout << "In add function, after adding." << endl; + cout << "a is stored at address " << &a << " with value " << a << endl; + cout << "b is stored at address " << &b << " with value " << b << endl; + + return a; +} + +int main() +{ + int x = 12; + int y = 9; + + cout << "Before add function." << endl; + cout << "x is stored at address " << &x << " with value " << x << endl; + cout << "y is stored at address " << &y << " with value " << y << endl; + + int z = add(x, y); + + cout << "After add function." << endl; + cout << "x is stored at address " << &x << " with value " << x << endl; + cout << "y is stored at address " << &y << " with value " << y << endl; + cout << "z is stored at address " << &z << " with value " << z << endl; + + return 0; +} +``` +Yielding: +``` +Before add function. +x is stored at address 0x7ffdd308656c with value 12 +y is stored at address 0x7ffdd3086570 with value 9 +In add function, before adding. +a is stored at address 0x7ffdd308656c with value 12 +b is stored at address 0x7ffdd3086570 with value 9 +In add function, after adding.
+a is stored at address 0x7ffdd308656c with value 21 +b is stored at address 0x7ffdd3086570 with value 9 +After add function. +x is stored at address 0x7ffdd308656c with value 21 +y is stored at address 0x7ffdd3086570 with value 9 +z is stored at address 0x7ffdd3086574 with value 21 +``` +- Note that now `a` has the same address as `x`, and `b` has the same address as `y`. +- When `a` is updated, the value at its memory location is changed, so after the `add` function call we can see that the value of `x` has changed as well. ## Using `const` in Pass By Reference -We can retain the performance advantages of pass by reference and still protect our variables from changes by passing a const reference. +We can retain the performance advantages of pass by reference and **still protect our variables from changes** by passing a `const` reference. ```cpp void constRefExample(int const &x) @@ -49,18 +211,85 @@ void constRefExample(int const &x) } ``` -The declaration `int const &x` means that `x` is a reference (`&`) to a constant (`const`) integer (`int`). This means that the integer value cannot be changed, and so any attempt to change the value of `x` in the function will lead to a compiler error. +The declaration `int const &x` means that `x` is a reference (`&`) to a constant (`const`) integer (`int`). This means that the integer value cannot be changed within this function, and so any attempt to change the value of `x` in the function will lead to a compiler error. -## Using references for output variables +Try writing a function where you pass an argument by const reference and try to modify it inside the function. Take note of what the compiler error looks like! -When we use a `return` statement in a function, we are also passing by value, and a copy of the variable that we are returning is made. Just like with inputs to a function, this can be a performance issue if the data that we want to output is large. There are however a few things to note about the efficiency of `return` statements: +## Return Values -- Objects are copied using their _copy constructor_, a special function in their class definition which defines how to create a new object and copy the current object's data. (This may be automatically created by the compiler.) -- Some objects also have a _move constructor_ defined, in which the data is not explicitly copied, but a new object takes control of the data. We'll return to this idea when we talk about pointers in the next section! (The move constructor may also be automatically created by the compiler.) -- For such classes an object can be returned without making a copy, since the compiler knows that the original object will immediately go out of scope (and then be destroyed) and can therefore have its data moved without any conflict over which object owns the data. (This is why this optimisation can be used when returning a value but _not_ when passing an object to a function by value: when passing an object to a function the original object will continue to exist.) -- **The compiler will use a move constructor whenever available, and a copy constructor when not.** Therefore, you may find that returning values is more performant than you expect from the size of the data-structure. +When we use a `return` statement in a function, we are also passing by value, although a copy of the variable is not necessarily always made. As with inputs to a function, this can be a performance issue if large output data ends up being copied. 
There are however a few things to note about the efficiency of `return` statements: +- Objects are copied using their _copy constructor_, a special function in their class definition which defines how to create a new object and copy the current object's data. (In many cases this can be automatically created by the compiler.) +- Some objects also have a _move constructor_ defined, in which the data is not explicitly copied, but a new object takes control of the data. We'll return to this idea when we talk about pointers later in the course. (The move constructor may also be automatically created by the compiler.) +- Normally when a variable goes out of scope its memory is freed and can be reallocated to new variables. If we have a _local variable_ in the function scope that we want to return, we can't just give the address of the data (return by reference) because when the function returns the variable will go out of scope and that memory is freed. + - Although return types can be references, e.g. `int& someFunction()`, you have to be absolutely certain that the memory you are referencing will remain in scope. This could be e.g. a global variable, or a member of a class for an object which continues to exist. It should _never_ be a variable created locally in that function scope. Don't use reference return types unless you are really confident that you know what you are doing! +- For classes with a move constructor a local object can be returned without making a copy, since the compiler knows that the object is about to be destroyed as soon as the function returns, and can therefore have its data transferred instead. (This is why this optimisation can be used when returning a value but _not_ when passing an object to a function by value: when passing an object to a function the original object will continue to exist.) +- **The compiler will use a move constructor when available if the object is deemed large enough for the move to be more efficient than a copy, and a copy constructor when not.** Therefore, you may find that returning values is more performant than you expect from the size of the data-structure.
+ +We can see this move or copy behaviour for return values in the following code example: +```cpp +#include <iostream> + +using namespace std; + +class Obj +{ + public: + int a, b, c, d, e, f, g, h; + + Obj(int a, int b, int c, int d, int e, int f, int g, int h) : a(a), b(b), c(c), d(d), e(e), f(f), g(g), h(h) {} +}; + +Obj makeObj() +{ + Obj myObj(1, 2, 3, 4, 5, 6, 7, 8); + cout << "In makeObj, myObj is at address " << &myObj << endl; + cout << "myObj.a is at " << &myObj.a << endl; + cout << "myObj.b is at " << &myObj.b << endl; + return myObj; +} + +int makeInt() +{ + int x = 5; + cout << "In makeInt, x is at address " << &x << endl; + return x; +} + +int main() +{ + Obj newObj = makeObj(); + cout << "Outside the function, newObj is at address " << &newObj << endl; + cout << "newObj.a is at " << &newObj.a << endl; + cout << "newObj.b is at " << &newObj.b << endl; + + int y = makeInt(); + cout << "Outside the function, y is at address " << &y << endl; + + return 0; +} +``` +Which yields the output: +``` +In makeObj, myObj is at address 0x7ffdb4b6e0c0 +myObj.a is at 0x7ffdb4b6e0c0 +myObj.b is at 0x7ffdb4b6e0c4 +Outside the function, newObj is at address 0x7ffdb4b6e0c0 +newObj.a is at 0x7ffdb4b6e0c0 +newObj.b is at 0x7ffdb4b6e0c4 +In makeInt, x is at address 0x7ffdb4b6e094 +Outside the function, y is at address 0x7ffdb4b6e0bc +``` +- `Obj` is a large data type which contains 8 `int` values. (You'll see how to define these custom data types, called classes, in section 2.) +- In `Obj` there is no explicit copy or move constructor; these are implicitly filled in by the compiler for simple types like this. +- Because `Obj` is large, the `makeObj` function returns the object using **move**. We can see that `myObj` and `newObj` have the same address. The control of this memory is moved from `myObj` to `newObj`, so when `myObj` goes out of scope and is destroyed the memory remains active and under the control of the new variable. +- `makeInt` is identical in structure, but only returns a single `int`. There's no move defined for `int` because it is already so small; we can see that `x` and `y` have different addresses. Some small objects will also be copied instead of moved. + +**Note that not all types can be moved, and not all types can be copied.** In these cases, we can use reference arguments as outputs. ## Mutable References as Outputs If a return statement has significant overheads, it may be avoided using references. Let's assume we have a large data class `ImmovableData` with no move constructor. ```cpp ImmovableData GenerateData(const int &a) @@ -85,7 +314,7 @@ int main() - This code will create a large data-structure during the function call, and then copy that structure when the function returns and place the result in the variable `data`. The original data-structure is then deleted. -Instead of declaring a variable and setting it equal to the return value of a function, we can instead declare the variable and then pass it into the function by reference. +Instead of declaring a variable and setting it equal to the return value of a function, we can instead declare the variable in the calling scope, and then pass it into the function by reference.
-- Pass larger arguments (> 4 bytes) by `const` reference if you can. +- Pass larger arguments (> 8 bytes) by `const` reference if you can. - Pass by (non const) reference if you need to work on a variable in place, i.e. the function should change the value of the argument itself. -- Avoid `return` with large data-structures for the same reason. These should be passed in and out by reference as function arguments. -- **N.B.** If passing by reference, you can only pass literals (values like numbers and strings which are not assigned to a variable) if using a `const` reference. Consider two function signatures `refAdd(int &a, int &b)` and `constRefAdd(const int &a, const int &b)`: we can call `constRefAdd(5, 12)` just fine, but if we call `refAdd(5, 12)` we will get an error. -- **N.B.** Never use a `return` statement to return a reference (or a pointer) to a local variable e.g. `return &x;` as the local variable will be destroyed when we leave the function scope. This will lead to a segmentation fault (memory error). +- Avoid `return` with large _immovable_ data-structures for the same reason. These should be passed in and out by reference as function arguments. + +**Further things worth noting**: +- If passing by reference, you can only pass literals (values like numbers and strings which are not assigned to a variable) if using a `const` reference. Consider two function signatures `refAdd(int &a, int &b)` and `constRefAdd(const int &a, const int &b)`: we can call `constRefAdd(5, 12)` just fine, but if we call `refAdd(5, 12)` we will get an error. +- Never use a `return` statement to return a reference (or a pointer) to a local variable e.g. `return &x;` as the local variable will be destroyed when we leave the function scope. This will lead to a segmentation fault (memory error). +- You can return by reference a variable which is not local to that function's scope, for example a member function in a class may return a member variable of that class by reference, since when the function ends the object and its data will still exist. However, you must be sure that you will not keep the reference to the data for longer than the object's lifetime; if the object passes out of scope and you continue to try to use the reference then you will have a memory error. diff --git a/02cpp1/sec03ObjectOrientedProgramming.md b/02cpp1/sec03ObjectOrientedProgramming.md new file mode 100644 index 000000000..bc2925de8 --- /dev/null +++ b/02cpp1/sec03ObjectOrientedProgramming.md @@ -0,0 +1,320 @@ +--- +title: Object Oriented Programming +--- + +Estimated Reading Time: 60 minutes + +# Custom Types and Object Oriented Programming (OOP) in C++ + +As a programming language, C++ supports multiple styles of programming, but it is generally known for _object oriented programming_, often abbreviated as _OOP_. This is handled in C++, as in many languages, through the use of classes: special data-structures which have both member data (variables that each object of that class contains and which are usually different for each object) and member functions, which are functions which can be called through an object and which have access to both the arguments passed to them _and_ the member variables of that object. + +We have already been making extensive use of classes when working with C++. Indeed, it is difficult not to! The addition of classes was the main paradigm shift between C, a procedural programming language with no native support for OOP, and C++.
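+To make these ideas concrete before we go further, here is a minimal sketch of a class with some member data and one member function that uses it. (The names `Particle` and `momentum` are purely illustrative, not from any library.)
+
+```cpp
+#include <iostream>
+
+using namespace std;
+
+class Particle
+{
+    public:
+        // Member data: each Particle object carries its own mass and velocity.
+        double mass;
+        double velocity;
+
+        // Member function: has access to the member data of the object it is called on.
+        double momentum()
+        {
+            return mass * velocity;
+        }
+};
+
+int main()
+{
+    Particle p;
+    p.mass = 2.0;
+    p.velocity = 3.5;
+    cout << p.momentum() << endl;  // prints 7
+    return 0;
+}
+```
+
+(Everything here is public for brevity; as we discuss below, a well designed class is usually more protective of its data.)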
+ +## Classes + +Classes can be used to define our own data-structures, which have their own type. We can then declare objects of this type in our program. Apart from a handful of built-in types (like `int`, `double`, and `bool`), variables that we declare in C++ are instances of a class. A number of objects that we've used so far are classes defined in the standard library, like `vector` and `string`. + +Classes achieve two goals in representing concepts in programming: + +- _Abstraction_ + - Represents the essential elements of a _kind_ of object, as distinct from other kinds of objects. What are the defining properties of a type of object? + - Class defines the blueprint for every object of that kind: what information it contains and what it should be able to do. + - Objects are individual instances of a class. + - _“An abstraction denotes the essential characteristics of an object that distinguish it from all other kinds of objects and thus provide crisply defined conceptual boundaries, relative to the perspective of the viewer.”_ - Grady Booch +- _Encapsulation_ + - Methods and data that belong together are kept together. + - Provide public interface to class: how other things should be able to interact with it. + - Protects and hides data to which other things should not have access. + +## Access Specifiers in Classes + +When writing a class we can declare a member function or variable using one of three access specifiers: + +- `private` (the default): The variable or function is available only within the body of this class. +- `protected`: The variable or function can be accessed within the body of this class, or within the body of any class which inherits from this class. +- `public`: The variable or function can be accessed inside and outside of the definition of the class, by anything which can access the object. + +The access specifiers, `private`, `protected`, and `public`, are keywords which are used within class definitions followed by a colon (`:`) to specify access for all following members until the end of the class or another access specifier is reached. For example: + +```cpp +class myClass +{ + public: + int x; + double y; + + private: + std::string name; + + protected: + double z; +}; +``` + +- `x` and `y` are both public +- `name` is private +- `z` is protected + +If you are writing classes in C++, especially classes that will be used by other people, it's a good idea to only give people access to as much as they need and no more than that. In general: + +- Make functions and variables `private` if you can. +- You can control access to variables in a finer grained way through `get` and `set` methods than by making them public. For example you may want variables that can be inspected (write a `get` function) but not changed (no `set` function) or vice versa. +- Constructors and destructors should generally be `public`. + + +## Static Members + +Static member variables or functions are special members of a class. They belong to the class as a whole, and do not have individual values or implementations for each instance. This can be useful when keeping track of properties that are changeable and may affect the class as a whole, or for keeping track of information about a class.
For example, one can use a static variable to count the number of instances of a class which exist using the following: + +```cpp +class countedClass +{ + public: + + countedClass() + { + count += 1; + } + + ~countedClass() + { + count -= 1; + } + + static int count; +}; + +int countedClass::count = 0; + +int main() +{ + auto c1 = countedClass(); + cout << countedClass::count << endl; + + auto c2 = countedClass(); + cout << c2.count << endl; + + return 0; +} + +``` +- The count is incremented in the constructor (`countedClass()`), and so increased every time an instance of this type is created. +- The count is decremented in the destructor (`~countedClass()`), and so decreased every time an instance of this type is destroyed. +- `count` is a static variable, so belongs to the class as a whole. There is one variable `count` for the whole class, regardless of how many instances there are. The class still accesses it as a normal member variable. +- `count` also needs to be defined outside of the class definition. (This is where you should initialise the value.) +- A static variable can be accessed in two different ways: through the object (`c1.count`), or through the class namespace (`countedClass::count`) without reference to any object. Public static variables for a class can therefore be accessed by anything which has access to the class definition, regardless of whether there are any objects of that class. + +## Improving this class with Access Specifiers + +- A variable like `count` shouldn't be able to be changed outside of the class, as that could interfere with our counting! But we do want to be able to access the _value_ of the count, so we can tell how many there are. +- We should make `count` _private_ and make a function to retrieve the value _public_. +- Such functions are often called "getters", because they are frequently named `get...()` for some variable. + +```cpp +class countedClass +{ + public: + + countedClass() + { + count += 1; + } + + ~countedClass() + { + count -= 1; + } + + static int getCount() + { + return count; + } + + private: + static int count; +}; + +int countedClass::count = 0; + +int main() +{ + auto c1 = countedClass(); + cout << countedClass::getCount() << endl; + + auto c2 = countedClass(); + cout << c2.getCount() << endl; + + return 0; +} +``` + +- `getCount()` is `public` and `static` and so can be accessed just like we accessed `count` before (through an object or through the class definition). +- `getCount()` returns an integer _by value_, so it returns a copy of `count`. We can't modify `count` through this function or the value we get back from it. +- `count` is now private, so if we try to access this directly from outside the class the compiler will raise an error. + +## Using Objects for Data Integrity + +An extremely useful aspect of defining a new type via a class is the ability to provide guarantees that any object of that type satisfies certain properties. These properties allow programmers to write programs that are more efficient and correct with less overhead for error checking. + +Let's explore this with some examples. + +### Ensuring Objects Are Self-Consistent + +Let's suppose we have a physical simulation, which involves a ball suspended in some fluid. A ball will probably have the following fields: +```cpp +class Ball +{ + std::array<double, 3> position; + double radius; + double mass; +}; +``` + +These fields define the sphere well, but physically the behaviour of the sphere in the fluid will depend on its _density_.
So perhaps we want to write a member function `double density(double radius, double mass)` which calculates the density of the sphere. But this would mean we need to call the density function and re-calculate it every time we want to use it, which isn't ideal. So instead, we can add density to our list of fields, +```cpp +class Ball +{ + public: + std::array<double, 3> position; + double radius; + double mass; + double density; +}; +``` + +and then we can call the density directly without another calculation. The problem that we now have is that in order for our data to be self-consistent, **a relationship between the radius, mass, and density must be satisfied**. + +We could approach this problem by calculating the density in the constructor, and making the radius, mass, and density **private**. This means that external code can't change any of these values, and therefore they can't become inconsistent with one another. But we still need to be able to _read_ these variables for our physics simulation, so we'll need to write **getter** functions for them: +```cpp +class Ball +{ + public: + Ball(std::array<double, 3> p, double r, double m): position(p), radius(r), mass(m) + { + setDensity(); + } + + std::array<double, 3> position; + double getRadius(){return radius;} + double getMass(){return mass;} + double getDensity(){return density;} + + private: + void setDensity() + { + density = 3 * mass / (4 * M_PI * pow(radius, 3)); + } + double radius; + double mass; + double density; +}; +``` + +Now we can even make our code **more flexible without sacrificing safety**. Let's say the ball can change _mass_ or _radius_. We can't just make these variables public and change them independently, because then the _density_ will no longer be consistent with the new mass / radius. We need to add **setter** functions which **maintain the integrity of the object**: +```cpp +class Ball +{ + public: + Ball(std::array<double, 3> p, double r, double m): position(p), radius(r), mass(m) + { + setDensity(); + } + + std::array<double, 3> position; + double getRadius(){return radius;} + double getMass(){return mass;} + double getDensity(){return density;} + + void setRadius(double r) + { + radius = r; + setDensity(); + } + + void setMass(double m) + { + mass = m; + setDensity(); + } + + private: + void setDensity() + { + density = 3 * mass / (4 * M_PI * pow(radius, 3)); + } + + double radius; + double mass; + double density; +}; +``` +We now have a ball class that can be instantiated with any mass and radius, and can have its mass or radius changed, but **always satisfies the property that the density field is correct for the given radius and mass of the object**. Being able to guarantee properties of objects of a given type makes the type system far more powerful and gives users the opportunity to use objects in more efficient ways without having to check for conditions that are already guaranteed by the object's design. + +### Maintaining Desirable Properties + +Consider another example where we have a catalogue for a library. To keep things simple, we'll say that we just store the title of each book. Very simply, we could define this as a vector: +```cpp +vector<string> catalogue; +``` +and every time we want to add a new title we can simply stick it on the end of the list: +```cpp +catalogue.push_back("Of Mice and Men"); +``` +Adding books to our catalogue is certainly very simple! But what happens when we want to _look up_ a book, to see if it's in the catalogue?
In an unordered list, the only thing we can do is go through the entire list one by one until we find it or reach the end of the list. The amount of time that we take searching will be proportional to the length of our catalogue, which isn't great performance. + +This is particularly bad because we'd expect people to look up books far more often than we add new ones! How can we do better? + +If our list were _sorted_, then we could search much more quickly using a [binary search](https://en.wikipedia.org/wiki/Binary_search_algorithm). A binary search on a sorted list starts by looking at the element in the middle of the list and checks if the item we're looking for should come before or after that. We then only need to search the half of the list that would contain the book we're looking for. We then apply the same thing again to narrow the list down by half again, and so on. At every step we halve the size of the list and therefore the number of titles we have to check is proportional to _the logarithm of the size of the list_. This is much, much better performance, especially if the size of the list is large. A binary search with 21 comparisons could search a list of over a million books! + +Of course, we don't want to sort our data before searching it every time (that would be even more wasteful than our linear search), and we want to know with certainty that our list is always sorted, otherwise our binary search could fail. Using an object is a solution: we can define a wrapper class which keeps the list private, and provides an insertion method which guarantees that new entries are inserted into their proper place. Then **we can take advantage of speedier lookup because we know that our catalogue is always in sorted order**. (Incidentally, this would normally be done with a _balanced binary search tree_, an example of which is the C++ `map` type.) + +## Aside: Organising Class Code in Headers and Source Files + +As we saw last week, C++ code benefits from a separation of function declarations (in header files) and implementations (in source files) when these functions need to be included in other files. A similar principle applies to classes. + +In the header file, we should declare the class as well as: +1. What all of its member variables are +2. Function declarations for all of its member functions +3. Optionally, full definitions for trivial functions such as getter/setter functions + +For example: +**In `ball.h`:** +```cpp +class Ball +{ + public: + Ball(std::array<double, 3> p, double r, double m); + + std::array<double, 3> position; + double getRadius(){return radius;} + double getMass(){return mass;} + double getDensity(){return density;} + + private: + void setDensity(); + + double radius; + double mass; + double density; +}; +``` +**In `ball.cpp`:** +```cpp +// constructor definition +// Ball:: tells us that the function Ball(...) is part of the Ball class +Ball::Ball(std::array<double, 3> p, double r, double m): position(p), radius(r), mass(m) +{ + setDensity(); +} + +// Again, Ball:: tells us that this function is part of the Ball class definition +// Because this is a member function, it has access to all the data members of this class. +void Ball::setDensity() +{ + density = 3 * mass / (4 * M_PI * pow(radius, 3)); +} +``` + +We must include declarations for all member functions and variables in the class because any code which makes use of the class needs to know the full interface.
It's also very important for C++ compilers to know what data a class needs to hold in order to know how much memory to reserve when constructing it. Because object files can be compiled separately, the information about data members must be in the header. diff --git a/02cpp1/sec01StandardLibrary.md b/02cpp1/sec04StandardLibrary.md similarity index 88% rename from 02cpp1/sec01StandardLibrary.md rename to 02cpp1/sec04StandardLibrary.md index c301577eb..1f56ef374 100755 --- a/02cpp1/sec01StandardLibrary.md +++ b/02cpp1/sec04StandardLibrary.md @@ -3,38 +3,12 @@ title: C++ Standard Library --- Estimated Reading Time: 1 hour - -## Preamble: C++ Types and Declaring Variables - -If you're relatively unfamiliar with C++ you will need to know that C++ is a _statically typed language_, which means that the type of each of your variables is determined at compile time and doesn't change over the course of the program execution. Typically in C++ when declaring variables we do so by first declaring the type, then the name of the variable, and then (optionally) its value. For example: - -``` cpp -int x = 5; -``` - -- This declares a variable `x` of type `int` with value `5`. - -Types in C++ can become quite complicated as we shall see, and it is sometimes easier to read and write code which makes use of the `auto` keyword. This keyword tells the C++ compiler to deduce the type for us. This is called _type inference_ and was made a feature of C++ in C++11, so will be absent in older codes. - -``` cpp -auto x = 5; -auto y = std::abs(-13.4); -``` - -- `auto` can usually deduce the type from the result of an expression by looking at the types of all the variables and functions within it. - - For example here it interprets `5` as an integer and therefore deduces that `x` is an int. - It will deduce that `y` is a `double`, since `-13.4` must be a floating point type (`double` by default) and `std::abs` returns a `double` when given a `double`. - - Be especially careful when values can be ambiguous. Here `5` is being assigned as an int, but `5` is also a valid `float` or `double` - **if you want a specific type in cases like this you should always specify it explicitly**. -- `auto` doesn't always work: the compiler must be able to deduce the type from contextual information. - - You cannot declare an unitialised variable with `auto` e.g. `auto z;` will lead to a compiler error as it won't know what type is intended for `z`, even if `z` is later assigned. -- You cannot use `auto` when declaring the return types or parameter types of functions, you must always declare these type explicitly. - - It's generally a good idea therefore to know what the types of variables in your code are, even if you choose to use the `auto` keyword! This will make writing your own functions, and knowing what functions you can pass your data to, much easier. -- In an IDE like VSCode you can inspect the type of a variable by hovering the mouse over that variable. If you've used `auto` it will be able to tell you what type has been deduced for it. - + ## C++ Standard Library The C++ standard library is a collection of data-structures and methods which must be provided with any standard-compliant implementation of C++. As a result, using the standard library is portable across different systems and compilers, and does not require downloading and linking additional libraries (which will be a topic we cover in a later week). It does, however, require the use of header files, as we'll see in a moment. 
-The C++ language standard is always evolving, with the most recent version of the standard being C++20 (released 2020), and the next planned to come some time this year (C++23). Sometimes there may be requirements to work with a specific C++ standard (more often, a minimum standard), and so it can be important to check whether a feature that you want to use is available to you in the version of C++ that you will be using. +The C++ language standard is always evolving, with the most recent version of the standard being C++23. Sometimes there may be requirements to work with a specific C++ standard (more often, a minimum standard), and so it can be important to check whether a feature that you want to use is available to you in the version of C++ that you will be using. In this course the language features that we will make use of will be compatible with C++14 onwards, though we will use C++17 as our standard. This week we will go over some of the most commonly used components of the standard library, but this will only scratch the surface and becoming familiar with the breadth of the library takes time and practice. @@ -88,7 +62,18 @@ ## Containers -Containers are an important part of the C++ standard library; these allow us to keep collections of objects such as lists (`vector`, `array`), sets (`set`), or maps of key-value pairs (`map`, `unordered_map`), among others. These are some of the most common classes that you will use in C++ programming, so it is a good idea to familiarise yourself with them. We'll discuss `vector` as an example here, but see the section "Using C++ Documentation" for more information on how to learn about the other kinds of containers. +Containers are an important part of the C++ standard library; these allow us to keep collections of objects such as lists (`vector`, `array`), sets (`set`), or maps of key-value pairs (`map`, `unordered_map`), among others. These are some of the most common classes that you will use in C++ programming, so it is a good idea to familiarise yourself with them. We'll discuss `vector` as an example here as it is probably the most commonly used, but see the section "Using C++ Documentation" for more information on how to learn about the other kinds of containers. + +**At minimum, you should be comfortable using `vector`, `array`, and `map`.** +- `vector` is used to hold a list of elements of type `T` of dynamic size (where `T` is some arbitrary type). + - It requires `#include <vector>`. +- `array` is used to hold a list of elements of type `T` of fixed size. The size must be determined **at compile time**. + - It requires `#include <array>`. + - Since the size is known at compile time, it can provide a guarantee that array sizes are always compatible with what you expect during runtime. This can be useful for ensuring that lists of data match up in size, or that a vector has the same dimensions as the vector space it is supposed to be in. + - Arrays can be more efficient than vectors because you don't need to do manual size checking, and the compiler may be able to make optimisations if the size is known at compile time. +- `map` is used to hold key-value pairs with keys of type `T1` and values of type `T2`. (This is known in some languages as a Dictionary.) Maps are dynamic in size; key-value pairs can be added and removed. You can check if a given key is present in a `map`, and you can also iterate over all elements in a map and access both the key and the value. + - It requires `#include <map>`.
+- The fact that all three of these types are _iterable_ means that we can apply some of the same programming techniques to all of them, because we can loop over their elements. ## Vector @@ -365,7 +350,7 @@ We can see from our previous example the use of the `()` and `{}` brackets to de You will often find when programming, especially in a language with such an expansive standard library, that there are things that you need to look up. There are a large number of classes and functions available to C++ programmers, many of which may be new to you or require refreshing at various points. -Two common sites for C++ refernce are: +Two common sites for C++ reference are: - <https://en.cppreference.com/> - <https://cplusplus.com/> diff --git a/02cpp1/sec03Pointers.md b/02cpp1/sec05Pointers.md similarity index 88% rename from 02cpp1/sec03Pointers.md rename to 02cpp1/sec05Pointers.md index 903a03999..d43fdc99e 100644 --- a/02cpp1/sec03Pointers.md +++ b/02cpp1/sec05Pointers.md @@ -10,35 +10,38 @@ You'll already have used references to refer to objects without copying them. Th - References cannot be reassigned to refer to a new place in memory, they can only be assigned at initialisation. - You cannot have references of references. -- You cannot create circular references e.g. an object A which has a reference to object B, which in turn has a reference to object A. This would require reassignment of references to construct. - References cannot be null. +- You cannot create circular references e.g. an object `obj1` which has a reference to object `obj2`, which in turn has a reference to object `obj1`. This would require reassignment of references to construct. +- Similarly you cannot have a class `A` which has a member of reference type `B&`, and a class `B` which has a member of reference type `A&`. - You cannot store references in container types like `vector`, `array`, `set`, or `map`. -In these cases, we use *pointers*. A pointer is variable which stores an address in memory where an object's data is located (we say that it "points to" this object), or the special value `nullptr`. Pointers give us much more flexibility than references, especially when writing classes for objects that need to point to other data (either of the same class, like in graph representation where nodes point to other nodes, or of another class). In modern C++ (since C++11) we usually declare a pointer using a *smart pointer*, of which there are three different kinds: unique pointers, shared pointers, and weak pointers. +In these cases, we use *pointers*. **A pointer is a variable which represents an address in memory** where an object's data is located (we say that it "points to" this object), or the special value `nullptr`. Pointers give us much more flexibility than references, especially when writing classes for objects that need to point to other data (either of the same class, like in graph representation where nodes point to other nodes, or of another class). In modern C++ (since C++11) we usually declare a pointer using a *smart pointer*, of which there are three different kinds: unique pointers, shared pointers, and weak pointers. + +Throughout the following sections, and whenever you are working with pointers, it is useful to bear in mind that the value of a pointer is a memory address. We can get the data at that memory address by _dereferencing_ the pointer, but that is not the same as the pointer's value.
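+As a minimal sketch of this distinction, consider a raw pointer to a local variable (the printed address is system dependent and will vary between runs):
+
+```cpp
+#include <iostream>
+
+using namespace std;
+
+int main()
+{
+    int x = 15;
+    int* p = &x;  // the value of p is the memory address of x
+
+    cout << "p has value " << p << endl;             // prints an address, e.g. 0x7ffe...
+    cout << "dereferencing p gives " << *p << endl;  // prints 15
+
+    *p = 16;  // writing through the pointer changes x itself
+    cout << "x is now " << x << endl;                // prints 16
+
+    return 0;
+}
+```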
## Optional Background: The Stack, the Heap, and Variable Scope We will go into more detail on memory structures later on in the course when we discuss performance programming. It can however be easier to understand the usage of pointers in C++ if we understand the difference between two different kinds of memory: the _stack_ and the _heap_. -- **Stack** memory is used to store all local variables, as well as the "call stack". The call stack contains information about the currently active functions, including the value of variables in each scope, and allows us to continue execution at the correct place in the program when we leave a function. Stack memory is faster than heap memory, but limited in size. The amount of stack memory available is not known to the program at compile time, as stack memory is reserved for the program at runtime by the operating system. Using too much stack memory causes a _stack overflow_ error, which will cause your program to crash. When variables on the stack go out of scope then their destructor is called and their memory is freed. -- **Heap** memory is somewhat slower, but can make use of as much RAM as you have available, so large datasets tend to be declared on the heap. (Heap memory is still faster than reading/writing to hard disk.) Any memory allocated on the heap _must_ be pointed to by something on the stack, otherwise it will be inaccessible to us. +- **Stack** memory is used to store all local variables and relevant information for any function calls which have yet to complete, and allows us to continue execution at the correct place in the program when we leave a function. Stack memory is faster than heap memory, but limited in size. The amount of stack memory available is not known to the program at compile time, as stack memory is reserved for the program at runtime by the operating system, but a few megabytes is quite typical. Using too much stack memory causes a _stack overflow_ error, which will cause your program to crash. When variables on the stack go out of scope then their destructor is called and that part of the stack memory is freed and can be overwritten again. +- **Heap** memory is somewhat slower, but can make use of as much RAM as you have available, so large datasets tend to be declared on the heap. (Heap memory is still faster than reading/writing to hard disk.) Any memory allocated on the heap _must_ be pointed to by something on the stack, otherwise it will be inaccessible to us. Heap memory must be freed by a stack object's destructor, or manually. Data will end up on the stack or the heap depending on how it is declared, and the internal structure of the class itself. -- When you declare a variable, then it is stored on the stack e.g. `int x = 5;` will store an integer on the stack. Declaring any kind of variable this way stores that object on the stack. +- When you declare a local variable, it is stored on the stack, e.g. `int x = 5;` will store an integer on the stack. Declaring any kind of variable this way stores that object on the stack. - Initialising a variable by declaring a pointer using e.g. `make_unique`, `make_shared`, or `new` will allocate memory on the heap. Note however that the variable which has been declared - the pointer - is on the _stack_, and the memory that it is pointing to is on the _heap_. - Many objects which are not simple types will also declare memory on the heap: `vector<>` is an example of such a class. We can use the code `vector<int> v = {1,2,3,4,5};` to declare a vector `v` on the stack.
The vector itself is on the stack, and is deleted when `v` goes out of scope. The data stored in the vector - in this case, five integers - is not stored on the stack, but is actually stored on the heap. The vector `v` will contain a pointer to this heap memory, and uses this to retrieve your data when you call for it. The vector class will automatically free the memory it allocates on the heap when its destructor is called, so you don't have to do that yourself. - Pointers do not have to point to heap memory; they can also point to stack memory if initialised with the address of a stack variable, e.g. `int * x_ptr = &x`. In general, you should think carefully about whether you want this behaviour; as we shall see later this can lead to memory problems if not handled carefully! ## What Are Smart Pointers? -Smart pointers are a special kind of pointer, introduced in C++11. Since then, they are typically used as the default pointers for most applications, as they automatically handle some memory management which would previously have to be done manually. The reason we have three different kinds of smart pointers is because they embody three different possible ideas about *memory ownership*. Understanding ownership is key to understanding the useage of smart pointers. +Smart pointers are a special kind of pointer, introduced in C++11. Since then, they are typically used as the default pointers for most applications, as they automatically handle some memory management which would previously have to be done manually. The reason we have three different kinds of smart pointers is that they embody three different possible ideas about *memory ownership*. Understanding ownership is key to understanding the usage of smart pointers. When we talk about ownership of some memory or data, the question we are asking is what should have control over the lifetime of the data, i.e. when the data should be allocated and freed. Smart pointers in C++ address three cases: -- Unique Ownership: The lifetime of the data should be determined by the lifetime of a single variable. This is essentially how we treat stack variables: when the variable goes out of scope, the destructor is called and the memory is freed. **Unique Pointers** offer a similar model for memory that is allocated on the heap. -- Shared Ownership: The lifetime of the data is determined by multiple other variables, and the data should remain as long as one of those variables is still in scope. This is represented using **Shared Pointers**. Once all shared pointers pointing to a particular piece of data go out of scope, then the memory for that data is freed. -- Non-Owning: Non-owning pointers should have no impact on the lifetime of the data. When the non-owning pointer goes out of scope nothing happens to the data that it was pointing to. There are represented using **Weak Pointers** or traditional raw pointers. +- Unique Ownership: The lifetime of the data should be determined by the lifetime of a single variable. This is essentially how we treat stack variables: when the variable goes out of scope, the destructor is called and the memory is freed. **Unique Pointers** offer a similar model for memory that is allocated on the heap. When a non-null unique pointer goes out of scope, as well as being freed on the stack, its destructor frees the heap memory to which it points. +- Shared Ownership: The lifetime of the data is determined by multiple other variables, and the data should remain as long as one of those variables is still in scope.
This is represented using **Shared Pointers**. Only once all shared pointers pointing to a particular piece of data have gone out of scope is the memory for that data freed. +- Non-Owning: Non-owning pointers should have no impact on the lifetime of the data. When a non-owning pointer goes out of scope it is removed from the stack, but **nothing happens to the data that it was pointing to**. These are represented using **Weak Pointers** or **raw pointers**. ## Unique Pointers `std::unique_ptr<>` @@ -63,7 +66,7 @@ int main() - Remember that pointers actually store memory addresses, not the values of the variables that they point to. So to get the value of the variable we need to "dereference" the pointer using the `*` operator. `p_int` refers to the smart pointer, but `*p_int` gives us the value of the integer that we are pointing to. - We can make assignments to `*p_int` which will update the value of the integer, but doesn't change the memory location (so `*p_int` will change, but `p_int` won't). -You also can't make a copy of a unique pointer, as then there would be a conflict over which one should handle the destruction of the object when it goes out of scope. This means that when we want to pass a unique pointer to a function, we cannot pass it by value, because this would involve making a copy. We can, however, pass a unique pointer by reference. +**You also can't make a copy of a unique pointer**, as then there would be a conflict over which one should handle the destruction of the object when it goes out of scope. This means that when we want to pass a unique pointer to a function, we cannot pass it by value, because this would involve making a copy. We can, however, pass a unique pointer by reference. ```cpp #include @@ -249,7 +252,7 @@ int main()
You can check whether a weak pointer points to valid memory using `wpt.expired()`, which will return `true` if the memory is deleted and `false` otherwise. Accessing weak pointers is also different to accessing other kinds of pointers because they cannot be dereferenced directly. That means if we have a weak pointer `wpt` we can't get the value using `*wpt` or call a member function using `wpt->function()`. Instead, we must create a new shared pointer to that memory using `spt_new = wpt.lock()`, and then access the data through `spt_new`. This also creates additional overheads for accessing weak pointers. @@ -343,7 +346,7 @@ which is exactly equivalent, but the `int const` form is preferred because it is Using `const` with pointers allows us to declare one of two things (or both): - **The pointer points to a `const` type**: we declare the data pointed to constant, and so this pointer cannot be used to update the value held in the memory location to which it points. In other words, the memory pointed to is declared read-only, and we can dereference the pointer to retrieve the data at that location, but we can't update it. We can however change the memory address that the pointer points to, since the pointer itself is not constant (remember the pointer is actually a variable storing a memory address). - - To do this with a smart pointer we need to place the `const` in the angle brackets, e.g. `shared_ptr<int const> readOnlySPtr` or `shared_ptr<const int> readOnlySPtr` which declares a shared pointer to a constant int. The `const` keywork here applies to the type of the data, `int`, so it is the data pointer to, not the pointer itself, which is being declared const. + - To do this with a smart pointer we need to place the `const` in the angle brackets, e.g. `shared_ptr<int const> readOnlySPtr` or `shared_ptr<const int> readOnlySPtr` which declares a shared pointer to a constant int. The `const` keyword here applies to the type of the data, `int`, so it is the data pointed to, not the pointer itself, which is being declared const. - To do this with a raw pointer use the `const` keyword _before_ the `*` operator, e.g. `int const * readOnlyPtr` or `const int * readOnlyPtr`. This declares a (raw) pointer to a constant int. - A pointer to const data only prohibits the value in memory being changed _through that pointer_, but if the value can be changed another way (e.g. it is a stack variable or there is another pointer to it) then it could still be changed. - **The pointer itself is const**: the memory location pointed to is a constant. In this case, the value held in the memory can change, but the pointer must always point to the same place and we can't redirect the pointer to look at another place in memory. diff --git a/03cpp2/index.md b/03cpp2/index.md index 19e6c4c8a..6b7f726a0 100644 --- a/03cpp2/index.md +++ b/03cpp2/index.md @@ -1,26 +1,35 @@ --- -title: "Week 3: Modern C++ (2)" +title: "Week 3: Error Handling and C++ Projects" --- ## Week 3: Overview ### This Week -This week will focus on introducing object oriented techniques in the context of C++. We will look at how we can use classes in our programs to group data and functionality together, and how we can create relationships between classes using inheritance or composition to make more interesting and flexible objects. We'll then look at error handling in C++ using exceptions, which will also make use of our knowledge of inheritance. +This week we'll be looking at how to work with larger and more complex C++ projects.
We'll cover approaches to error handling in C++, including _exceptions_, how to manage building significant projects using C++ and work with an external library, and how to test our code rigorously. -* [Introduction to Object Oriented Programming](sec01ObjectOrientedProgramming.html) - - Classes for encapsulation and abstraction - - Static members - - Access specifiers -* [Inheritance](sec02Inheritance.html) - - Inherited properties and access - - Inheritance vs composition - - Function overriding - - Polymorphism - - Virtual functions and abstract classes -* [Exceptions](sec03Exceptions.html) - - Error handling in C++ - - Throwing and catching exceptions - - Standard Exceptoin types and defining our own +* [Exceptions](./sec01Exceptions.html) + - Error handling basics + - Exception types + - Try and Catch statements + - Defining our own Exception types +* [Other Error Handling Mechanisms](./sec02ErrorHandling.html) + - Error return codes + - Optional types +* [CMake Basics](./sec03CMakeBasics.html) + - Basics of CMake files + - Project directory structure + - Setting C++ standards +* [Unit Testing](./sec04UnitTesting.html) + - Installing the Catch2 testing library + - Adding Catch2 to your CMake files + - Testing Strategy + - Writing tests + +If you would like more detail on CMake and build systems, you can read the following optional notes. These are not necessary to do this week's exercises, but may be useful during the course and assignments. + +* [Build Systems](./sec05SoftwareBuilds.html) +* [CMake Background](./sec06CMakeBackground.html) +* [CMake "Hello World"](./sec07CMakeHelloWorld.html) +* [Building with CMake](./sec08BuildHelloWorld.html) diff --git a/03cpp2/sec03Exceptions.md b/03cpp2/sec01Exceptions.md similarity index 66% rename from 03cpp2/sec03Exceptions.md rename to 03cpp2/sec01Exceptions.md index a30a7324b..569eaba8e 100644 --- a/03cpp2/sec03Exceptions.md +++ b/03cpp2/sec01Exceptions.md @@ -22,26 +22,28 @@ The `<stdexcept>` header contains the following exception types: **All of these exception classes inherit from the class `std::exception`.** Inheritance will play an important role in how we define, identify, and handle exceptions. -You may notice these exceptions being thrown by a number of different methods that you use in C++! For example, if you try to access a `vector` using `.at(i)` outside of the range of the vector, an `out_of_range` exception will be thrown, which will halt program execution and you can see it reported the terminal output if you don't handle it properly. +You may notice these exceptions being thrown by a number of different methods that you use in C++! For example, if you try to access a `vector` using `.at(i)` outside of the range of the vector, an `out_of_range` exception will be thrown, which will halt program execution and be reported in the terminal output if you don't handle it properly.
+When you look up a function in the C++ documentation, you can see what exceptions it can throw, and therefore what kinds of errors you may need to consider checking for. Some functions have "no throw" guarantees, which means that they cannot throw exceptions. Just because a function does not throw exceptions does not mean it is impossible for an error to occur: be sure to check if the function is using some other method of reporting and handling errors. For example, some code which is written in, or compatible with, C or FORTRAN will use the return value of the function or a mutable reference parameter to report whether the function execution was successful or not. ## The basic idea of Exceptions -An exception is a type of object which can be "thrown" when something happens that is not supposed to happen, but which has been anticipated by the programmer as a possibility. For example, if you write a function which requires the use of the first three elements of a vector, your function should check that the vector passed to it has at least three elements. One way of handling the case where it does not have enough elements is to throw an exception: this halts the execution of the function (thereby preventing any attempts to, for example, access elements that aren't there), and returns control to the calling code. If the code which called the function has been set up to detect exceptions, then it can "catch" the exception and handle it appropriately. If the exception is not caught by the calling function, then that function halts as well and the exception propagates up the stack to that function's caller, and so on until it is caught. Exceptions which are thrown but not caught by any enclosing code will cause the program to terminate prematurely. +An exception is a type of object which can be "thrown" when something happens that is not supposed to happen, but which has been anticipated by the programmer as a possibility. The exception is then "caught" in another part of the code which then handles the error. Uncaught exceptions will terminate the program execution (which may be the desired effect if there is no way to continue after an error). -**N.B.** Exceptions should not be used a method of controlling program flow (like, for example, `if ... else` statements) and should only be used to cover unusual cases. **Do not use throw exceptions as an alternative way of returning a value from a function or as a way to exit a loop.** These should always be handled using `return` statements (for function return) and `break` statements (for loop exit). Exception handling statements should be used to handle cases where normal execution of the program simply cannot continue, and in a typical execution of the code the exception should not be thrown. Some good examples of where to use exceptions for error handling are: running out of space for a process that needs to allocate memory, file system input/output errors, runtime errors from users giving unexpected bad input. Using exceptions to control program flow other than error handling can be a problem for two reasons: -- You program will be difficult to read and understand, because it will look like you are doing error handling when you aren't. -- Exceptions incur quite a bit of overhead _when exceptions are thrown_, but otherwise do not usually impact performance to a large degree. If you use exceptions to handle expected program branching you will throw many exceptions and incur a serious performance cost. 
+For example, if you write a function which requires the use of the first three elements of a vector, your function should check that the vector passed to it has at least three elements. One way of handling the case where it does not have enough elements is to throw an exception: this halts the execution of the function (thereby preventing any attempts to, for example, access elements that aren't there), and returns control to the calling code. If the code which called the function has been set up to detect exceptions, then it can "catch" the exception and handle it appropriately. If the exception is not caught by the calling function, then that function halts as well and the exception propagates up the stack to that function's caller, and so on until it is either caught or we reach the end of the call stack and the program terminates.
+
+**N.B.** Exceptions should not be used as a method of controlling program flow and should only be used to cover unusual cases that shouldn't normally occur (hence "_exception_"). **Do not throw exceptions as an alternative way of returning a value from a function or as a way to exit a loop.** These should always be handled using `return` statements (for function return) and `break` statements (for loop exit). Exception handling statements should be used to handle cases where normal execution of the program simply cannot continue, and in a typical run of the code the exception should not be thrown. Some good examples of where to use exceptions for error handling are: running out of space for a process that needs to allocate memory, file system input/output errors, runtime errors from users giving unexpected bad input. Using exceptions to control program flow other than error handling can be a problem for two reasons:
+- Your program will be difficult to read and understand, because it will look like you are doing error handling when you aren't.
+- Exceptions incur quite a bit of overhead _when exceptions are thrown_, but otherwise do not usually impact performance to any significant degree. If you use exceptions to handle expected program branching you will throw many exceptions and incur a serious performance cost.

We'll take a look now at how to do this in practice, starting with catching exceptions thrown by functions that you use.

## Catching Exceptions

-When some code we are running encounters an error and throws an exception, we need a mechanism for responding to that exception.
+We'll start by looking at how to handle an error thrown by an existing function, such as a range error thrown by a vector. When such a function encounters an error and _throws_ an exception, it needs to be _caught_.

-- We first need to define the code that might cause the problem. We do this with the `try{...}` keyword.
-   - This tells our compiler that we want to monitor the execution of this code for exceptions.
+- We first need to identify the code that could throw the exception. We do this with the `try{...}` keyword.
+   - This tells our compiler that we want to monitor the execution of this code block (inside the `{}`) for exceptions.
- We then need to intercept any exceptions which are thrown and respond to them. We do this with the `catch(){...}` keyword.

Let's take a look at an example of how to catch an exception thrown by a standard library function:
@@ -78,16 +80,17 @@
- We get the exception definitions from `#include <stdexcept>`.
- If we think a code block might throw an exception then we can place it inside a `try{ code block }` statement.
- In this case we are accessing a vector, which may not be long enough.
-- After a `try{}` block we need a `catch(exception_type e){}` block. Inside the curly braces of the catch block we put the code that we want to execute if there an exception is raised. This means that we don't have to halt execution when the exception is raised any more!
-   - Once the `catch` block has been executed, the program will continue as normal from after the `try`/`catch` blocks. (In this case, to `return 0;`)
+- After a `try{}` block we need a `catch(exception_type e){ code block }` statement. Inside the curly braces of the catch statement we put the code that we want to execute if an exception of that type is raised.
+   - Some catch blocks could, for example, make corrections to values, adopt some kind of default setup, or simply log detailed error messaging and terminate the program.
+- Once the `catch` block has been executed, the program will continue as normal from after the `try`/`catch` blocks. (In this case, to `return 0;`) **It does not go back to finish the `try` block which was interrupted, so anything inside that block that occurs after the exception is thrown will remain unexecuted.** This is an important point to bear in mind for some potential errors that we will discuss later.
- We can use `e.what()` to get the exception's message, which should report some useful information about why the exception was raised.
-- We can have multiple `catch` statements after one `try` block to handle different kinds of exceptions, for example one block for `std::out_of_range`, and another for `std::overflow_error`.
+- We can have multiple `catch` statements after a single `try` statement to handle different kinds of exceptions which could be thrown from the same code, for example one block for `std::out_of_range`, and another for `std::overflow_error`.
- All exceptions inherit from `std::exception`, which can be used from the `<exception>` header, so if you want a catch block to catch generic exceptions you can write `catch(std::exception e){}`.
- If you want to catch anything that has been thrown and you don't want to access any information in the exception itself you can also use `catch(...)`, but you will usually want to name your exception variable so that you can report information from it.
- `catch` clauses will be evaluated in order, so you should always list your `catch` statements from most specific to most general i.e. list _derived classes_ before the _base classes_ from which they inherit. For example, `std::out_of_range` is a sub-type of `std::exception` since the `out_of_range` class inherits from `exception`. This means that:
    - if `catch(std::exception e)` comes before `catch(std::out_of_range e)` then all `out_of_range` errors will be caught by the more general `exception` clause, and the specialised `out_of_range` error handling code will never run.
    - if `catch(std::out_of_range)` is placed first, then the `catch(std::exception e)` code will only run for exceptions which are not `out_of_range`.
-- `cerr` is a special output stream for errors; we can use this if we want the error to be written to a different place than standard output (e.g. standard ouput to file and errors to terminal, or vice versa). We can also output exception information to `cout` though.
+- `cerr` is a special output stream for errors; we can use this if we want the error to be written to a different place than standard output (e.g. standard output to file and errors to terminal, or vice versa).
We can also output exception information to `cout` though.

We can see in this example that using `try` and `catch` blocks has significant advantages for someone reading our code:

@@ -117,7 +120,7 @@
double h(double x)
{
    if(x < 0)
    {
-        throw std::logic_error("g(x) not defined for x < 0");
+        throw std::logic_error("h(x) not defined for x < 0");
    }
    return sqrt(x);
}
@@ -159,17 +162,20 @@
    return 0;
}
```
+- While this case would be trivial to check for at input, for many processes it may require actually running the code to find out if an input is acceptable or not (for example if `p(x) <= 0` were not analytically tractable).
+- Note that the exception is in this case not handled by the code that directly calls the function which throws: the exception is thrown by `g` or `h`, which is called by `f`; `f` has no catching code so the exception propagates up another layer in the call stack to `main`. Here we finally catch and handle it by reporting the problem and asking the user for a new input.

## Throwing Exceptions

-We can also throw exceptions from our own functions, which allows code which calls our functions to handle errors that might occur within our function code. For example, we might have a function which calculates the inner product of two vectors. In this case, the length of both vectors must be the same for this to make sense! So we could write code like this:
+We can also throw exceptions from our own functions, which allows code which calls our functions to handle errors that might occur within our function code. For example, we might have a function which calculates the inner product of two vectors. In this case, the number of elements in both vectors must be the same for this to make sense! So we could write code like this:

```cpp
double InnerProduct(const vector<double> &x, const vector<double> &y)
{
    if(x.size() != y.size())
    {
-        std::string errorMessage = "Inner product vectors different sizes: " + std::to_string(x.size()) + " and " + std::to_string(y.size());
+        std::string errorMessage = "Inner product vectors different sizes: "
+                                 + std::to_string(x.size()) + " and " + std::to_string(y.size());
        throw std::range_error(errorMessage);
    }
@@ -206,11 +212,11 @@
}
```

-**Warning** You can actually throw _any_ type in C++, it doesn't have to inherit from `exception`. The compiler will not complain if you throw, for example, an `int` or a `string`. To do so is bad practice, as these objects are not designed to be used to carry error information and codes which may call your functions will not be expecting them. This means calling code is unlikely to check for them, which will allow them to pass to the top level uncaught and halt execution. Restricting yourself to throwing classes which inherit from `exception` will make your code easier to understand for others, and compatible with other code. If you need values of other types to be reported with your exception then you can include them as member variables in you own exception class (see below).
+**Warning** You can actually throw _any_ type in C++; it doesn't have to inherit from `exception`. The compiler will not complain if you throw, for example, an `int` or a `string`. To do so is bad practice, as these objects are not designed to be used to carry error information, and code which may call your functions will not be expecting them. This means calling code is unlikely to check for them, which will allow them to pass to the top level uncaught and halt execution.
Restricting yourself to throwing classes which inherit from `exception` will make your code easier to understand for others, and compatible with other code bases. If you need values of other types to be reported with your exception then you can include them as member variables in your own exception class (see below).

## Defining Our Own Exceptions

-We've seen above that we can differentiate between different kinds of exceptions by checking for different expception classes, and then execute different error handling code accordingly. This is a very powerful feature of exceptions that we can extend further by defining our own exception classes to represent cases specific to our own applications. When we define our own exceptions, they should inherit from the `std::exception` class, or from another class which derives from `std::exception` like the standard library exceptions listed above. You should be aware though that if you inherit from a class, for example `runtime_error`, then your exception will be caught by any `catch` statements that catch exceptions of that base class.
+We've mentioned above that we can differentiate between different kinds of exceptions by checking for different exception classes, and then execute different error handling code accordingly. This is a very powerful feature of exceptions that we can extend further by defining our own exception classes to represent cases specific to our own applications. When we define our own exceptions, they should inherit from the `std::exception` class, or from another class which derives from `std::exception` like the standard library exceptions listed above. You should be aware though that if you inherit from a class, for example `runtime_error`, then your exception will be caught by any `catch` statements that catch exceptions of the base classes (`runtime_error` or `exception`).

Exceptions that we define should be indicative of the kinds of errors that occur. Rather than trying to create a different exception for each function that can go wrong, create exception classes that represent kinds of problems, and these exceptions may be thrown by many functions. When creating new exception classes it is a good idea to think about what is useful for you to be able to differentiate between.

@@ -225,9 +231,9 @@ To override `what()` the type declaration is:

const char * what() const noexcept {...}

-- The return type is `const char *` i.e. a constant `char` pointer. This is a C-style array of characters, and is how strings were handled in C.
+- The return type is `const char *` i.e. a constant `char` pointer. This is a C-style array of characters, and is how strings are handled in C.
- The `const` after the function name enforces that no member variables can be changed inside the function body i.e. `what()` cannot change any of the exception's data.
-   - You can mark special variables to be modifiable even in `const` functions, by declaring them `mutable` e.g. `mutable int x`.
+   - You can mark special variables to be modifiable even in `const` functions, by declaring them `mutable` e.g. `mutable int x`. Usually this is a practice that is best avoided as it makes your code more difficult to reason about.

Derived classes `runtime_error`, `logic_error`, and `failure` all contain constructors which take arguments of type `const string &` (reference to a constant string), which sets the error message returned by `what()`. These can be useful if you want to be able to set the message without overriding the `what()` function.
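As a small sketch of this approach (this class is purely illustrative and not part of the course code), we could define a custom exception type that reuses the `runtime_error` constructor instead of overriding `what()` ourselves:

```cpp
#include <stdexcept>
#include <string>

// Inherits runtime_error's message storage and what() implementation,
// so we only need to build the message string in our constructor.
class FileFormatError : public std::runtime_error
{
  public:
    explicit FileFormatError(const std::string &filename)
    : std::runtime_error("Unrecognised file format: " + filename) {}
};
```

Since `FileFormatError` derives from `runtime_error`, which in turn derives from `exception`, it will also be caught by `catch` blocks written for either of those base classes.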
@@ -242,8 +248,8 @@ class FunctionDomainException: public exception
    FunctionDomainException(string func_name, double value)
    {
        message = "Function Domain error on function " + func_name \
-                  + ". Input " + std::to_string(x) + " invalid.";
-        bad_input = x;
+                  + ". Input " + std::to_string(value) + " invalid.";
+        bad_input = value;
    }

    const char * what() const noexcept
@@ -259,11 +265,11 @@ class FunctionDomainException: public exception
- The constructor takes a `string` (`func_name`) to report the name of the function and a `double` (`value`) to report the value that the function failed at.
- It then constructs an error message based on this information and also stores `value` as a member variable `bad_input`.
- The `what()` method is overridden to print out the string that we've constructed.
-- If we catch this error we can also access `bad_input` as it is public, which may be a useful for us to be able to use numerical code rather than just printing it out.
+- If we catch this error we can also access `bad_input` as it is public, which may be useful for us to be able to manipulate in numerical code rather than just printing it out.

## Control flow and memory management

-We've discussed above that raising an exception will prematurely halt the execution of a function and return control to the calling function. It will also halt the execution of any calling functions until we find ourselves within a `try` block, at which point the `catch` code is executed. We should always be aware of what our program has not done if an exception is thrown.
+We've discussed above that raising an exception will prematurely halt the execution of a function and return control to the calling function. It will also halt the execution of any calling functions until we find ourselves within a `try` block, at which point the `catch` code is executed. We should always be aware of what our program will not do if an exception is thrown.

- If you place a try block around a function which may throw exceptions at multiple points (it may have multiple `throw` statements or make calls to a number of other functions which could themselves throw exceptions) and you are passing variables in by reference to be modified, you should be aware of the possible states that your data could be in if the function is prematurely halted. Not all the changes that your function is intended to make on your data may have happened!
-- When we reach the catch block, the stack memory for any functions which threw exceptions and were halted is freed (since they are now out of scope). This means that stack variables are cleaned up, and destructors for any stack variables are called (including smart pointers, the destructors for which de-allocate the data to which they point). Be aware though that if there are _raw pointers_ on the stack, the memory that they point to is not deleted (only the pointer itself is) and so if the memory that it points to is not a stack variable or also owned by a smart pointer a memory leak will occur. **This is one of the reasons why we should not use raw pointers for memory ownership.** If you do have an owning raw pointer in a function and you want to throw and exception, it is vital that you use `delete` to free the memory before throwing the exception; likewise you must be aware of any function calls that you make which could themselves throw an exception: these _must_ be caught so that you can free the memory before returning control to the call stack.
\ No newline at end of file
+- When we reach the catch block, the stack memory for any functions which threw exceptions and were halted is freed (since they are now out of scope). This means that stack variables are cleaned up, and destructors for any stack variables are called (including smart pointers, the destructors for which de-allocate the data to which they point). Be aware though that if there are _raw pointers_ on the stack, the memory that they point to is not deleted (only the pointer itself is) and so if the memory that it points to is not a stack variable or also owned by a smart pointer a memory leak will occur. **This is one of the reasons why we should not use raw pointers for memory ownership.** If you do have an owning raw pointer in a function and you want to throw an exception, it is vital that you use `delete` to free the memory before throwing the exception; likewise you must be aware of any function calls that you make which could themselves throw an exception: these _must_ be caught so that you can free the memory before returning control to the call stack.
diff --git a/03cpp2/sec01ObjectOrientedProgramming.md b/03cpp2/sec01ObjectOrientedProgramming.md
deleted file mode 100644
index df1c896f7..000000000
--- a/03cpp2/sec01ObjectOrientedProgramming.md
+++ /dev/null
@@ -1,153 +0,0 @@
----
-title: Object Oriented Programming
----
-
-Estimated Reading Time: 15 minutes
-
-# Object Oriented Programming (OOP) in C++
-
-As a programming lanaguage, C++ supports multiple styles of programming, but it is generally known for _object oriented programming_, often abbreviated as _OOP_. This is handled in C++, as in many languages, through the use of classes: special datastructures which have both member data (variables that each object of that class contains and which are usually different for each object) and member functions, which are functions which can be called through an object and which have access to both the arguments passed to it _and_ the member variables of that object.
-
-We have already been making extensive use of classes when working with C++. Indeed, it is difficult not to! The addition of classes was the main paradigm shift between C, a procedural programming language with no native support for OOP, and C++.
-
-## Classes
-
-Classes can be used to define our own data-structures, which have their own type. We can then declare objects of this type in our program. Apart from a handful of built in types (like `int`, `double`, and `bool`), variables that we declare in C++ are instances of a class. A number of objects that we've used so far are classes defined in the standard library, like `vector` and `string`.
-
-Classes achieve two goals in representing concepts in programming:
-
-- _Abstraction_
-    - Represents the essential elements of a _kind_ of object, as distinct from other kinds of objects. What are the defining properties of a type of object?
-    - Class defines the blueprint for every object of that kind: what information it contains and what it should be able to do.
-    - Objects are individual instances of a class.
-    - _“An abstraction denotes the essential characteristics of an object that distinguish it from all other kinds of objects and thus provide crisply defined conceptual boundaries, relative to the perspective of the viewer.”_ - Grady Booch
-- _Encapsulation_
-    - Methods and data that belong together and kept together.
-    - Provide public interface to class: how other things should be able to interact with it.
- - Protects and hides data to which other things should not have access. - -## Access Specifiers in Classes - -When writing a class we can declare a member function or variable using one of three access specifiers: - -- `private`: access is private by default. The variable or function is available only within the body of this class. -- `protected`: The variable or function can be accessed within the body of this class, or within the body of any class which inherits from this class. -- `public`: The variable or function can accessed inside and outside of the definition of the class, by anything which can access the object. - -The access specifiers, `private`, `protected`, and `static`, are keywords which are used within class definitions followed by a colon (`:`) to specify access for all following members until the end of the class or another access specifier is reached. For example: - -```cpp -class myClass -{ - public: - int x; - double y; - - private: - std::string name; - - protected: - double z; -}; -``` - -- `x` and `y` are both public -- `name` is private -- `z` is protected - -If you are writing classes in C++, especially classes that will be used by other people, it's a good idea to only give people access to as much as they need and no more than that. In general: - -- Make functions and variables `private` if you can. -- You can control access to variables in a finer grained way through `get` and `set` methods than by making them public. For example you may want variables that can be inspected (write a `get` function) but not changed (no `set` function) or vice versa. -- Constructors and destructors should generally be `public`. - - -## Static Members - -Static member variables or functions are special members of a class. They belong to the class as a whole, and do not have individual values or implementations for each instance. This can be useful when keeping track of properties that are changeable and may affect the class as a whole, or for keeping track of information about a class. For example, one can use a static variable to count the number of instances of a class which exist using the following: - -```cpp -class countedClass -{ - public: - - countedClass() - { - count += 1; - } - - ~countedClass() - { - count -= 1; - } - - static int count; -}; - -int countedClass::count = 0; - -int main() -{ - auto c1 = countedClass(); - cout << countedClass::count << endl; - - auto c2 = countedClass(); - cout << c2.count << endl; - - return 0; -} - -``` -- The count is incremented in the constuctor (`countedClass()`), and so increased every time an instance of this type is created. -- The count is decremented in the destructor (`~countedClass()`), and so decreased every time an instance of this type is destroyed. -- `count` is a static variable, so belongs to the class as a whole. There is one variable `count` for the whole class, regardless of how many instances there are. The class still accesses it as a normal member variable. -- `count` also needs to be declared outside of the class definition. (This is where you should initialise the value.) -- A static variable can be accessed in two different ways: through the object (`c1.count`), or through the class namespace (`countedClass::count`) without reference ot any object. Public static variables for a class can therefore be accessed by anything which has access to the class definition, regardless of whethere there are any objects of that class. 
-## Improving this class with Access Specifiers
-
-- A variable like `count` shouldn't be able to be changed outside of the class, as that could interfere with our counting! But we do want to be able to access the _value_ of the count, so we can tell how many there are.
-- We should make `count` _private_ and make a function to retrieve the value _public_
-- Such functions are often called "getters", because they are frequently named `get...()` for some variable
-
-```cpp
-class countedClass
-{
-    public:
-
-    countedClass()
-    {
-        count += 1;
-    }
-
-    ~countedClass()
-    {
-        count -= 1;
-    }
-
-    static int getCount()
-    {
-        return count;
-    }
-
-    private:
-    static int count;
-};
-
-int countedClass::count = 0;
-
-int main()
-{
-    auto c1 = countedClass();
-    cout << countedClass::getCount() << endl;
-
-    auto c2 = countedClass();
-    cout << c2.getCount() << endl;
-
-    return 0;
-}
-```
-
-- `getCount()` is `public` and `static` and so can be accessed just like we accessed `count` before (through an object or through the class definition).
-- `getCount()` returns an integer _by value_, so it returns a copy of `count`. We can't modify `count` through this function or the value we get back from it.
-- `count` is now private, so if we try to access this directly from outside the class the compiler will raise an error.
\ No newline at end of file
diff --git a/03cpp2/sec02ErrorHandling.md b/03cpp2/sec02ErrorHandling.md
new file mode 100644
index 000000000..d24e25cb4
--- /dev/null
+++ b/03cpp2/sec02ErrorHandling.md
@@ -0,0 +1,146 @@
+---
+title: Other Error Mechanisms
+---
+
+# Other Error Mechanisms
+
+There are some other ways of handling errors that are worth being aware of outside of using exceptions. In this section we discuss _return codes_, which are particularly common in C-based external libraries and legacy code from earlier versions of C++, and `std::optional`, a special type which can represent the absence of a value.
+
+## Return Codes
+
+Return codes are common in programming languages like C which do not have exceptions, and so will be frequently encountered by C++ programmers when using libraries built in C. (C is a very common language for libraries because it has become the lingua franca of programming: most popular languages have a way to interface with C code.)
+
+The return codes approach can also be useful in its own right in C++, when we don't want to interrupt the program flow in the same way, or when errors are common and the exception overhead starts to become high. High frequencies of exceptions can cause particular problems for multi-threaded programming (you can read about [an example here](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2544r0.html) if you are interested).
+
+In the return code approach, you define a function as returning an `int`, and the convention is to return `0` for a successful execution. Any other number encodes an error of some kind, and you can thus assign different meanings to different integers. Because we've used the return value to report success/failure, the meaningful output of the function is placed into a mutable argument (reference or pointer). This is one of the disadvantages of this approach, as it obfuscates the logical structure of the function.
+
+Consider this example for getting the first element of a vector (which may be empty):
+```cpp
+#include <iostream>
+#include <vector>
+
+using std::vector;
+
+int head(vector<int> v, int &x)
+{
+    if(v.empty())
+    {
+        return 1;  // error code
+    }
+    else
+    {
+        x = v[0];
+        return 0;  // success code
+    }
+}
+
+int main()
+{
+    vector<int> v{5, 9, 4, 3};
+    int x;
+    if(head(v, x) != 0)
+    {
+        std::cout << "It was empty." << std::endl;
+    }
+    else
+    {
+        std::cout << x << std::endl;
+    }
+
+    return 0;
+}
+```
+- In this approach we have to declare our variable to hold the value first, and then pass it into the function.
+- We need to check the output of the function to see if it is successful. Unlike an exception, if we need to handle it further up the call stack then we have to check it separately **at every level until we reach the calling scope where we can handle it**. If functions are nested, then every intermediate function between where the error occurs and where it is handled needs to check for an error, and then return early with its own error code to be checked by the function that called it. This means that we often end up with a lot more error checking code using this approach.
+- Assigning appropriate error codes can be tricky, and you need to remember what they all mean. Using an accessible namespace to declare some `const` variables to give meaningful names to different error codes can be useful here, but beware that library code that you interface with will use its own conventions!
+
+
+## `std::optional`
+
+Some functions might make most sense to conceptualise as returning either a _value_ or _nothing_. Examples might be getting the first element of a list (either the first element of a list or nothing if the list is empty), or looking up a value in a key-value map (return the value if the key is in the map or nothing if it isn't). These can be handled using exceptions or error codes of course, but a special _nothing_ value can sometimes be useful to represent this case explicitly.
+- Having a special _nothing_ value can help to avoid uninitialised variables. This can happen for example when we want to assign a new variable from some function which may fail.
+- A _nothing_ value can also be very useful for class design. For example, if you have a class representing some student data, we could set a grade for a course to be either a _value_ (the mark awarded) or _nothing_ (no mark is given yet). This avoids potentially misleading data using default values.
+    - It can also be very useful for avoiding null pointers when objects need to point to other objects, but we'll discuss pointers next week!
+- A _nothing_ value can be propagated through the rest of your calculations like any other value. If your code properly handles the _nothing_ values, this can sometimes simplify the control flow of your program by allowing all your data to go through the same pipeline.
+
+In C++ this approach can be handled using `std::optional` (in C++17 onwards); it is similar to the concept of the _Maybe monad_ in Haskell and a number of other similar structures in various languages.
+
+Like `vector`, `std::optional` uses angle brackets to declare the type of value that it can hold. `std::optional<int>`, for example, can either hold an `int`, or the special value `std::nullopt`.
+
+The following code uses `std::optional` to get the first element (or "head") of a `vector`.
+```cpp
+#include <iostream>
+#include <vector>
+#include <optional>
+
+using std::vector;
+using std::optional;
+using std::nullopt;
+
+optional<int> head(const vector<int> &v)
+{
+    return v.size() > 0 ?
optional<int>(v[0]) : nullopt;
+}
+
+
+// This function defines the << operator for streaming an std::optional
+// to an output stream such as std::cout.
+std::ostream& operator<<(std::ostream &os, optional<int> x)
+{
+    if(x)  // this is defined as: "if x has a value"
+    {
+        os << x.value();
+    }
+    else  // otherwise it must be nullopt
+    {
+        os << "nothing";
+    }
+    return os;
+}
+
+int main()
+{
+    vector<int> v1{5,9,4,3};
+    optional<int> x = head(v1);
+
+    vector<int> v2;
+    optional<int> y = head(v2);
+
+    std::cout << x << ", " << y << std::endl;
+
+    return 0;
+}
+```
+The output for this program will be
+```
+5, nothing
+```
+
+## Using types to forbid errors
+
+It is sometimes possible to exclude the possibility of certain kinds of errors by making use of the type system.
+
+Take for example the factorial function, which is only defined for $n \geq 0$. One approach is to accept general integers and check for errors:
+```cpp
+int factorial(int n)
+{
+    if(n < 0)
+    {
+        throw std::logic_error("Factorial is undefined for n < 0");
+    }
+    ...
+}
+```
+but we can eliminate this possibility entirely by using an unsigned integer type such as `uint`, which can never be negative:
+```cpp
+uint factorial(uint n)
+{
+    ...
+}
+```
+Sometimes it is necessary (or, at least, more convenient) for us to have a function like `factorial` which takes an `int` rather than a `uint` and checks for errors. We may need to pass in a variable that we can't guarantee is non-negative, or the variable needs to hold different values at different times including negative values. However, where possible and reasonable, it's a good idea to use restrictive types to enforce properties that you want.
+
+- `uint` and `size_t` are unsigned types that can't be less than 0, so can be useful for mathematical functions which take non-negative integers.
+- `std::array` variables have their size determined at compile time, so there is no need to check their length as you do with `std::vector`.
+
+Defining our own classes can also help us to avoid errors. If you write your class' internal logic so that certain properties are always true, then you don't have to check for them in the functions that you pass those objects to. Remember to test those classes thoroughly though!
\ No newline at end of file
diff --git a/03cpp2/sec03CMakeBasics.md b/03cpp2/sec03CMakeBasics.md
new file mode 100644
index 000000000..e69de29bb
diff --git a/03cpp2/sec04UnitTesting.md b/03cpp2/sec04UnitTesting.md
new file mode 100644
index 000000000..1c63bde32
--- /dev/null
+++ b/03cpp2/sec04UnitTesting.md
@@ -0,0 +1,152 @@
+---
+title: Testing Software
+---
+
+# Testing Software
+
+Testing is a crucial part of software projects, and is especially critical in projects like scientific software where it is necessary to be confident about the accuracy and correctness of our results.
+
+Testing is usually a layered process, with:
+- **unit tests** focussing on checking small, independent pieces of code,
+- **integration tests** checking larger processes that involve multiple pieces of unit tested code working together, and
+- **system tests** checking that the complete system works.
+
+## Testing Frameworks
+
+Testing frameworks exist for all major languages, and you may have come across some of them before. In this course we will be using the [Catch2](https://github.com/catchorg/Catch2) library for writing tests.
If, in the future, you decide (or need) to use a different framework such as [Google Tests](https://github.com/google/googletest), then the process will be the same but with some changes in syntax. The same approach can also be taken to other languages like Python using frameworks like [pytest](https://docs.pytest.org/en/7.4.x/), which some of you may have used before.
+
+## Installing Catch2
+
+**Before this week's class you need to install the Catch2 testing library.**
+
+You can clone the [Catch2 repository here](https://github.com/catchorg/Catch2). To install you should complete the following steps:
+
+1. Clone the repository.
+2. Move into the Catch2 folder in your terminal.
+3. `cmake -B build -DBUILD_TESTING=OFF`
+4. `cmake --build build`
+5. If you have permissions and want to install system wide you can run `cmake --install build/`. Otherwise run `cmake --install build/ --prefix install_path`, where you replace `install_path` with a path to a folder of your choice.
+
+To make Catch2 available in CMake, in your top level CMakeLists.txt include the following line:
+
+`find_package(Catch2 3 REQUIRED)`
+
+or if you have not installed system wide, use the following line:
+
+`find_package(Catch2 3 REQUIRED PATHS install_path)`
+
+replacing `install_path` with the path to your chosen install folder. This allows CMake to find your Catch2 installation, including the header files and the compiled library.
+
+When you create your executable for your test files, you'll need to link with the Catch2 library (along with any other libraries you need):
+```cmake
+add_executable(test_executable)
+target_sources(test_executable PRIVATE test_source.cpp)
+target_link_libraries(test_executable PUBLIC Catch2::Catch2WithMain)
+```
+
+You can find more information about using Catch2 in the documentation on their [github page](https://github.com/catchorg/Catch2).
+
+## Unit Testing Principles
+
+Unit tests check the correctness of the smallest pieces of code, for example an individual function or class with no further dependencies. Such a function may have multiple _test cases_, which check for different aspects of the behaviour, or which check the behaviour under different circumstances/inputs.
+
+When writing unit tests for a function you will want to consider:
+- What are some inputs for which you _know_ the expected output? It is important not to write circular tests which end up comparing the result of the code to itself!
+- You should always test the output on multiple inputs.
+    - Make sure that you don't only check trivial cases. Often special cases can be easier to calculate the expected values for, but don't require all the code to be correct to get the answer you expect. For example, setting values in your input vector to 0, or passing an empty vector, might get you the right result even if your code processing that vector is wrong. Ask yourself if the outputs that you're checking are dependent on all the different parts of the code you are testing being correct. If not, then add more test cases that do depend on those bits of code!
+- Break your set of possible inputs into different cases, and pay particular attention to edge cases.
+    - You may want to check for example how your function behaves for positive numbers, negative numbers, and 0. For some functions you may want to check how it handles very large or small inputs (if there is a risk of overflow/underflow).
+    - Edge cases are the most common places for errors to appear.
Consider a function that takes a vector of data and updates it so that each value is replaced by the average of the two values on either side of it. What should happen to the values at either end of the vector? Does this happen correctly?
+- Checking for failure is as important as checking for success! Many functions that we write are not valid for all inputs, and we should check that they behave properly when given invalid inputs.
+    - Check that functions which should throw exceptions under certain circumstances do indeed throw those exceptions.
+    - Common failure cases for various kinds of functions include providing negative or 0-value inputs, indices which are out of bounds, or empty vectors/strings.
+    - If you have designed a class to have certain properties, check that you cannot construct an invalid class, and that its properties are maintained correctly.
+- Ask yourself if another function, which does something different to what you intended, could pass your tests. If so, consider making your tests more robust!
+
+Writing comprehensive tests takes some practice, and is highly dependent on the kind of code that you have written. **Always look at your code critically, and consider how it behaves in different circumstances.**
+
+## Integration and System Tests
+
+Once you have thoroughly tested the smallest units of your code, you should also write tests which test more complex components. These may be functions which call multiple sub-functions for example. You should increase the size of the components you are testing piece by piece until you reach full system tests. System tests check the behaviour from input to output, and often include mocking up things like user input.
+
+## Testing Syntax
+
+Most of the important syntax that you will need to know about is contained within two header files:
+```cpp
+#include <catch2/catch_test_macros.hpp>
+#include <catch2/matchers/catch_matchers_floating_point.hpp>
+```
+- `catch_test_macros.hpp` contains the macros that we need in order to declare test cases and do standard checks.
+- `catch_matchers_floating_point.hpp` contains some extra functionality for testing floating point values. This is different from testing other values because **floating point arithmetic is almost never completely accurate, so instead we have to test whether results are within some relative or absolute tolerance.**
+    - There is additional special functionality for other things like strings, containers (such as vector) and more. `<catch2/catch_all.hpp>` can be useful if you want to have access to all of these.
+    - I recommend writing the line `using namespace Catch::Matchers` if you are using any of this functionality to avoid namespace clutter in your test files. (Otherwise you need to write things like `Catch::Matchers::WithinRel` every time you check a float, which gets very cumbersome and makes things more difficult to read.)
+
+Here's an example using our `vector_functions` code from last week's class, with some additional example tests to check floating point and vector examples.
+
+```cpp
+#include <catch2/catch_test_macros.hpp>
+#include <catch2/matchers/catch_matchers_all.hpp>  // provides WithinRel, AllTrue, and the other matchers
+#include "vector_functions.h"
+#include <algorithm>
+
+using namespace Catch::Matchers;
+
+TEST_CASE( "Counting with loop is correct", "[ex1]" ) {
+    std::vector<int> example = {5};
+    REQUIRE( countMultiplesOfFive(example) == 1);
+}
+
+TEST_CASE( "Adding elements to a vector", "[ex2]" )
+{
+    std::vector<int> starting_vector;
+    addElements(starting_vector, 5, 4);
+    REQUIRE( starting_vector.size() == 4);
+    for(auto &&i : starting_vector)
+    {
+        REQUIRE(i == 5);
+    }
+}
+
+TEST_CASE("Testing floats", "[matchers]")
+{
+    double x = 1.0/2.0;
+    REQUIRE_THAT(x, WithinRel(0.5, 0.0001));
+}
+
+TEST_CASE("Testing vectors", "[matchers]")
+{
+    std::vector<int> v{1, 2, 3, 4, 5};
+    std::vector<int> v_check(v.size());
+    std::transform(v.begin(), v.end(), v_check.begin(), [](int z){return z > 0;});
+    REQUIRE_THAT(v_check, AllTrue());
+}
+```
+- `TEST_CASE` declares a new test case within this test file. Each test case has a description and optional tags. The tags can be used to run subsets of tests. Tests can have more than one tag, and the same tag can be used for multiple tests.
+- `REQUIRE` checks that the statement within the brackets is true.
+- `REQUIRE_THAT` takes a value first, and then a matcher expression. Matcher expressions do something more complex than just a simple equality test.
+    - `WithinRel` is a matcher expression which takes two parameters: the expected value and the relative tolerance.
+    - `AllTrue` is a matcher expression that takes no arguments, and is used to check that every element in an iterable container is true.
+
+### REQUIRE and CHECK
+
+When using the `REQUIRE` macro, a `TEST_CASE` will terminate if it fails, and execution will move on to the next `TEST_CASE`. When using the `CHECK` macro, the `TEST_CASE` will continue on whether it passes or fails.
+
+- If your `TEST_CASE` does independent tests which don't require the previous checks to have passed in order to continue, then use `CHECK`.
+- If you need a particular condition to pass before continuing -- a good example is checking an object being correctly set up -- then use `REQUIRE`.
+- Similarly we have both `REQUIRE_THAT` and `CHECK_THAT` for matchers.
+
+### Testing Floating Point
+
+You can use `WithinAbs(value, tolerance)` and `WithinRel(value, tolerance)` to check that floating point numbers are within some reasonable tolerance of what you expect. When considering what tolerance to use, you should think about:
+- What is the precision of your floating point numbers? (`float` or `double`?)
+- How much error might accumulate in your calculation?
+- How much error is acceptable to you? What precision is actually required?
+
+### Testing for Exceptions
+
+You should test that exceptions are thrown when they should be, and in some cases you should explicitly check that exceptions are not thrown when they shouldn't be.
+- Test any cases where exceptions should be thrown.
+    - Use `REQUIRE_THROWS_AS(expression, exception_type)` or `CHECK_THROWS_AS(expression, exception_type)` to check that the correct exception is thrown. For example, if we have a factorial function which throws a `domain_error` if the factorial is undefined (e.g. negative numbers) then you could write `CHECK_THROWS_AS(factorial(-1), std::domain_error)`.
+- If a constructor can throw an exception, you should check that it does not throw an exception for a valid instantiation using `CHECK_NOTHROW` or `REQUIRE_NOTHROW`.
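+
+To make this concrete, here is a short sketch of an exception test, assuming a hypothetical `factorial` function (declared in a header `factorial.h`) which throws `std::domain_error` for negative inputs, as described above:
+
+```cpp
+#include <catch2/catch_test_macros.hpp>
+#include <stdexcept>
+#include "factorial.h"  // hypothetical header declaring int factorial(int)
+
+TEST_CASE("Factorial input validation", "[exceptions]")
+{
+    // An invalid input should throw the documented exception type.
+    CHECK_THROWS_AS(factorial(-1), std::domain_error);
+    // A valid input should not throw at all.
+    REQUIRE_NOTHROW(factorial(5));
+}
+```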
+
+**You can consult the Catch2 documentation for even more macros and ways of testing your code.**
\ No newline at end of file
diff --git a/01projects/sec02SoftwareBuilds.md b/03cpp2/sec05SoftwareBuilds.md
similarity index 100%
rename from 01projects/sec02SoftwareBuilds.md
rename to 03cpp2/sec05SoftwareBuilds.md
diff --git a/01projects/sec03CMakeBackground.md b/03cpp2/sec06CMakeBackground.md
similarity index 100%
rename from 01projects/sec03CMakeBackground.md
rename to 03cpp2/sec06CMakeBackground.md
diff --git a/01projects/sec04CMakeHelloWorld.md b/03cpp2/sec07CMakeHelloWorld.md
similarity index 100%
rename from 01projects/sec04CMakeHelloWorld.md
rename to 03cpp2/sec07CMakeHelloWorld.md
diff --git a/01projects/sec05BuildHelloWorld.md b/03cpp2/sec08BuildHelloWorld.md
similarity index 100%
rename from 01projects/sec05BuildHelloWorld.md
rename to 03cpp2/sec08BuildHelloWorld.md
diff --git a/04cpp3/index.md b/04cpp3/index.md
index 1f2153066..753955c26 100644
--- a/04cpp3/index.md
+++ b/04cpp3/index.md
@@ -4,19 +4,15 @@ title: "Week 4: Modern C++ (3)"

### This Week

-This week we will continue to explore object oriented programming, delving into design considerations when writing classes for our programs. We'll also look an another method for writing flexible and re-usable code in C++ called _generic programming_ with templates.
+This week we will look at, compare, and contrast two approaches to polymorphism in C++. We'll look at **runtime polymorphism**, also known as **dynamic polymorphism**, through the use of _inheritance_ to define sub-types with special behaviours. We'll also discuss **compile-time polymorphism**, also known as **static polymorphism**, through the use of _templates_ (and sometimes therefore referred to as _template meta-programming_) to define generic code which the compiler can use to automatically generate specialised code for different types.

-* [Designing Classes](./sec01DesigningClasses.html)
-    - Properties of high quality code
-    - Guiding principles
-    - Applications of abstract classes
-* [Templates](./sec02Templates.html)
-    - Class templates
+* [Inheritance](./sec01Inheritance.html)
+    - Creating sub-types using inheritance
+    - Overriding functions
+    - Runtime polymorphism with virtual functions
+    - Abstract classes for interfaces
+* [Templates](./sec03Templates.html)
    - Function templates
-    - Function overloading
+    - Class templates
+    - Operator & function overloading
    - Compiling templated code
-* [C++ Code Design](./sec03CppCodeDesign.html)
-    - General C++ Principles
-    - Run-time and Compile-time Polymorphism
-    - Composition and Inheritance
-    - Useful Resources
diff --git a/03cpp2/sec02Inheritance.md b/04cpp3/sec01Inheritance.md
similarity index 74%
rename from 03cpp2/sec02Inheritance.md
rename to 04cpp3/sec01Inheritance.md
index f15a3d97d..398ae75cc 100644
--- a/03cpp2/sec02Inheritance.md
+++ b/04cpp3/sec01Inheritance.md
@@ -1,19 +1,18 @@
---
-title: Inheritance and Polymorphism
+title: Inheritance
---

-Estimated Reading Time: 50 minutes
+# Creating Sub-types with Inheritance

-# Inheritance and Polymorphism
+Inheritance is one of the most important concepts in object oriented design, which brings a great deal of flexibility to us as programmers. A class defines a type of object, and a class which inherits from it defines a sub-type of that type. For example, we might have a class which represents shapes, and sub-classes which represent squares, circles, and triangles.
Each of these is a shape, and so should be able to be used in any context that simply requires a shape, but each will have slightly different data needed to define it and different implementations of functions to calculate its perimeter or area.

-Inheritance is one of the most important concepts in C++ classes, which brings a great deal of flexibility to us as programmers. A class defines a type of object, and a class which inherits from it defines a sub-type of that type. For example, we might have a class which represents shapes, and sub-classes which represent squares, circles, and triangles. Each of these are shapes, and so should be able to be used in any context simply requires a shape, but each will have slightly different data needed to define it and different implementations of functions to calculate its perimeter or area.

-If we have a class to represent shapes, then any function which takes an object of our shape class should be able to take a circle, a square, or a triangle. This ability to use different types in the same context is called **polymorphism** and is a key concept in many programming paradigms. In C++ we will achieve it using inheritance.
+If we have a class to represent shapes, then any function which takes an object of our shape class should be able to take a circle, a square, or a triangle. This ability to use different types in the same context is called **polymorphism** and is a key concept in many programming paradigms. In C++ one of the key ways we will achieve it is by using inheritance.

## When Should Inheritance Be Used?

-- Inheritance should be used only when you want to declare that one class is a sub-type of another class. Essentially `B` may inherit from `A` only if `B` is a kind of `A`.
+- Inheritance should be used only when you want to declare that one class is a sub-type of another class. Essentially **`B` may inherit from `A` only if `B` _is a kind of_ `A`.**
- A common example is that the classes `Circle` and `Square` may both derive from the class `Shape`. But neither `Circle` nor `Square` should inherit from one another!
+    - Consider for example a class `Country`, which may have both an area and a perimeter. Although it shares some properties with `Shape`, it should almost certainly **not** inherit from `Shape`, because a `Country` is not a kind of `Shape`, and we wouldn't expect a `Country` to be substitutable everywhere that a `Shape` is. This is an example of using the type system to our advantage: we shouldn't allow a `Country` to be passed into a `Shape` function, because we know it is the wrong kind of object even if it shares some (or even all) properties. We are using the type system to impart information that we understand about the objects we are creating and modelling, and discriminate between representations of different kinds of thing.
- The **Liskov Substitution Principle** is one good guiding principle.
    - If `B` is a sub-type of `A`, then replacing an object of type `A` with an object of type `B` should not break your program.
    - In this case a `B` object can be considered a kind of `A` object, but not the other way around.
@@ -28,7 +27,7 @@
- Don't use inheritance if you want a class to _have_ an instance of another class as a component.
    - It should be achieved by having a member variable of that type, or a pointer to an object of that type.
- For example, squares _have_ edges, so a `Square` class could have _members_ which are of an `Edge` type class. But `Edges` aren't squares, so `Edge` shouldn't derive from `Square` (or vice versa).
-    - This is called *composition* when the lifetime of the component is controlled by the class, and *aggregation* when the the component has an independent lifetime.
+    - This is called *composition* when the lifetime of the component is controlled by the class, and *aggregation* when the component has an independent lifetime.
    - A class representing a room has walls, which don't exist independently of the room and so can be represented using composition. The walls could be represented using member variables of type Wall, or pointers to Walls, possibly in a container.
    - A room can also have a table, which could be moved to another room or thrown away, and hence exists independently of the room and can be represented using aggregation. There should be a pointer to an object of type Table, and some means to check that the Table is still in scope.
- Inheritance is only for when you want a class to _be_ a kind of another class.
@@ -73,15 +72,15 @@ You can observe the creation and destruction of objects of base and derived classes

## Overriding Inherited Functions

-Unlike the constructor and destructor, most functions can be completely overridden by the base class. Calling the function in the derived class will not make any calls to the same function in the base class - the functionality is completely replaced. This is straight-forward to do: if we implement a function with the same name and signature as the base class (same type, name, number of arguments, and types of arguments) then this function will "override" the definition that would be inherited from the base class.
+Unlike the constructor and destructor, most functions can be completely overridden by the derived class. Calling the function in the derived class will not make any calls to the same function in the base class - the functionality is completely replaced. This is straightforward to do: if we implement a function with the same name and signature as the base class (same type, name, number of arguments, and types of arguments) then this function will "override" the definition that would be inherited from the base class.

Function overriding is fundamental to this polymorphic style of programming because this is what allows each sub-class to behave uniquely when placed in the same context.

## Polymorphism

-Polymorphism is the ability to use multiple types in the same context in our program; in order to achieve this we must only access the common properties of those types through some shared interface. The most common way to do this is to define a base class which defines the necessary common properties, and then have sub-classes which inherit from the base class which represent different kinds of objects which can implement this interface. This is caled *sub-type polymorphism*, and is one of the most common forms of polymorphism.
+Polymorphism is the ability to use multiple types in the same context in our program; in order to achieve this we must only access the common properties of those types through some shared interface. The most common way to do this is to define a base class which defines the necessary common properties, and then have sub-classes which inherit from the base class which represent different kinds of objects which can implement this interface.
-By exploring polymorphism we can also understand the behaviour, and some of the limitations, of the straightforward model of inheritence that we have used so far.
+By exploring polymorphism we can also understand the behaviour, and some of the limitations, of the straightforward model of inheritance that we have used so far.

Let's assume that we have some class `Shape`, and derived classes `Circle` and `Square`.

@@ -92,7 +91,7 @@ class Shape
    Shape(){}

    public:
-    Shape(double in_perimeter, double in_area)
+    Shape(double P, double A)
    {
        perimeter = P;
        area = A;
@@ -121,7 +120,7 @@ class Shape
class Circle : public Shape
{
    public:
-    Circle(double in_radius) : radius(r)
+    Circle(double r) : radius(r)
    {
        perimeter = 2 * M_PI * radius;
        area = M_PI * radius * radius;
@@ -133,13 +132,13 @@ class Circle : public Shape
    }

    protected:
-    double m_radius;
+    double radius;
};

class Square : public Shape
{
    public:
-    Square(double in_width) : width(w)
+    Square(double w) : width(w)
    {
        perimeter = 4 * width;
        area = width * width;
@@ -166,12 +165,9 @@ class Square : public Shape

Now let's say that we want to write a function which takes a shape and prints its area.

```cpp
-void GetShapeAreas(vector<Shape> shapes)
+void PrintShapeArea(Shape shape)
{
-    for(auto &shape : shapes)
-    {
-        cout << shape.getArea() << endl;
-    }
+    cout << shape.getArea() << endl;
}

int main()
@@ -179,15 +175,12 @@
    Circle C = Circle(5.9);
    Square S = Square(3.1);

-    vector<Shape> shapes;
-    shapes.push_back(C);
-    shapes.push_back(S);
-
-    GetShapeAreas(shapes);
+    PrintShapeArea(C);
+    PrintShapeArea(S);
}
```

-- When a `Circle` or `Square` is placed into the `vector` container, it is cast to a `Shape` (the base class).
+- When a `Circle` or `Square` is passed into `PrintShapeArea`, it is cast to a `Shape` type (the base class).
- It will lose any additional information or methods associated with the derived class.
- The `Circle` and the `Square` both have access to the `perimeter` and `area` member variables, as well as their respective "getters".
- The correct area will be reported because the `area` member variable is set in the constructor, and the derived constructor has been called when the object was instantiated.

@@ -195,12 +188,9 @@ int main()

Whenever we use a derived class in place of a base class, we implicitly cast to the base type and therefore can lose important information and behaviour defined in the derived class. In this example, we have separate `printInfo` functions for each of our classes. We run into a problem if we want to print this information for `Circle` and `Square` objects through a function which handles them as `Shape` objects.

```cpp
-void GetShapeInfo(vector<Shape> shapes)
+void GetShapeInfo(Shape shape)
{
-    for(auto &shape : shapes)
-    {
-        shape.printInfo();
-    }
+    shape.printInfo();
}

int main()
@@ -208,16 +198,13 @@
    Circle C = Circle(5.9);
    Square S = Square(3.1);

-    vector<Shape> shapes;
-    shapes.push_back(C);
-    shapes.push_back(S);
-
-    std::cout << std::endl;
-
    C.printInfo();
    S.printInfo();

-    GetShapeInfo(shapes);
+    std::cout << std::endl;
+
+    GetShapeInfo(C);
+    GetShapeInfo(S);
}
```

@@ -234,7 +221,7 @@ Shape; Area = 9.61 m^2, Perimeter = 12.4m.

- When we call `printInfo()` from the derived class objects directly, we get their detailed information including the type of shape and the radius or width.
- When we do the same on our objects through the `GetShapeInfo` function, we only have access to the base class, and therefore we call the base class version of this method.

-In this case we have lost our specialised functionality for our derived classes when placed in a polymorphic context! In order for polymorphism to be really useful in C++, we need a way to retain the overriddent functions for the derived classes, even when we are treating them in the more generalised context of a function or container which takes their base class.
+In this case we have lost our specialised functionality for our derived classes when placed in a polymorphic context! In order for polymorphism to be really useful in C++, we need a way to retain the overridden functions for the derived classes, even when we are treating them in the more generalised context of a function or container which takes their base class.

We shall see in the next section how we can make use of polymorphism whilst still accessing the functions of the derived class!

@@ -243,14 +230,15 @@ We shall see in the next section how we can make use of polymorphism whilst stil

Our current method of overriding and calling functions in the way described above is clearly insufficient in many cases where we want to use an object of a derived class in a piece of code which deals with the base class. Take for example a function that takes an argument of base type `Shape`:

- We often don't want to pass our derived class by value: this will attempt to copy the object into a new object of type `Shape`, so any overrides will be lost.
-- We can (and should) pass our argument by reference or as a pointer. However, the function itself will still be treating the object as being of type `Shape` and hence will call the `Shape` versions of any functions.
+- We should instead pass our argument by reference (or as a pointer, which we'll discuss in a later week). This will avoid copying into a fresh object and instead will just pass the address in memory where the object we want to pass is stored. However, the function itself will still be treating the object as being of type `Shape` and hence will call the `Shape` versions of any functions.

-We can solve this problem by declaring a member function `virtual` in the base class. In this case, the function is accessed in a different way to normal. Function definitions have addresses, and normally when a member function of a class is called the definition of that function for that is just looked up. So if we are using a `Shape *` pointer to an object, even if that object was created of type `Circle`, we will still look up the definition of any functions for `Shape`, since that's the class that we're using. For virtual functions however, each object will store the address of the definition of the function as part of its data (this data is called a "virtual table"). If the object is created as an instance of the base class, this will be the address of the base function, but if the object is created as an instance of a derived class, then this will be the address of the derived function. When we call the function on the object, it will execute the function at the address stored in the virtual table, which is individual to the instance of the object, rather than using an address which applies to the whole class. This means it doesn't matter if we are using a `Shape *` pointer or `Circle *`, it will still used the derived function for the class `Circle` because that was the address put into the virtual table when the object was created.
+We can solve this problem by declaring a member function `virtual` in the base class. In this case, the function is accessed in a different way to normal. Function definitions have addresses, and normally when a member function of a class is called the definition of that function for that class is just looked up. So if we are using a `Shape &` reference to an object, even if that object was created as type `Circle`, we will still look up the definition of any functions for `Shape`, since that's the class that we're using. For virtual functions however, each object will store the address of the definition of the function as part of its data (this data is called a "virtual table"). If the object is created as an instance of the base class, this will be the address of the base function, but if the object is created as an instance of a derived class, then this will be the address of the derived function. When we call the function on the object, it will execute the function at the address stored in the virtual table, which is individual to the instance of the object, rather than using an address which applies to the whole class. This means it doesn't matter if we are using a `Shape &` reference or `Circle &`, it will still use the derived function for the class `Circle` because that was the address put into the virtual table when the object was created.

This is also why **passing a reference (or pointer) is necessary for this to work**. If we pass by value we will create a _new_ object of type `Shape`, and because it is of type `Shape` the new object's virtual table will link to the `Shape` implementation. If we pass a _reference_, then the function will instead look at the memory location of the original object, and therefore look in the original object's virtual table, and thus find the implementation for the derived class.

Virtual functions open up fully polymorphic behaviour for our classes, and are important whenever an object of a derived class might be treated as an instance of a base class, including:

- Passing objects of derived class to functions which take objects of base class (by reference or pointer).
-- Placing a pointer to an object of derived class in a container (such as `vector`) of pointers to the base class.
+- Defining a container of objects which can be of different derived classes by declaring a container using the base class.
+  - We will return to this technique later when we discuss pointers; you cannot have a container, such as `vector`, of references. Nevertheless it is good to be aware of this use case now as it is a very common way for polymorphism to come in handy!

**N.B.** Special consideration should be given to _virtual destructors_. **If your class is inherited from, the destructor should usually be virtual.** We can point to an object of the derived class using a pointer of type `Base *`. If we `delete` this base pointer to free the memory then _only the base class destructor will be called_, and anything that needs to be cleaned up by the derived destructor will not happen. If the destructor is virtual, then the derived destructor will be called (which also calls the base destructor), and so any necessary clean up will happen. If you use _Smart Pointers_ to initialise your object then the correct (derived) destructor should be used even if the base destructor is not virtual.
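+To preview the syntax, here is a minimal sketch (greatly simplified from the full `Shape` example above) showing how `virtual`, overriding, and pass-by-reference combine to give polymorphic behaviour:
+```cpp
+#include <iostream>
+
+class Shape
+{
+public:
+    virtual void printInfo()   // virtual: looked up via the object's virtual table
+    {
+        std::cout << "Shape" << std::endl;
+    }
+    virtual ~Shape() {}        // virtual destructor, as discussed above
+};
+
+class Circle : public Shape
+{
+public:
+    void printInfo() override  // replaces the base class implementation
+    {
+        std::cout << "Circle" << std::endl;
+    }
+};
+
+void GetShapeInfo(Shape &shape)  // by reference: no copy, no cast to a fresh Shape
+{
+    shape.printInfo();
+}
+
+int main()
+{
+    Circle C;
+    GetShapeInfo(C);  // prints "Circle", because printInfo is virtual
+}
+```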
@@ -258,8 +246,10 @@ Virtual functions open up fully polymorphic behaviour for our classes, and are i

Abstract classes are special cases of classes which have _virtual methods with no implementation_. Such functions are called **pure virtual functions**. Such classes are abstract in the sense that they cannot be instantiated: we cannot create an object which is an instance of an abstract class because it has undefined functions and therefore the object to be instantiated is not fully defined. We can only instantiate objects of _derived classes_ which have implemented _all_ missing functionality.

+- Abstract classes can be used when we want to define a **type** of object where any instance must be one of a set of **concrete sub-types**.
+  - They are often useful for modelling abstract concepts defined by some shared properties. For example, many different things are animals, but every animal alive is a specific species, i.e. sub-type, of animal. So we don't want to be able to instantiate an "animal" type object without declaring its species as well: the derived type is concrete and can exist, but the base type is abstract and merely denotes membership of a broader type class.
- An abstract class is any class which has at least one pure virtual function
-  - A function declared pure by setting it `= 0` in the definition
+  - A function is declared pure by setting it `= 0` in the definition
    - e.g. `virtual int myPureVirtualFunction(int a, int b) = 0;`
- Abstract classes allow us to model interfaces which have no default (base) implementation but which may have many possible implementations.
- Although abstract classes cannot be instantiated on their own, they still have constructors and destructors, which are called in the same way as other base classes. These can be used to set or clean up data present in the definition of the abstract class.

@@ -291,7 +281,7 @@ class Circle : public Shape

    void printInfo()
    {
-        cout << "Circle; Radius = " << m_radius << "m, Area = " << m_area << " m^2, Perimeter = " << m_perimeter << "m." << endl;
+        cout << "Circle; Radius = " << m_radius << "m, Area = " << m_area << " m^2, Perimeter = "
+             << m_perimeter << "m." << endl;
    }

    double getArea()
@@ -325,7 +316,8 @@ class Square : public Shape

    void printInfo()
    {
-        cout << "Square; Width = " << width << "m, Area = " << area << " m^2, Perimeter = " << perimeter << "m." << endl;
+        cout << "Square; Width = " << width << "m, Area = " << area << " m^2, Perimeter = "
+             << perimeter << "m." << endl;
    }

    protected:
diff --git a/04cpp3/sec03CppCodeDesign.md b/04cpp3/sec03CppCodeDesign.md
deleted file mode 100644
index c94799e0a..000000000
--- a/04cpp3/sec03CppCodeDesign.md
+++ /dev/null
@@ -1,65 +0,0 @@
----
-title: C++ Code Design Summary
----
-
-# C++ Code Design Summary
-
-This is the end of our coverage of basic C++ language features. In the coming weeks, we'll explore the use of external libraries, tools such as debuggers and profilers, and writing performant code using optimisations and parallel programming. Now is a good time to reflect on some of the features that we've learned about and how they fit together into an overall code design.
-
-## General C++ Principles
-
-- Separate function / class declarations and implementations into header (.h) and source (.cpp) files.
-- Use smart pointers for data-owning pointers: manual memory management should be minimised.
-- Check for standard implementations of functions before writing your own: things like sorting are already well covered!
-- The standard library offers performant containers such as `vector`, `array`, and `map`.
-- Make use of modern C++ features like range based loops, `auto` type inference, and anonymous functions where they make your code easier to understand or more flexible.
-- Don't import entire large namespaces like `std` as they risk name clashes.
-- Code should be modularised:
-  - Functions should achieve a single task.
-  - Classes should bundle together data and functionality necessary to represent a single concept.
-  - Use unit-testing to test individual pieces of your program independently.
-  - If you start repeating yourself in your code, try to refactor so that repeated segments are replaced with function calls.
-- Program your classes with flexibility in mind by using patterns like dependency injection.
-- Make use of features like intefaces and templates for flexible and reusable code.
-- Programming solutions are not one size fits all: think carefully about your problem, the use case that you are developing for, and how you feel you can best serve your priorities and reflect the logical structure of your model in C++.
-
-## Run-time and Compile-time Polymorphism
-
-Now that we've met both inheritance based run-time polymorphism and generic programming through templates, it's worth looking at the similarities and differences between the two.
-- Polymorphism allows for differences in behaviour to be decided at run-time.
-  - Behavioural differences are encoded inside classes which are related by inheritance.
-  - A single polymorphic function can operate on a base type and all its derived types. This is usually achieved by passing a pointer to the base type and calling virtual functions.
-- Templates (and function overloading) allow for differences in behaviour to be decided at compile-time.
-  - Behavioural differences are encoded into the external functions (or classes) which make use of the templated or overloaded type. The types which can be used aren't generally related by inheritance, but merely need to fulfil the functionality demanded in the templated code.
-  - Templates generate separate classes / functions for every different template parameter it is called with.
-
-There is a difference between a code which needs to have different behaviour with different objects not knowing ahead of time the exact type of that object (run-time polymorphism) and code which can be applied to applied to different types in different parts of the program, but does not require those types to be substitutable at run-time (compile-time polymorphism e.g. templates and function overloading). For example, you may well use the (overloaded) `+` operator to add integers together, and to concatenate strings, but you are unlikely to process data which could be _either_ an int or a string without knowing which it will be.
-
-## Composition and Inheritance
-
-We've seen this week that we can use multiple inheritance to implement multiple interfaces, which can lead to difficulties like the diamond problem, as well as making our model increasingly complex. While we can in fact inherit from an arbitrary number of base classes, it risks collisions between class namespaces and general confusion over the purpose and nature of an object. Multiple inheritance should only be used when motivated by genuine substitution (an "is-a" relationships, one class is a sub-type of the other) and a meaningful polymorphic use case. If faced with a multiple inheritance use case, consider whether it should in fact be represented as a chain of inheritance, or whether the functionality should in fact be refactored into a composition instead.
-
-Inheritance is sometimes misused by C++ programmers to share functionality between classes where composition would be clearer and more effective. Composition representing functionality is particularly powerful when combined with templates as we can still write a single piece of code which can be re-used with many types.
-
-- Classes with overlapping functionality don't necessarily need to be related by some base class.
-- Classes should only be related by inheritance **if these classes should be interchangeable at some level** (i.e. can be substituted into the same place) in your code. For example, if we need a container such as a `vector` to be able to store and iterate over a diverse set of objects which are related by a core set of properties defined in a base class.
-- Mere sharing of functionality can often be better represented by wrapping said functionality in a class and including it in your other classes by composition.
-  - For example many classes will need to store data in a container such as a `vector`, but that does not mean they should inherit from the container class! They should have an instance of that container where they can store their data.
-- Multiple inheritance is generally limited to implementing two distinct, usually abstract, interfaces. An example of multiple inheritance in the C++ standard library is `iostream` (input/output stream) inherits from `istream` (input stream) and `ostream` (output stream). (See [`iostream` documentation](https://cplusplus.com/reference/istream/iostream/) and [C++ core guidelines on multiple inheritance](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rh-mi-interface).)
-- Inheritance is good at defining what a class _is_, but you can use composition for things that your class _makes use of_.
-
-## Useful References
-
-### C++ Core Guidelines
-
-There are many differing opinions about what exactly constitutes "good practice" in C++, but a good place to start looking is generally the [C++ core guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).
-
-These guidelines are co-written by the original designer of C++ and are quite extensive, but you can select individual topics to explore when you are unsure of things.
-
-### Effective Modern C++
-
-The book [Effective Modern C++](https://www.oreilly.com/library/view/effective-modern-c/9781491908419/) is a good introduction to C++ up to the C++14 standard, and may be of help if you want to spend more time working on your C++ fundamentals. (Almost all of the features that we have covered in this course except for `std::function` are present in C++14.)
-
-### Design Patterns
-
-The book [Design Patterns](https://www.oreilly.com/library/view/design-patterns-elements/0201633612/) provides many examples of frequently occuring design solutions in object oriented programming that we have not covered in these notes. If you're comfortable with the ideas we've covered in C++ and want to improve your object-oriented software engineering skills, this book may be helpful.
\ No newline at end of file
diff --git a/04cpp3/sec02Templates.md b/04cpp3/sec03Templates.md
similarity index 88%
rename from 04cpp3/sec02Templates.md
rename to 04cpp3/sec03Templates.md
index 0e1ef9d7c..5aea6546b 100644
--- a/04cpp3/sec02Templates.md
+++ b/04cpp3/sec03Templates.md
@@ -13,7 +13,7 @@ Templates in C++ come in two main kinds:

- Function Templates
- Class Templates

-When a class or function template is used to instantiate an concrete class or function using a specific type, a new class or function definition is created for each type with which the template is instantiated. So unlike our inheritance based run-time polymorphism, where we have one function which can take multiple classes with overlapping definitions (defined by inheritance from a base class), now we use one piece of code two generate multiple separate functions (or classes), each of which accepts a different type. This is why the compiler must know all the types which are to be used in templated functions at compile time. This is sometimes known as "static polymorphism".
+When a class or function template is used to instantiate a concrete class or function using a specific type, a new class or function definition is created for each type with which the template is instantiated. So unlike our inheritance based run-time polymorphism, where we have one function which can take multiple classes with overlapping definitions (defined by inheritance from a base class), now we use one piece of code to generate multiple separate functions (or classes), each of which accepts a different type. This is why the compiler must know all the types which are to be used in templated functions at compile time. This is sometimes known as "static polymorphism".

## Using Templates with Classes

@@ -41,11 +41,20 @@ class myClassTemplate
};
```
- `T` is the template parameter, and the `typename` keyword tells us that `T` must denote a type. (You can equivalently use the `class` keyword.)
-  - Do note that you don't need to call your template parameter `T`; like function parameters or other variables, it can have any name. It's good to give it a more meaningful name if the type should represent something in particular, for example `matrixType` could be the name if your templated code deals with arbitrary types represnting matrices. This is especially useful when using templates with multiple template parameters!
+  - Do note that you don't need to call your template parameter `T`; like function parameters or other variables, it can have any name. It's good to give it a more meaningful name if the type should represent something in particular, for example `matrixType` could be the name if your templated code deals with arbitrary types representing matrices. This is especially useful when using templates with multiple template parameters!
- We can then use `T` like any other type inside the body of the class definition.
-- Additional template parameters can appear in the angle brackets in a comma separated list e.g. `template <typename T1, typename T2>`.
-- Template parameters do not have to be `typename`. You can also have a template parameter that is an `int`, or a `bool`, or any other type. These can be used to define special versions of classes with separate implementations when provided with particular values. For example we might have `template <typename T, int maxSize>` to define different classes depending on
-  the maximum size of data it will accept. This kind of template parameter is less common.
+- Additional template parameters can appear in the angle brackets in a comma separated list e.g. `template <typename T1, typename T2>`. This is how e.g. `std::map` works.
+
+**Template parameters do not have to be `typename`, i.e. we are not limited to simply templating types.** You can also have template parameters that are values such as an `int`, or a `bool`, or any other type. These can be used to define special versions of classes with separate implementations when provided with particular values. For example we might have `template <typename T, int maxSize>` to define different classes depending on the maximum size of data it will accept.
+- `std::array` is a good example of a class template which takes both a type and a value: the type of the elements and the number of elements in the array.
+- Having values as template parameters means that they must be constants known at compile time.
+- Using template parameters which are values can allow you to leverage the type system to enforce correctness on your program. For example, if your program models objects in 3D space, then you will need a representation of a 3-vector. If you use `std::vector` then these vectors could be any size, so you have to make sure manually that no vectors of other sizes can sneak into your program. If you use `std::array<double, 3>` to represent a 3-vector then the compiler will enforce that all positions, velocities, and so on are 3 dimensional. (If you work in general relativity, then this can also help you define different types for 3-vectors (`std::array<double, 3>`) and 4-vectors (`std::array<double, 4>`)!)
+
+**N.B.** Templates which have many parameters (types or values) can make type names quite long, so if there is something that you want to use frequently you may consider giving it an alias using the `using` syntax:
+```cpp
+using Vec3 = std::array<double, 3>;
+```
+This can also make your type names more meaningful to people reading your code.

## Template Classes and Inheritance

@@ -222,7 +231,7 @@ We can use `getTheBiggerOne` with our `Country` class just as well as our `Shape

- Templates provide static polymorphism. I can define one function template that generates separate functions for each class. If I want to use my function with both `Shape` and `Country`, the compiler needs to know this at compile time.
- I can't declare a single function or class (such as a container), which can take both `Shape` and `Country`. For example, I can't put a `Shape` object in the same vector as a `Country` object, since it either needs to be a `vector<Shape>` or `vector<Country>`.
- If I use the function with `Shape` and with `Country` in the same program, I will actually generate two functions: `Shape& getTheBiggerOne(Shape&, Shape&)` and `Country& getTheBiggerOne(Country&, Country&)`. These functions are separate because they have different signatures (parameter and return types).
-- These two can be combined. For example, `getTheBiggerOne` is a template which could be instantiated with the type `Shape`. The resulting fucntion, which takes and returns references to `Shape`, could be used with objects of type `Shape`, `Circle` or `Square` (run time polymorphism based on their inheritance tree) but not `Country` (this is not part of the same inheritance tree).
+- These two can be combined. For example, `getTheBiggerOne` is a template which could be instantiated with the type `Shape`. The resulting function, which takes and returns references to `Shape`, could be used with objects of type `Shape`, `Circle` or `Square` (run time polymorphism based on their inheritance tree) but not `Country` (this is not part of the same inheritance tree).
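+These notes use `getTheBiggerOne` without repeating its definition here; as a rough sketch (assuming, as in the examples above, that "bigger" is judged by a `getArea()` member function), such a function template might look like:
+```cpp
+template <typename T>
+T& getTheBiggerOne(T& a, T& b)
+{
+    // One definition generates a separate concrete function
+    // for each type T it is instantiated with.
+    return (a.getArea() > b.getArea()) ? a : b;
+}
+```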
## Organising and Compiling Code with Templates

@@ -327,7 +336,7 @@ undefined reference to `int utilFunctions::add(int, int)'

- The compiler has been unable to implement a definition of the `add` function for the type `int`, so this definition does not exist for us to use.
- This error shows up during linking. You can compile both object files like before, because both match the template declaration and therefore are valid, but neither one can define the specific implementation that we want, so when linking it finds that the function isn't defined anywhere.
-- `implemenation.cpp` cannot define the implementation when compiled down to an object because it has the function template but not the intended type, so it can't come up with any concrete implementation.
+- `implementation.cpp` cannot define the implementation when compiled down to an object because it has the function template but not the intended type, so it can't come up with any concrete implementation.
- `usage.cpp` cannot define the implementation when compiled down to an object because it knows what type it should be used for, but it doesn't have the templated implementation (this is in `implementation.cpp`, and we have only included `declaration.hpp`).

There are two possible ways to approach this problem.

@@ -355,7 +364,7 @@ namespace utilFunctions
```

2. We can keep our header file with just the declaration, and tell the compiler which types to implement the function for in the source file (`implementation.cpp`).
-  - In this case, `usage.cpp` will only be able to use `add` for the types which are explicitly instantiated in `implemenation.cpp`.
+  - In this case, `usage.cpp` will only be able to use `add` for the types which are explicitly instantiated in `implementation.cpp`.
  - This is less flexible as you need to anticipate any combination of template arguments that the function will be used with, but keeps the declaration and the implementation separate.
  - Separate function implementations will be created for each set of types given, even if they are never used.
  - It can also be useful if you want the function to restrict usage to a sub-set of possible types.
diff --git a/05libraries/ProgrammingParadigms.md b/05libraries/ProgrammingParadigms.md
new file mode 100644
index 000000000..e5d5a5a1e
--- /dev/null
+++ b/05libraries/ProgrammingParadigms.md
@@ -0,0 +1,343 @@
+---
+title: Programming Paradigms
+---
+
+Estimated Reading Time: 40 Minutes
+
+# Programming Paradigms in C++
+
+C++ is an uncommonly flexible programming language, supporting a range of different approaches to programming. On the one hand, this makes it difficult to get a feel for what "typical" C++ code looks like, but on the other it offers flexibility in dealing with problems. As C++ has evolved, it has incorporated features that appeared in a variety of other languages, particularly with regard to memory safety and functional features.
+
+In this section we'll talk about some programming paradigms, and the extent to which we can utilise them in C++.
+
+## Imperative Programming
+
+C++ is, first and foremost, an imperative programming language. Imperative programs are composed of _statements_, which when executed may change the program _state_. For example, in the following program:
+```cpp
+int main()
+{
+    int x = 5;
+    int y = 3;
+    x = x + y;
+
+    return 0;
+}
+```
+- The **state** is the value of the variables `x` and `y`. Initially these are `5` and `3`. Let's write this state as $[x = 5 ,\, y = 3]$.
+  - In principle we have access to more state than this, since we could access e.g. the system clock and so on, but we can ignore that because it's not used in this example. State can get very complicated to model when we start including external dependencies like clocks and I/O!
+- The **statement** `x = x + y` updates the state $[x = 5 ,\, y = 3 ] \rightarrow [x = 8 ,\, y = 3]$. It does this by first _reading_ the values of `x` and `y`, then _adding_ them, and then _writing_ the new value of `x` to memory.
+- Note that `x = x + y` is an _assignment_, not an _equation_. There is an implied time ordering: the left hand side is assigned the _result_ of the execution of the right hand side, and thus the values of `x` on the left hand side (8) and right hand side (5) are different.
+
+By default in C++ variables are _mutable_, which means that their values can be changed. Imperative programming is usually centred around updating the values of mutable variables, i.e. updating the state.
+
+In imperative programming, statements may have _side effects_. For example if we call a function:
+```cpp
+int y = f(x);
+```
+- We can see from the types here that `f(x)` must return an integer.
+- In addition to this however, it could modify the values of any part of the state to which it has access, including `x` (if it is passed by reference) or any global variables which are not protected.
+
+As C++ programmers we commonly use library code written in C, and it is common for C functions to produce all their meaningful outputs as side effects, only returning an `int` value indicating either success or error. For example a typical signature might be:
+```cpp
+int MatrixInverse(double *Mat)
+```
+- This kind of function will usually return `0` for success, and other numbers for different kinds of errors.
+- The matrix `Mat` (in this case an array of `double` referenced by a pointer) will be updated in place, rather than being a return value.
+
+Side effects can make it difficult to reason about programs, because we must also model the state of the program. In general we can't know if a function will change part of the state without looking into its code, which isn't always available to us.
+
+### Procedural Programming
+
+Procedural programming is a style of imperative programming where programs are broken up into procedure (i.e. function) calls. This is one approach to modularising code; C has no support for classes and object oriented programming, and so most C code is written procedurally. Although in C++ we have the option to write object oriented code, it's not _necessary_ to do so, and it can be worth thinking about when to use a _free function_ (a function not bound to any class) instead.
+
+- Classes which do nothing but wrap a single function are often an unnecessary complication to the code.
+- Free functions do not require an object to be constructed to use, and so also save on pointless overhead if the object has no other purpose.
+  - `static` functions in classes, which belong to the entire class rather than an individual object, are treated essentially as free functions and can be called without any object of that class being declared.
+- If a function should exist independently of an object, then it may be best to write it as a free function.
+- Free functions can be called anywhere in the code which has access to the definition, and so can be a good way of code sharing. For example, multiple classes which are unrelated to each other can all call the same free function. (A sketch of both kinds follows below.)
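+As a small sketch of the distinction (the `normalise` functions and `DataSeries` class here are hypothetical examples, not part of the course code):
+```cpp
+#include <cmath>
+
+// A free function: not attached to any class, and callable from any
+// code that can see its declaration.
+double normalise(double x, double scale)
+{
+    return x / std::sqrt(scale);
+}
+
+class DataSeries
+{
+public:
+    // A static member function: belongs to the class as a whole, so it
+    // can be called as DataSeries::normalise(x, s) without constructing
+    // any DataSeries object.
+    static double normalise(double x, double scale)
+    {
+        return x / std::sqrt(scale);
+    }
+};
+```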
+
+### Object Oriented Programming
+
+Object oriented programming is an approach to programming in which functions and data are typically encapsulated in classes, and accessed through _instances_ called objects. Objects generally have a particular purpose, or are intended to reflect a concrete concept in a model. This kind of organisation, if well implemented, can make code easier to understand and more intuitive.
+
+- Classes should represent a single, clear idea. There's a bit of a judgement call to be made here: we don't want to define classes which have lots of unrelated behaviour or unrelated data elements, but it's also possible to find yourself creating too many classes which each do so _little_ that they don't really represent a worthwhile concept on their own.
+  - Some languages like Java require all code to be part of some class, and thus Java is a hotbed of this kind of design problem. Many OOP examples that you find which originate in Java involve declaring new classes which do very little, and these are sometimes translated directly into C++ examples. Particularly when reading about OOP design patterns, consider whether there are clearer and less wasteful ways to express them.
+- Classes can be used to represent abstract data types which must fulfil specific properties (sometimes called "invariants"): for example, that a list is always sorted, a binary tree is always balanced, or that two variables are always related by some formula. Most OOP languages provide access specifiers, which can be combined with member functions, to protect data and enforce these rules.
+- Class members should normally be functions which are inextricable from the class itself, or which require privileged access to the class (access to private members).
+  - For example, if one were to write a class to represent the abstract data type `Queue`, which is a first-in first-out list, then the class should represent the data held in the queue _and_ the methods to add and remove elements from the queue. It is the responsibility of the class methods to ensure that the rules of the queue are respected: data must be removed from the queue in the same order that they are added.
+  - If a function isn't necessary for the use of some type, then it should be e.g. a free function which takes an argument of that type instead.
+  - Any member functions that you add to a class increase the amount of code which could violate the class invariants, because they have free access to the member data.
+- Inheritance is a way of expressing the type relationship that one type is a sub-type of another in OOP languages.
+- Composition and aggregation (member variables and pointers) are ways of creating complex types from more basic component types.
+- The design of classes, and the use of inheritance, composition, and aggregation, should reflect the abstract model of your type as well as you can.
+
+Take a binary tree as an example:
+- A `TreeNode` in a tree can be a `Branch` (a node which has children) or a `Leaf` (a node with no children). This can be expressed by the inheritance relations `class Branch : public TreeNode` and `class Leaf : public TreeNode`, because `Branch` and `Leaf` are both kinds of `TreeNode`.
+- A `Branch` has a value of some type, pointers to its children (`TreeNode` pointer types which could be `Branch` or `Leaf`), and usually a pointer to its parent (`TreeNode` pointer). These relationships are composition (the value) and aggregation (pointers): a `Branch` is made up of these components and can make use of them, but is not itself any of these things.
+
+![image](images/TreeInheritanceComposition.png)
+
+## Influences from Functional Programming
+
+Functional programming is an alternative approach to imperative programming. Although C++ is not a functional language in the sense that Haskell or ML are, it has taken some influence from functional programming in the last decade or so, and we can try to enforce some of the functional style in C++ by applying some conventions.
+
+Functional programming is a style in which programs are composed of functions _in the mathematical sense_. Crucially this means that:
+- A function's output depends only on its arguments, **not any external state**.
+- A function produces only a return value; there are **no side effects**.
+
+In functional programming languages, variables are typically _immutable_, meaning they cannot be changed once their value has been assigned. This is closer to how we think of variables mathematically: if we solve an equation for $x$, the value of $x$ is not changing from line to line! It's just a symbolic representation of a (possibly as yet unknown) value.
+
+> If you want to know more about functional programming, some good examples of functional languages are Haskell (purely functional), ML (functional first, but with some support for imperative programming), and F# (similar to ML but with additional support for OOP).
+
+Although C++ variables _are_ mutable and do permit assignments by default, we can make immutable variables using the `const` keyword:
+
+> ```cpp
+> const int x = 15;
+> // x = x + 1; Not allowed: this will cause a compiler error!
+> ```
+
+Using `const` is a good way to use the compiler to enforce safety properties any time that you know there is a variable that should not change its value. Leveraging the compiler to check correctness conditions in your code is a key aspect of developing good C++ code.
+
+Not only does `const` make our code safer, but it can also make it _faster_. Compilers are often able to make additional optimisations when they know that variables are constants.
+
+### Pure Functions and `const` Correctness
+
+Pure functions are functions with no side effects and no external state: they depend only on their arguments and their only output is their return value.
+
+Pure functions are useful in programming because they offer greater clarity; it is easier to reason about pure functions because they behave like mathematical functions.
+
+C++ is not really suited to pure functional programming, and we cannot stop a function from having access to global state or a member function from having access to the object's state. We can however move towards purer functions by preventing them from making changes to the state.
+
+There are a few things that we can do to make our functions purer:
+- Pass variables by `const` reference, or by value. This way the original variables can't be modified by the function.
+- If you want a member function in an object to be pure, then you should declare it `const`. A `const` member function cannot change the state of the object, i.e. all the member variables are treated as constants within the body of that function.
+
+You can declare a `const` member function by placing the keyword `const` after the function signature, as in the `Area` function in this `Circle` class:
+
+> ```cpp
+> class Circle
+> {
+>     public:
+>     Circle(double r) : radius(r) {}
+>
+>     double Area() const
+>     {
+>         return M_PI * radius * radius;
+>     }
+>
+>     private:
+>     double radius;
+>     std::array<double, 2> centre;
+> };
+> ```
+
+A pure function can be declared as a `constexpr`. `constexpr` stands for "constant expression", and it is an expression which can (in principle) be evaluated at compile-time. The simplest usages for this are to initialise constant variables with simple expressions, such as:
+
+> ```cpp
+> double y = 1.0/7.0;
+> constexpr double x = 1.0/7.0;
+> ```
+
+- `y` is calculated at runtime.
+- `x` is calculated at compile-time and the resulting floating point number is inserted directly into the machine code. We don't need to do the division every time we run the program!
+- Your compiler may be able to optimise the calculation of `y` for you, but there are cases that are less clear cut than this which your compiler might miss.
+
+A pure function can be a `constexpr` because it has no side effects and depends only on its inputs. It can be evaluated at compile time _if its arguments are known at compile time_.
+Consider the following code snippet:
+
+> ```cpp
+> int f(const int x)
+> {
+>     return 4*x*x - 3*x + 2;
+> }
+>
+> constexpr int g(const int x)
+> {
+>     return 4*x*x - 3*x + 2;
+> }
+>
+> int main()
+> {
+>     const int x = 6;
+>     int y = 5;
+>
+>     int a = f(x);  // Calculated at runtime
+>     int b = f(y);  // Calculated at runtime
+>     int c = g(x);  // Calculated at compile time
+>     int d = g(y);  // Calculated at runtime
+>     int e = g(5);  // Calculated at compile time
+> }
+> ```
+
+- The function `f` is not declared a `constexpr` even though it is pure. This will generally need to be executed at runtime for both const and non-const arguments (unless your compiler is able to optimise it).
+- The function `g` is a `constexpr` and so can be calculated at compile time when its arguments are compile-time constants, i.e. for the cases `g(x)` and `g(5)` (since literals like `5` are compile time constants).
+  - These lines are therefore equivalent to writing `int c = 128;` and `int e = 87;` respectively.
+
+Using a `constexpr` can have a number of advantages:
+- Code like `int c = g(x)` is more expressive than `int c = 128` because it makes the relationships between variables explicit. We know that if `x` changes then `c` should change as well.
+- Now if we change `g` or `x`, `c` will be updated accordingly without us having to recalculate it and insert it into the code ourselves.
+- We still save on runtime costs for anything which can be known at compile time.
+
+**N.B.** For simple examples like these your compiler will likely be able to optimise away these calculations at compile time if optimisations are turned on, because they are simple enough to be able to see that these variables don't change and that these functions are pure. However the more complex your code gets, the less able the compiler is to catch these for you, so you should still mark these kinds of functions as `constexpr`. It pays to help your compiler out a bit!
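+One way to convince yourself that `g` really is evaluated at compile time is to use it somewhere that _requires_ a compile-time constant. A small sketch reusing `g` from above (the values follow from its definition):
+
+> ```cpp
+> // A static_assert is checked by the compiler, so g(6) must be
+> // evaluated during compilation for this to work at all.
+> static_assert(g(6) == 128, "g(6) should be 128");
+>
+> // Plain array sizes must also be compile-time constants.
+> int table[g(5)];  // an array of 87 ints
+> ```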
+Writing pure functions has some drawbacks: for example it makes sense to update large matrices and vectors in place to save space and time. You will need to decide when and where to use pure functions, but **you should always mark functions and arguments as `const`/`constexpr` where these are applicable**. This will allow the compiler to enforce `const` correctness, optimise your machine code, and prevent compilation of any code that violates the conditions which you have asserted. This also means that a fellow C++ programmer (or yourself, in the future!) can tell by looking at the signature of a function in a library that the function will not alter the state. This is extremely useful information for a user to know and will allow them to program more efficiently without having to check up on implementations to look for side effects.
+
+### Recursion
+
+A recursive function is a function which calls itself as part of its definition. In order to terminate, there needs to be some _base case_ or condition which is met so that it stops calling itself. For example:
+
+> ```cpp
+> constexpr unsigned int factorial(unsigned int n)
+> {
+>     return n == 0 ? 1 : n * factorial(n-1);
+> }
+> ```
+
+- The function will return 1 if `n` is zero; this is the base case.
+- Otherwise the function will return `n*factorial(n-1)`, which is the definition of a factorial.
+- Recursive functions can still be pure, and still be `constexpr`. We can calculate, for example, `factorial(10)` at compile time and save ourselves the runtime function calls.
+- Do note that this is just an illustrative example and there are more efficient ways to calculate factorials!
+
+Recursion is available in both functional and imperative programming languages, and has been a part of C++ since its inception, although in imperative styles it is more common to see loops doing the same job. (Some problems are much more naturally expressed with recursion and are awkward to rewrite with loops, although this is comparatively rare.)
+
+Recursion can lead to intuitive programs with a more direct translation of mathematical definitions. It can in particular help when writing **divide and conquer** algorithms, where problems can be solved in terms of the solutions to sub-problems.
+
+Do bear in mind though that there can be overheads associated with the function calls, and a deep recursion can open a lot of stack frames, eating up stack memory. It can nevertheless sometimes be helpful to write an easy to understand recursive solution first, and then write a more optimised version _afterwards_ once you have established that your solution works. Remember to use unit tests to check you haven't broken correctness in the process!
+
+### Higher Order Functions
+
+A major characteristic of functional programming is the use of _higher order functions_. These are functions which accept functions as arguments.
+
+In some programming languages functions are what are called "first class data", in other words functions can be declared and passed around just like ordinary variables. Although C++ was not designed with this in mind, the `<functional>` header of the standard library provides us with some extremely useful functionality to program in this way.
+- `std::function<>` is a templated type which can represent a function with a given signature (input and output types). This can be a wrapper for an ordinary function, a lambda expression, or a callable object (a class with `operator()` overloaded). This makes it easy to pass functions as arguments to other functions.
+- `std::bind` can be used to fix some of the arguments in a function to produce a new function of fewer arguments.
+- A key higher order function is typically called _map_. Map takes a function $f$, which takes an input of type $t_1$ and returns an output of type $t_2$, and produces a new function $f_\text{map}$ which takes a **container of $t_1$ values** and returns a container of $t_2$ values by applying $f(x)$ to each value $x$ in the container. (A sketch using the standard library follows below.)
+- Function composition can be used to create new functions. This is particularly powerful when applied to e.g. _map_ since it allows more complex operations to be mapped over entire lists, trees, or other iterable structures.
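+C++ has no built-in function named _map_, but `std::transform` from the `<algorithm>` header plays the same role for containers. A minimal sketch (the `squares` helper is just an illustrative name):
+
+> ```cpp
+> #include <algorithm>
+> #include <vector>
+>
+> std::vector<double> squares(const std::vector<double> &xs)
+> {
+>     std::vector<double> out(xs.size());
+>     // Apply x -> x*x to every element: the "map" pattern.
+>     std::transform(xs.begin(), xs.end(), out.begin(),
+>                    [](double x) { return x * x; });
+>     return out;
+> }
+> ```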
+This kind of functional approach can be extremely powerful and expressive when creating flexible and abstract programs. It is important to note a few downsides however:
+- Functional programming constructs in C++ tend to incur overheads due to the large number of function calls and layers of wrappers which need to be processed. Compiling with optimisations turned on can help, but for performance critical code you may need to consider writing in a more traditional imperative way.
+- Programming with functional features can make debugging more difficult as programs can be very confusing to step through.
+
+### A Function Passing Example
+
+An integrator is a good example of a function which would take another function as input. We can use `std::function`:
+
+> ```cpp
+> double integrate(std::function<double(double)> f, double x_0, double x_1)
+> ```
+
+- This function takes a function `f` from `double` to `double`.
+  - e.g. `double square(double x) {return x*x;}` would be an appropriate function to pass.
+- It also takes an upper and lower limit to integrate between.
+- It returns a double which should be $\approx \int_{x_0}^{x_1} f(x) \, dx$.
+- This version can accept callable objects, lambda expressions, free functions, functions with bound parameters etc.
+- Integration methods inevitably have to call the function `f` many times, which may lead to unacceptable overheads.
+
+Or we can use a function pointer:
+
+> ```cpp
+> double integrate(double (*f)(double), double x_0, double x_1)
+> ```
+
+- This version can take a free function of type `double f(double)`, but not general `std::function<double(double)>` types: no using `std::bind` or callable objects or lambda expressions with environment variable capture. You can however use a lambda expression which has no capture (empty square brackets such as `[](double x){return x*x;}`).
+- This version will run faster but sacrifices some flexibility. For a method such as integration, where the function is called in a hot loop, it might nevertheless be the best choice in a performance critical context.
+
+We can partially explore this using some code which times the performance of three cases:
+1. Using a function pointer method with a free function.
+2. Using a function pointer method with a capture-free lambda.
+3. Using an `std::function` method with a free function.
+
+> ```cpp
+> #include <functional>
+> #include <chrono>
+> #include <iostream>
+>
+> using namespace std;
+>
+> double integrate_functional(const std::function<double(double)> f,
+>                             const double x_0, const double x_1, const size_t N)
+> {
+>     const double delta_x = (x_1 - x_0) / N;
+>     double tot{0};
+>     for(size_t i = 0; i < N; i++)
+>     {
+>         double x = x_0 + (i + 0.5) * delta_x;
+>         tot += f(x) * delta_x;
+>     }
+>     return tot;
+> }
+>
+> double integrate_pointer(double (*f)(double), const double x_0,
+>                          const double x_1, const size_t N)
+> {
+>     const double delta_x = (x_1 - x_0) / N;
+>     double tot{0};
+>     for(size_t i = 0; i < N; i++)
+>     {
+>         double x = x_0 + (i + 0.5) * delta_x;
+>         tot += f(x) * delta_x;
+>     }
+>     return tot;
+> }
+>
+> double quadratic(double x)
+> {
+>     return x*x;
+> }
+>
+> int main()
+> {
+>     const size_t N = 1'000'000'000;
+>
+>     auto t_1 = chrono::high_resolution_clock::now();
+>     double v_ptr = integrate_pointer(quadratic, 0, 1, N);
+>     auto t_2 = chrono::high_resolution_clock::now();
+>     cout << "Value pointer = " << v_ptr << endl;
+>     cout << "Time pointer = " <<
+>         (double) chrono::duration_cast<chrono::microseconds>(t_2 - t_1).count() * 1e-6 << endl;
+>
+>     t_1 = chrono::high_resolution_clock::now();
+>     double v_ptr_lam = integrate_pointer([](double x){return x*x;}, 0, 1, N);
+>     t_2 = chrono::high_resolution_clock::now();
+>     cout << "Value pointer lambda = " << v_ptr_lam << endl;
+>     cout << "Time pointer lambda = " <<
+>         (double) chrono::duration_cast<chrono::microseconds>(t_2 - t_1).count() * 1e-6 << endl;
+>
+>     t_1 = chrono::high_resolution_clock::now();
+>     double v_fun = integrate_functional(quadratic, 0, 1, N);
+>     t_2 = chrono::high_resolution_clock::now();
+>     cout << "Value functional = " << v_fun << endl;
+>     cout << "Time functional = " <<
+>         (double) chrono::duration_cast<chrono::microseconds>(t_2 - t_1).count() * 1e-6 << endl;
+>
+>     return 0;
+> }
+> ```
+
+The results with optimisation off and on:
+
+> ```
+> $ g++ -o int integration.cpp
+> $ ./int
+> Value pointer = 0.333333
+> Time pointer = 8.67147
+> Value pointer lambda = 0.333333
+> Time pointer lambda = 10.5195
+> Value functional = 0.333333
+> Time functional = 11.7649
+>
+> $ g++ -o int integration.cpp -O3
+> $ ./int
+> Value pointer = 0.333333
+> Time pointer = 0.8935
+> Value pointer lambda = 0.333333
+> Time pointer lambda = 0.883614
+> Value functional = 0.333333
+> Time functional = 2.19115
+> ```
+
+- We can see that with optimisations off a function pointer to a free function performs best, followed by a pointer to a capture-free lambda, and finally an `std::function` method.
+- With optimisations turned on the function pointer method produces almost identical timings with a free function or the lambda. The `std::function` approach, although considerably sped up from before, now lags behind at more than double the time of the other method.
+- This is just one example and **should not be taken as universally indicative**. You should try things out and time them for yourself, especially when developing performance critical code.
+- The function we are integrating is extremely simple, and therefore function call overheads will dominate in this example more than they usually would. When integrating a more complex function, the performance difference may be smaller.
+- Sometimes flexibility and intuitive code are more useful than speed! Always keep your priorities clearly in mind, and don't assume you _always_ need the fastest possible program just for the sake of it.
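+One flexibility difference worth noting (not shown in the timings above): only the `std::function` version can accept a lambda which _captures_ local state. A short sketch reusing `integrate_functional` from above, with a hypothetical runtime coefficient `a`:
+
+> ```cpp
+> int main()
+> {
+>     double a = 2.5;  // a value only known at runtime
+>
+>     // Integrate f(x) = a * x^2. The capturing lambda [a](...) cannot be
+>     // converted to a plain function pointer, so integrate_pointer could
+>     // not accept it, but integrate_functional can.
+>     double v = integrate_functional(
+>         [a](double x) { return a * x * x; }, 0, 1, 1'000'000);
+>
+>     cout << "Value = " << v << endl;
+>     return 0;
+> }
+> ```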
diff --git a/05libraries/index.md b/05libraries/index.md
index 361b89060..3148dc35b 100644
--- a/05libraries/index.md
+++ b/05libraries/index.md
@@ -1,14 +1,11 @@
---
-title: "Week 5: Libraries"
+title: "Week 5: Code Design and Programming Paradigms"
---

## Week 5: Overview

-* What are libraries? Why use them?
-* How to choose a library
-  * Licensing, longevity, developer community, technical implementation, **feature list** etc.
-* Working with libraries
-  * Including them
-  * C++ concepts
-  * Not an exhaustive product specific tutorial
+This week we will consider code design in C++ in a more general sense. We'll encounter some common coding patterns that appear frequently in object oriented code, explore some of the different programming paradigms that exist and how they have influenced C++, and work towards writing good quality code.
+
+1. [Designing Classes](sec01DesigningClasses.html)
+2. [Programming Paradigms](ProgrammingParadigms.html)
+3. [Code Design](sec03CppCodeDesign.html)
diff --git a/05libraries/sec00Intro.md b/05libraries/sec00Intro.md
deleted file mode 100644
index 0dec2687b..000000000
--- a/05libraries/sec00Intro.md
+++ /dev/null
@@ -1,35 +0,0 @@
----
-title: Libraries
----
-
-## Why use Libraries?
-
-> The best code is the code you never write
-
-### What are libraries?
-
-- Libraries are collections of useful classes and functions, ready to use
-- C++ libraries can be somewhat harder to use than modules in other languages (e.g. Python)
-- Can save time and effort by providing well-tested, flexible, optimised features
-
-### Libraries from a scientific coding perspective
-
-Libraries help us do science faster
-
-- Write less code (probably)
-- Write better tested code (probably)
-- Write faster code (possibly)
-
-Particular things we scientists don't ever want to build ourselves:
-
-- standard data structures (e.g. arrays, trees, linked lists, etc)
-- file input/output (both for config files and output files)
-- standard numerical algorithms (e.g. sorting, linear solve, FFT, etc)
-- data analysis and plotting
-
-Sometimes we have to build things ourselves, when:
-
-- a library isn't fast enough
-- we don't trust a library's results/methods
-- a library doesn't provide the needed functionality
-- we can't use a library due to licensing issues
diff --git a/04cpp3/sec01DesigningClasses.md b/05libraries/sec01DesigningClasses.md
similarity index 99%
rename from 04cpp3/sec01DesigningClasses.md
rename to 05libraries/sec01DesigningClasses.md
index f966dc8ba..23997fde0 100644
--- a/04cpp3/sec01DesigningClasses.md
+++ b/05libraries/sec01DesigningClasses.md
@@ -251,7 +251,7 @@ One way that we can make use of this kind of class structure is to be able to se

## Example: Factory Pattern

-When dealing with abstract classes it is sometimes useful to be able to make objects of different sub-classes depending on runtime considerations. In this case, we can define another class or method, sometimes known as a "factory", which returns something of the base type. Let's say we have a system that allows a person to register with the University as either a `Student` or an `Employee`, both of which inherit from a generic `Person` class. Whether or not we create `Student` or `Employee` object will depend on the input that the person gives us, which we cannot know before run time. We can then create a class or function which returns a `Person` type, but which, depending on the information input, may create a `Student` or `Employee` object and return that.
+When dealing with abstract classes it is sometimes useful to be able to make objects of different sub-classes depending on runtime considerations. In this case, we can define another class or method, sometimes known as a "factory", which returns something of the base type. Let's say we have a system that allows a person to register with the University as either a `Student` or an `Employee`, both of which inherit from a generic `Person` class. Whether we create a `Student` or an `Employee` object will depend on the input that the person gives us, which we cannot know before run time. We can then create a class or function which returns a `Person` type, but which, depending on the information input, may create a `Student` or `Employee` object and return a pointer to that.

## Implementing Multiple Interfaces

diff --git a/05libraries/sec03CppCodeDesign.md b/05libraries/sec03CppCodeDesign.md new file mode 100644 index 000000000..57fceb7d9 --- /dev/null +++ b/05libraries/sec03CppCodeDesign.md @@ -0,0 +1,130 @@
+---
+title: C++ Code Design Summary
+---
+
+# C++ Code Design Summary
+
+This is the end of our coverage of basic C++ language features. In the coming weeks, we'll explore different programming strategies, tools such as debuggers and profilers, and writing performant code using optimisations and parallel programming. Now is a good time to reflect on some of the features that we've learned about and how they fit together.
+
+## General C++ Principles
+
+- Separate function / class declarations and implementations into header (.h) and source (.cpp) files.
+- Use smart pointers for data-owning pointers: manual memory management should be minimised.
+- Check for standard implementations of functions before writing your own: things like sorting are already well covered!
+- The standard library offers performant containers such as `vector`, `array`, and `map`.
+- Make use of modern C++ features like range-based loops, `auto` type inference, and anonymous functions where they make your code easier to understand or more flexible.
+  - Be aware of possible performance issues with anonymous functions / `std::function` due to calling overheads.
+  - Don't use `auto` if it makes it difficult for people to understand what types you are using.
+- Don't import entire large namespaces like `std` as they risk name clashes.
+- Code should be modularised:
+  - Functions should achieve a single task.
+  - Classes should bundle together data and functionality necessary to represent a single concept.
+  - Use unit-testing to test individual pieces of your program independently.
+  - If you start repeating yourself in your code, try to refactor so that repeated segments are replaced with function calls.
+- Make use of features like interfaces and templates for flexible and reusable code.
+- Programming solutions are not one size fits all: think carefully about your problem, the use case that you are developing for, and how you feel you can best serve your priorities and reflect the logical structure of your model in C++.
+
+## Run-time and Compile-time Polymorphism
+
+Now that we've met both inheritance-based run-time polymorphism and generic programming through templates, it's worth looking at the similarities and differences between the two.
+- Polymorphism allows for differences in behaviour to be decided at run-time.
+  - Behavioural differences are encoded inside classes which are related by inheritance.
+  - A single polymorphic function can operate on a base type and all its derived types.
This is usually achieved by passing a pointer to the base type and calling virtual functions.
+- Templates (and function overloading) allow for differences in behaviour to be decided at compile-time.
+  - Behavioural differences are encoded into the external functions (or classes) which make use of the templated or overloaded type. The types which can be used aren't generally related by inheritance, but merely need to fulfil the functionality demanded in the templated code.
+  - Templates generate separate classes / functions for every different template parameter they are called with.
+
+There is a difference between code which needs to have different behaviour with different objects without knowing ahead of time the exact type of those objects (run-time polymorphism) and code which can be applied to different types in different parts of the program, but does not require those types to be substitutable at run-time (compile-time polymorphism, e.g. templates and function overloading). For example, you may well use the (overloaded) `+` operator to add integers together, and to concatenate strings, but you are unlikely to process data which could be _either_ an int or a string without knowing which it will be.
+
+## Composition and Inheritance
+
+We've seen this week that we can use multiple inheritance to implement multiple interfaces, which can lead to difficulties like the diamond problem, as well as making our model increasingly complex. While we can in fact inherit from an arbitrary number of base classes, it risks collisions between class namespaces and general confusion over the purpose and nature of an object. Multiple inheritance should only be used when motivated by genuine substitution (an "is-a" relationship, where one class is a sub-type of the other) and a meaningful polymorphic use case. If faced with a multiple inheritance use case, consider whether it should in fact be represented as a chain of inheritance, or whether the functionality should instead be refactored into a composition.
+
+Inheritance is sometimes misused by C++ programmers to share functionality between classes where composition would be clearer and more effective. Composition representing functionality is particularly powerful when combined with templates, as we can still write a single piece of code which can be re-used with many types.
+
+- Classes with overlapping functionality don't necessarily need to be related by some base class.
+- Classes should only be related by inheritance **if these classes should be interchangeable at some level** (i.e. can be substituted into the same place) in your code. For example, if we need a container such as a `vector` to be able to store and iterate over a diverse set of objects which are related by a core set of properties defined in a base class.
+- Mere sharing of functionality can often be better represented by wrapping said functionality in a class and including it in your other classes by composition.
+  - For example, many classes will need to store data in a container such as a `vector`, but that does not mean they should inherit from the container class! They should have an instance of that container where they can store their data.
+- Multiple inheritance is generally limited to implementing two distinct, usually abstract, interfaces. An example of multiple inheritance in the C++ standard library is `iostream` (input/output stream), which inherits from `istream` (input stream) and `ostream` (output stream).
(See [`iostream` documentation](https://cplusplus.com/reference/istream/iostream/) and [C++ core guidelines on multiple inheritance](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rh-mi-interface).)
+- Inheritance is good at defining what a class _is_, but you can use composition for things that your class _makes use of_.
+
+## Undefined Behaviour
+
+A quirk of the C++ programming language is that not all source code that compiles is actually a valid C++ program. **Undefined behaviour** refers to situations in C++ where the standard offers no guidance and a compiler can more or less do what it likes; as a result we as programmers may have little idea what will happen if such a program is run, and the results will vary from compiler to compiler, and system to system. This means that if our program has undefined behaviour then even if we have thoroughly tested it on our own system, it may not be portable to anyone else's.
+
+You can read more about undefined behaviour on e.g. [cppreference](https://en.cppreference.com/w/cpp/language/ub).
+
+Much undefined behaviour centres around memory access or modification, for example:
+- Reading values outside the bounds of an array.
+- Reading / modifying a variable with a pointer of a different type (also known as type aliasing).
+  - There is a special exception for the types `char` and `std::byte`, which allow us to observe any variable / object data as a sequence of bytes.
+  - You can read about [type aliasing here](https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing), which will also describe the concept of _similar types_ which we will not get into in these notes!
+- Modifying a `const` value through a non-const pointer.
+- There are many more causes of undefined behaviour, but note that many involve doing something to invalidate some aspect of the program's definition: subverting the type system, undermining `const` declarations, accessing private members and so on. It's usually not the case that you will _accidentally_ cause undefined behaviour; rather, it is often the result of attempts to use low-level access to memory to get around a high-level construct.
+
+Undefined behaviour is often the consequence of the meeting of C++'s lower level and higher level features in ways that are not valid. We will just give one simple example to illustrate why this kind of behaviour ends up being undefined: that of undermining `const`.
+
+Consider the following code: what will it do?
+```cpp
+#include <iostream>
+
+int main()
+{
+    const int N = 10;
+
+    // Pointer ptr is a non const pointer to non const data
+    // It is initialised to point to the same address as N is stored at
+    int *ptr = (int *) &N;
+    *ptr += 1;
+
+    std::cout << N << std::endl;
+    std::cout << *ptr << std::endl;
+    std::cout << ptr << " " << &N << std::endl;
+
+    return 0;
+}
+```
+The behaviour here is **undefined** since we have used an incompatible pointer type to read and modify the memory which contains the constant integer `N`.
+
+On my machine, the output is as follows:
+```
+10
+11
+0x7fffd87014cc 0x7fffd87014cc
+```
+- From the third line we can see that the storage address for `N` is the same as that pointed to by `ptr`.
+- `N` is reported as `10`, but `*ptr` is `11`, which appears inconsistent!
+- This is because my compiler has been told that `N` is a constant, and so in the line `std::cout << N << std::endl` the (time expensive) memory read is replaced by a hard-coded value `10`, which is more efficient and which the compiler will assume is valid _because we told it so_.
+- When printing out the value that the pointer is pointing to, however, the memory read is necessary, so we get the value `11`, which is the value actually stored in RAM.
+- If we were to use `N` again later in the program, the value we would get would simply depend on whether the compiler optimised out the data read or not!
+- Other compilers may do different things under different circumstances - there are no guarantees!
+
+Part of the price we pay for having this low level memory access is that it is possible to access memory in ways that violate the conditions that we have already stated: we can also set a pointer to look at any given location in memory (that our program has access to), which means it can be set to read or even modify `const` values, `private` members, variables of other types and so on. But in order for the compiler to do its best job, it needs to be able to make assumptions about the behaviour of the program and integrity of data, as we've seen with the above `const` violation example.
+
+- Make good use of high level concepts like the type system, `const`, and access specifiers to make your program safer and more expressive. In almost all programming circumstances these things will allow the compiler to catch any violations of your model and prevent them from compiling.
+- **Don't do daft things with low-level memory** to undermine that safety: in C++ _you have some responsibility to make use of the language properly_.
+- Undefined behaviour can be hard to catch because compilers will not necessarily catch or even issue a warning for undefined behaviour. (For instance, the above example will only be flagged if compiled with the rather niche `-Wcast-qual` warning flag, or its error-escalating form `-Werror=cast-qual`. Even the `-Wall`, "all warnings", and `-Wextra` flags will not be enough to catch this one!)
+- Do learn about some of the causes of undefined behaviour.
+
+## Useful References
+
+### C++ Core Guidelines
+
+There are many differing opinions about what exactly constitutes "good practice" in C++, but a good place to start looking is generally the [C++ core guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).
+
+These guidelines are co-written by the original designer of C++ and are quite extensive, but you can select individual topics to explore when you are unsure of things.
+
+The following books are also useful, and available through UCL library services.
+
+### A Tour of C++
+
+The book [A Tour of C++](https://www.stroustrup.com/tour2.html) by Bjarne Stroustrup (one of the designers of C++) is a good, practical introduction to the major features of C++. The second edition covers up to C++17, and the third edition covers up to C++20. You can check towards the back of the book what features become available in which C++ standard, so you can make sure you stay compatible with your compiler!
+
+### Effective Modern C++
+
+The book [Effective Modern C++](https://www.oreilly.com/library/view/effective-modern-c/9781491908419/) is a good introduction to C++ up to the C++14 standard, and may be of help if you want to spend more time working on your C++ fundamentals. (Most, but by no means all, of the features that we have covered in this course are present in C++14.)
+
+### Design Patterns
+
+The book [Design Patterns](https://www.oreilly.com/library/view/design-patterns-elements/0201633612/) provides many examples of frequently occurring design solutions in object oriented programming that we have not covered in these notes.
If you're comfortable with the ideas we've covered in C++ and want to improve your object-oriented software engineering skills, this book may be helpful. diff --git a/06tooling/index.md b/06tooling/index.md index 45370ee52..8a438df4c 100644 --- a/06tooling/index.md +++ b/06tooling/index.md @@ -1,53 +1,47 @@ --- -title: "Week 6: Tooling" +title: "Week 6: Libraries and Tooling" --- -## Unit testing +This week we'll learn about some of the tools that we can use to improve our code, and how to link with external libraries. -Testing single functions, methods or classes is referred to as **unit testing**, i.e. testing single *units* of code. We've already seen some unit tests earlier in this course written with the unit testing framework [Catch2](https://github.com/catchorg/Catch2) so you should have some understanding how to write, compile and run unit tests. However, like many things in programming, there is an art to writing *good* unit tests that provide enough **test coverage**, that is the amount of code tested by unit tests. To dig deeper into the philosophy of testing C++ code, we recommend the following talk from the 2022 Cppcon: +You should look over and install the following tools, and familiarise yourself a little with the timing statements available in C++. -{% include youtube_embed.html id="SAM4rWaIvUQ" %} +1. [Timing and Tooling](sec00TimingAndTooling.html) -## Debugging inside VSCode +## Why use Libraries? -We can debug our code from inside VSCode but it requires a little setup to make sure we're correctly using CMake when debugging. Follow [this tutorial to set up your VSCode properly with CMake](https://code.visualstudio.com/docs/cpp/CMake-linux). +> The best code is the code you never write -## Debugging memory issues with Valgrind +### What are libraries? -If you're unlucky enough to have to resort to unsafe memory management with raw pointers, you will almost certainly meet a **segmentation fault** or segfault, if your program tries to access memory it doesn't strictly have access to. This can happen due to many different types of bugs; stack overflows, freeing already freed pointers, off-by-one bugs in loops, etc, but can be notoriously tricky to debug. +- Libraries are collections of useful classes and functions, ready to use +- C++ libraries can be somewhat harder to use than modules in other languages (e.g. Python) +- Can save time and effort by providing well-tested, flexible, optimised features -Valgrind is a **memory profiler and debugger** which can do many useful things involving memory but we just want to introduce its ability to find and diagnose segfaults by tracking memory allocations, deallocations and accesses. +### Libraries from a scientific coding perspective -You should follow [Valgrind's Quickstart Guide](https://valgrind.org/docs/manual/quick-start.html). +Libraries help us do science faster -## Linting with clang-tidy +- Write less code (probably) +- Write better tested code (probably) +- Write faster code (possibly) -**Linters** are tools that statically analyse code to find common bugs or unsafe practices. We'll be playing with the linter from the Clang toolset, `clang-tidy` so follow this tutorial on setting up clang-tidy with VSCode: +Particular things we scientists don't ever want to build ourselves: -{% include youtube_embed.html id="8RSxQ8sluG0" %} +- standard data structures (e.g. arrays, trees, linked lists, etc) +- file input/output (both for config files and output files) +- standard numerical algorithms (e.g. 
sorting, linear solve, FFT, etc)
+- data analysis and plotting
-## Formatting with clang-format
+Sometimes we have to build things ourselves, when:
-If you've done much Python programming you probably already know the power of good formatters, tools that reformat your code to a specification. This can help standardise code style across codebases and avoid horrid debates about spaces vs tabs, where curly braces should go, or how many new lines should separate functions.
+- a library isn't fast enough
+- we don't trust a library's results/methods
+- a library doesn't provide the needed functionality
+- we can't use a library due to licensing issues
-Again, we'll be peeking into the Clang toolbox and using `clang-format` to automatically format our code. Follow [this guide on setting up a basic .clang-format file](https://leimao.github.io/blog/Clang-Format-Quick-Tutorial/) and see clang-format's [list of common style guides](https://clang.llvm.org/docs/ClangFormatStyleOptions.html#basedonstyle) for more information about what styles are available. Look at a few, choose one you like and use that style to format your assignment code.
-
-You can also use [the clang-format VSCode extension](https://marketplace.visualstudio.com/items?itemName=xaver.clang-format) to automatically format your code on saving.
-
-## Performance profiling with gprof
-
-As we move towards writing *performant* C++, one essential tool is a **profiler**, a tool that runs your code and measures the time taken in each function. This can be a powerful way to understand which pieces of your code need optimising.
-
-There are many advanced profilers out there but a good, simple profiler is `gprof`. Watch this introductory video on using gprof:
-
-{% include youtube_embed.html id="zbTtVW64R_I" %}
-
-Now try profiling one of your own codes. Since we're using cmake, we can't directly add the required `-pg` flags to the compiler so we'll have to tell cmake to add those flags with:
-
-```
-cmake -DCMAKE_CXX_FLAGS=-pg -DCMAKE_EXE_LINKER_FLAGS=-pg -DCMAKE_SHARED_LINKER_FLAGS=-pg ...
-```
-
-## Compiler warnings
-
-One of the easiest ways to improve your code is to turn on **compiler warnings** and fix each warning. Some companies even require that all compiler warnings are fixed before allowing code to be put into production. Check out [this blog post on Managing Compiler Warnings with CMake](https://www.foonathan.net/2018/10/cmake-warnings/) for details on how to do this in our CMake projects. I recommend you use these warnings to fix potential bugs in your assignment.
+1. [Choosing Libraries](sec01ChoosingLibraries.html)
+2. [Library Basics](sec02LibraryBasics.html)
+3. [Linking Libraries](sec03LinkingLibraries.html)
+4. [Installing Libraries](sec04InstallingLibraries.html)
+5. [Libraries Summary](sec05Summary.html)
diff --git a/06tooling/sec00TimingAndTooling.md b/06tooling/sec00TimingAndTooling.md new file mode 100644 index 000000000..df69fcd00 --- /dev/null +++ b/06tooling/sec00TimingAndTooling.md @@ -0,0 +1,146 @@
+---
+title: Timing and Tooling
+---
+
+This week we'll look a bit at how to time our code for performance, and also introduce a number of tools which we can use to develop and improve our code. We'll start using these in the practical but please make sure you install and set up the tools ahead of time, and reach out to us before class if you have trouble doing so.
+
+## Timing
+
+Timing statements can be inserted into the code using the `<chrono>` header from the standard library.
+
+`std::chrono` allows us to time code using a number of different clocks:
+- `system_clock`: This is the system-wide real-time clock.
+  - Be aware that system time can be adjusted while your program is running (e.g. by an admin), which would interfere with measurements.
+- `steady_clock`: A monotonically increasing clock which cannot be adjusted. Good for measuring time differences, but won't give you the time of day when something happens.
+- `high_resolution_clock`: The clock with the shortest 'tick' time available on the system - this may be `system_clock` or `steady_clock` or neither, so the properties of this clock are system dependent. A useful choice when your primary concern is precision (e.g. timing functions which are fairly short), as long as you can be confident that your clock won't be altered during the run.
+
+We can see an example of how to do timing using the following code fragment:
+
+```cpp
+#include <chrono>
+#include <iostream>
+
+typedef std::chrono::steady_clock timingClock;
+
+void my_function()
+{
+    // your interesting code here
+}
+
+int main()
+{
+    std::chrono::time_point<timingClock> t_start = timingClock::now();
+    my_function();
+    std::chrono::time_point<timingClock> t_end = timingClock::now();
+
+    std::chrono::nanoseconds diff = t_end - t_start;
+
+    std::chrono::microseconds duration = std::chrono::duration_cast<std::chrono::microseconds>(diff);
+
+    double seconds = static_cast<double>(duration.count()) * 1e-6;
+
+    std::cout << "Time taken = " << seconds << std::endl;
+}
+```
+- We can take the time at a given point using the `now()` function on a particular clock.
+- Take these times as close on either side of the thing that you want to measure as possible. Don't put additional code (especially slow things like I/O) inside your timing statements unless you want them to contribute to your times!
+- `now()` returns a `time_point` type. The difference of two `time_point`s is a duration type, which we can cast between different units such as `nanoseconds` and `microseconds`.
+- We can convert a duration type to a numerical value using `count()`, which by default returns an integral type. This can be cast to a floating point type such as `double` if you want to use the time with fractional arithmetic.
+
+Some things to note about this code:
+- As you can see, the types in `chrono` are quite verbose due to the number of nested namespaces!
+- I very strongly recommend using `typedef` or `using` statements to reduce clutter (the code would have been _even longer_ if we hadn't created the type alias `timingClock`).
+- I've written out the types of everything explicitly here so that you can see what types each of these functions returns, but in practice (once you're familiar with the way that `chrono` works) this can be a good place to use `auto` to de-clutter.
+- This code will work for any kind of clock, so you can change the clock you are using by simply changing the `typedef` statement at the top and leaving the rest of the code unchanged.
+
+
+A more succinct version of this code might look like:
+```cpp
+#include <chrono>
+#include <iostream>
+
+typedef std::chrono::steady_clock timingClock;
+
+void my_function()
+{
+    // Your interesting code here
+}
+
+int main()
+{
+    auto t_start = timingClock::now();
+    my_function();
+    auto t_end = timingClock::now();
+
+    std::chrono::nanoseconds diff = t_end - t_start;
+
+    double seconds = static_cast<double>(diff.count()) * 1e-9;
+
+    std::cout << "Time taken = " << seconds << std::endl;
+}
+```
+- `auto` saves writing and reading complicated types for a function whose return type is obvious.
+
+- It's not obvious that the difference of two timepoints will default to nanoseconds, so that type should (in my opinion) be kept in the code for clarity. This is particularly true since it's necessary to see that the calculation of `seconds` is correct.
+  - We can also use other types. `std::chrono::nanoseconds` is actually an alias for a type of the form `std::chrono::duration<int64_t, std::nano>` (the exact integer type is implementation defined, but it is at least 64 bits). The first template parameter is the type that you want your `count` to go into (which can be integral or floating point types), and the second is essentially our units (`std::nano`, `std::micro`, `std::milli` etc.). You can use your own template instantiation `std::chrono::duration<double, std::nano>` if you want to skip the `static_cast`.
+  - You can use `std::nano` in combination with integral types (`int`, `int64_t`, `size_t`, `uint` etc.) or floating point types (`float`, `double` etc.). You can use `std::micro` with floating point types, but if you want to have an integral count representing the number of microseconds then you need to take a duration in nanoseconds first and then do a `duration_cast`.
+  - **Since there are so many options here, it's a good idea to just tell people what type you're using!**
+- We showed above that we _can_ convert to microseconds and so on, but we don't have to do so; we can work directly with `nanoseconds` if that's useful.
+- It can also be a good idea to just wrap up some timing code in a little class so you can reuse it across projects and don't have to keep thinking about all this stuff (a minimal sketch of such a class is given at the end of this section).
+
+## Tooling
+
+**N.B.** Please remember that if you are using Windows for this course you will need to install these tools **inside WSL** (Windows Subsystem for Linux) rather than following a normal Windows installation. To do so, you can
+1. Open a Command Prompt and type `wsl` to go into the WSL command line. From there you can follow Linux instructions for installing command line tools like Valgrind.
+2. Open VSCode and [connect to WSL using the button in the bottom left hand corner](https://code.visualstudio.com/docs/remote/wsl). From there you can add extensions to VSCode, or open a terminal to access the WSL command line and install command line tools.
+
+## Debugging inside VSCode
+
+We can debug our code from inside VSCode but it requires a little setup to make sure we're correctly using CMake when debugging. Follow [this tutorial to set up your VSCode properly with CMake](https://code.visualstudio.com/docs/cpp/CMake-linux).
+
+## Debugging memory issues with Valgrind
+
+If you're unlucky enough to have to resort to unsafe memory management with raw pointers, you will almost certainly meet a **segmentation fault**, or segfault, which occurs when your program tries to access memory it doesn't strictly have access to. This can happen due to many different types of bugs (stack overflows, freeing already freed pointers, off-by-one bugs in loops, etc.), and can be notoriously tricky to debug.
+
+Valgrind is a **memory profiler and debugger** which can do many useful things involving memory, but here we just want to introduce its ability to find and diagnose segfaults by tracking memory allocations, deallocations and accesses.
+
+You should follow [Valgrind's Quickstart Guide](https://valgrind.org/docs/manual/quick-start.html).
+
+## Linting with clang-tidy
+
+**Linters** are tools that statically analyse code to find common bugs or unsafe practices.
We'll be playing with the linter from the Clang toolset, `clang-tidy`, so follow this tutorial on setting up clang-tidy with VSCode:
+
+{% include youtube_embed.html id="8RSxQ8sluG0" %}
+
+## Formatting with clang-format
+
+If you've done much Python programming you probably already know the power of good formatters, tools that reformat your code to a specification. This can help standardise code style across codebases and avoid horrid debates about spaces vs tabs, where curly braces should go, or how many new lines should separate functions.
+
+Again, we'll be peeking into the Clang toolbox and using `clang-format` to automatically format our code. Follow [this guide on setting up a basic .clang-format file](https://leimao.github.io/blog/Clang-Format-Quick-Tutorial/) and see clang-format's [list of common style guides](https://clang.llvm.org/docs/ClangFormatStyleOptions.html#basedonstyle) for more information about what styles are available. Look at a few, choose one you like and use that style to format your assignment code.
+
+You can also use [the clang-format VSCode extension](https://marketplace.visualstudio.com/items?itemName=xaver.clang-format) to automatically format your code on saving.
+
+## Compiler warnings
+
+One of the easiest ways to improve your code is to turn on **compiler warnings** and fix each warning. Some companies even require that all compiler warnings are fixed before allowing code to be put into production. Check out [this blog post on Managing Compiler Warnings with CMake](https://www.foonathan.net/2018/10/cmake-warnings/) for details on how to do this in our CMake projects. I recommend you use these warnings to fix potential bugs in your assignment.
+
+## Optional: Performance profiling with gprof
+
+Although you won't be required to use one on this course, as we move towards *performant* C++, one useful tool is a **profiler**. This is a tool that runs your code and measures the time taken in each function. This can be a powerful way to understand which parts of your code need optimising.
+
+There are many advanced profilers out there but a good, simple profiler is `gprof`. This also has the advantage of coming with most Linux distributions, so is automatically available with Ubuntu on either a native Linux machine or WSL.
+
+You can watch this introductory video on using gprof:
+
+{% include youtube_embed.html id="zbTtVW64R_I" %}
+
+and try profiling one of your own codes. Since we're using CMake, we can't directly add the required `-pg` flags to the compiler, so we'll have to tell CMake to add those flags with:
+
+```
+cmake -DCMAKE_CXX_FLAGS=-pg -DCMAKE_EXE_LINKER_FLAGS=-pg -DCMAKE_SHARED_LINKER_FLAGS=-pg ...
+```
+
+On MacOS you can try using Google's [gperftools](https://github.com/gperftools/gperftools) which is available through Homebrew.
+
+- For optimisation, you should target the areas of your code where your application spends the most time.
+- Profilers are excellent for identifying general behaviour and bottlenecks, but you may be able to get more accurate results for specific functions or code fragments by inserting timing code.
+
diff --git a/05libraries/sec01ChoosingLibraries.md b/06tooling/sec01ChoosingLibraries.md similarity index 76% rename from 05libraries/sec01ChoosingLibraries.md rename to 06tooling/sec01ChoosingLibraries.md index ad6162693..08b717623 100644 --- a/05libraries/sec01ChoosingLibraries.md +++ b/06tooling/sec01ChoosingLibraries.md @@ -42,7 +42,7 @@ Remember: even if you aren't distributing code yet, you need to understand the l
When you distribute your code, the licenses of any libraries you use take effect. For example, a library with license:
-* [MIT][MITLicense] or [BSD][BSDLicense] are permissive. So you can do what you want, including sell it on.
+* [MIT][MITLicense] or [BSD][BSDLicense] are permissive. So you can do what you want with the resulting software you write, including selling it on.
* [Apache][ApacheLicense] handles multiple contributors and patent rights, but is basically permissive.
Some libraries can affect how you yourself must license your code:
@@ -67,7 +67,7 @@ For an in-depth understanding we recommend you read some works about licenses:
**Note**: Once a 3rd party has your code under a license agreement, their restrictions are determined by that version of the code.
-### Stability: Is a license stable?
+### Stability: Is a library stable?
Some libraries are so new their *public API* or *interface* is still subject to change. This is usually signalled by the project being in *alpha* or *beta* stages, either before an initial 1.0.0 release, or before a new major x.0.0 release. Some projects (like Python itself) ensure that all *minor* versions will not intentionally introduce breaking changes (i.e. you can use the same code moving from 3.10 to 3.11) but keep *breaking changes* to new major versions (i.e. moving from Python 2 to Python 3). If you haven't come across this idea, read about [semantic versioning](https://www.geeksforgeeks.org/introduction-semantic-versioning/).
@@ -75,9 +75,9 @@ When choosing a library to use with your own project, try to use a *stable* vers
### Efficiency: Is the library fast enough?
-Libraries, particularly the good ones, tend to be well-optimised, that is their algorithms and data structures have been tweaked to get the best performance. For performance-critical libraries (like many used in numerical computations) the library developers should include some details about the performance of the library in its documentation. This is where you should ideally look for information about the performance. Otherwise, try to find comparisons with other, similar libraries to understand the performance.
+Good scientific libraries tend to be _well-optimised_, that is their algorithms and data structures have been designed in an attempt to maximise performance for the functionality that they want to provide. For performance-critical libraries (like many used in numerical computations) the library developers should include some details about the performance of the library in its documentation. This is where you should ideally look for information about the performance. Otherwise, try to find comparisons with other, similar libraries to understand the performance. Try to understand your own needs when looking at performance: optimisation can mean trade-offs between being fast, being memory efficient, or being flexible. (Recall that our highly flexible `std::function` integrator was slower than our more narrow function pointer version.)
-While you are unlikely to beat a library's performance with a custom algorithm, sometimes custom code can be faster due to a tradeoff between flexibility and performance. If you have already used the library but think you might be able to beat a library's performance: +Many popular libraries have been researched, developed, and maintained over many years, with intense focus placed on the correctness and performance of their algorithms. As such, you are unlikely to beat a mature library's performance with a like-for-like custom algorithm, but sometimes custom code can be faster due to a tradeoff between flexibility and performance. (Libraries, in order to be useful to large numbers of people, often provide fairly general methods, which can sometimes be improved upon using detailed knowledge of your precise problem.) If you have already used the library but think you might be able to beat a library's performance: 1. test the performance of the library's implementation 2. write some unit tests using the library's implementation @@ -97,24 +97,24 @@ Libraries that are not regularly maintained can "rot", that is: - newer, safer language features don't get introduced - advances in packaging make it more difficult to install -In general though, we want to avoid these issues, so consider these questions when deciding if the library is suitably up-to-date: +In general, we want to avoid these issues, so consider these questions when deciding if the library is suitably up-to-date: * When was the last release? * Is there a sensible versioning scheme (e.g. [semantic versioning][semver])? * Is a [changelog][changelog] provided with each new release? * Is there a suitable release schedule? -* Is the code developed on the open (e.g., on GitHub)? +* Is the code developed in the open (e.g., on GitHub)? * How often are there commits? -You should develop your own intuition for what you consider "suitably up-to-date" but here are some heuristics of mine: +You should develop your own intuition for what you consider "suitably up-to-date" but here are some heuristics: - If a library has been updated within the last year, it's probably good. - If a library is very small, it probably doesn't need many updates, so longer releases are fine. -- If a library is very old (like some numerical libraries) then it is so well-used, there probably aren't many bugs left, so a release over ten years ago is still probably okay (but might not be very efficient on modern hardware). +- If a library is very old and hasn't been introducing new features (like some numerical libraries) and has been very well-used, there may simply be fewer bugs left to deal with, so a release over ten years ago is still probably okay (but might not be as well optimised on recent hardware). ### Ownership: Who develops the library? -Libraries must be developed by someone; if there is no community or company responsible for a library's development, it is considered *abandoned* and should probably be avoided. Consider some of the following questions: +Libraries must be developed by someone: if there is no community or company responsible for a library's development, it is considered *abandoned* and should probably be avoided. Consider some of the following questions: - Is the library obviously developed by a person, community, company or other organisation? - If a company: @@ -130,7 +130,7 @@ Libraries must be developed by someone; if there is no community or company resp ### Correctness: Is the library well-tested? 
* Are there many unit tests, do they run, do they pass?
- * Are they run automatically (i.e. through continuous integration)?
+ * Are they run automatically? _Continuous integration_ is a good practice in which code is built and tested automatically when it is updated; this can be automated through e.g. GitHub.
* Does the library depend on other libraries?
* Are the build tools common?
@@ -140,9 +140,9 @@ Beyond the things we've already discussed, there are a few more minor points tha
* Documentation: does it exist? is it good?
* Number of ToDos: do they keep track of bugs to fix and future features to implement?
-* Dependencies: does it offer a clear list of dependencies? Are they trusted? (i.e., recursively)
-* Data Structures: is it clear how to import/export data or images to use later?
-* Clear API: can you write a convenient wrapper?
+* Dependencies: does it offer a clear list of dependencies? Are they trusted?
+* Data Structures: is it clear how to import/export data or images to use later? Do you understand what you need to put in and what you will get out when you use the library's functions? Do you know which functions have side effects (e.g. in-place updates) and what they are?
+* Clear API: can you write a convenient wrapper? Is it clear how to use the library's features?
## Libraries you should be using
@@ -153,7 +153,7 @@ While you should be asking yourself the above questions to understand how a libr
- *very* well-documented
- *very* well-used
- constantly developed
- - no need to install anything!
+ - all compiler vendors are required to provide it, so there are no dependencies to install!
- Vendor-provided libraries - provided by Intel/Nvidia/AMD/etc
- well-tested
- (often) well-documented
@@ -163,7 +163,9 @@ While you should be asking yourself the above questions to understand how a libr
- well-tested
- well-used
- (often) well-documented
+ - typically well optimised
- strong communities
+ - you may find that discussion with your particular research community will help lead you to appropriate library choices
[NatureArticle]: http://www.nature.com/news/2010/101013/full/467753a.html
[LicensingBook]: http://www.oreilly.com/openbook/osfreesoft/book/
diff --git a/05libraries/sec02LibraryBasics.md b/06tooling/sec02LibraryBasics.md similarity index 60% rename from 05libraries/sec02LibraryBasics.md rename to 06tooling/sec02LibraryBasics.md index 3db954474..f6c8f3ed7 100644 --- a/05libraries/sec02LibraryBasics.md +++ b/06tooling/sec02LibraryBasics.md @@ -6,9 +6,9 @@ title: Library Basics
### Reviewing the build process
-When building an application there are three important steps:
+When building an application there are three important steps that the compiler must execute:
-- **preprocessing**: follow the directives (lines started by `#`) on the files to combine the units into what's passed to the compiler;
+- **preprocessing**: follow the directives (lines starting with `#`, such as `#include` or `#define`) on the files to combine the units into what's passed to the compiler;
- **compilation**: translates the program into machine language code - object files; and
- **linking**: merges the various object files and links all the referred libraries as needed to create the executable.
@@ -30,13 +30,10 @@ Though normally these steps are invoked by a single command, you can run them on
g++ -o <executable> <object files>
```
-Directly using the compiler without a build tool (e.g., [CMake][lesson-cmake]) will eventually become too difficult and cause a mess.
+Directly using the compiler without a build tool (e.g., [CMake][lesson-cmake]) will eventually become too difficult and cause a mess, so we use CMake for larger projects!
-
-Find more details of these steps on the following resources:
-
-- [The C Preprocessor][CppAdv6] chapter on the [C++: Advanced Topics][CppAdv] course.
-- How the C++ [Compiler][CppChernoCompiler] and [Linker][CppChernoLinker] works videos by [The Cherno][Cherno].
+You can find out more about preprocessor directives in [The C Preprocessor][CppAdv6] chapter on the [C++: Advanced Topics][CppAdv] course.
### Including libraries
You need your compiler to find:
* Dynamic: `.dylib` (mac), `.so` (linux), `.lib` / `.dll` (windows)
* Static: `.a` (\*nix), `.lib` (windows)
-We will see the [differences between dynamic and static libraries][lesson-DynVsSt] later. Let's see first how we include the libraries in our code.
+We will see more about [differences between dynamic and static libraries][lesson-DynVsSt] later, but in brief:
+- A **static** library is compiled with your program and included in your executable code. If you want to change the library behaviour by altering the library or upgrading to a new version, you need to re-compile your executable.
+- A **dynamically linked** (a.k.a. **shared**) library is compiled separately into a special kind of library object. You tell the compiler where this object is when you are _linking_, but the library itself is not part of your executable. You can change the library behaviour independently by updating and recompiling the library object, but if the library object is removed or you move your executable to a new system where it can't find the library object, then your executable will no longer work because it doesn't have all the code that it needs to run. Your executable will also not work if you change the dynamic library so that it no longer provides the necessary interface, e.g. if you change a function signature in the library that the executable depends on.
+- Dynamically linked libraries are also called "shared libraries" because the same library object can be used by multiple executables, so you only need one copy of the compiled library code. On the other hand, if a static library is used by multiple executables then there will be copies of that library code in each of the executables. This can use more space, but makes it easier to keep executables independent and means that executables can be more easily maintained with different _versions_ of the same library without conflicts.
### In practice
@@ -67,8 +67,8 @@
g++ -o main main.o -L /users/me/myproject/lib -l mylib
When you use a library, keep in mind the following questions:
* Where is it? (do we need to set `-I` and `-L` when compiling?)
+ - Depending on your compiler, there are some standard locations where it will look for [includes](https://gcc.gnu.org/onlinedocs/gcc-4.9.4/cpp/Search-Path.html) or libraries (using `/usr/lib...` or `/usr/local/lib...`), but other locations may need to be provided using `-I` or `-L` flags to let it know where to look.
* Is it a [Header only][header-only-wiki]?
-* System's version or your version?
* What about bugs? How do I upgrade? Do I need to build it myself?
### Compilation Issues
Also, there are some issues related to compilation:
* Which library version?
-* Which compiler version?
+* Is there a requisite compiler version / C++ standard?
* Debug or Release?
-* [Static or Dynamic][learncpp-static-dynamic]?
+* Static or Dynamic?
* 32 bit / 64 bit?
+ - Most 64-bit machines can also run 32-bit code, but not the other way around! You should take advantage of 64-bit compilation where you can, though.
* Platform specific flags?
-* Pre-installed, or did you compile it?
+* Pre-installed, or do you compile it?
### Wrapping: a technique for avoiding library pain
-Some libraries it's obvious that we've made the correct choice, perhaps we've used a library before or someone we trust has recommended it. Other libraries we can be a little more nervous about, perhaps we're not sure it does what we need, or we're worried that in the future we'll need to swap it out with a different library (or worse, write our own!).
+For some libraries it's obvious that we've made the correct choice: perhaps we've used the library before, or someone we trust has recommended it. About other libraries we can be a little more nervous: perhaps we're not sure a library does what we need, or we're worried that in the future we'll need to swap it out with a different library or write our own.
If we think we might need to swap the library out at some future stage, we can *wrap* the library, creating an interface between the library and our own code. This minimises the number of places we must change our code if we ever need to change the library, and we can augment the library to suit our needs. For example, if we know we want to load JSON files, but we're not sure which library to use, we could choose one library, e.g. `json_library`, and write a wrapper around it:
@@ -105,8 +106,6 @@ Of course, wrappers are not free; they're more code to write, test, document and
[CppAdv]: https://www.linkedin.com/learning/c-plus-plus-advanced-topics/
[CppAdv6]: https://www.linkedin.com/learning/c-plus-plus-advanced-topics/about-the-preprocessor
[lesson-cmake]: ../01projects/sec04CMakeHelloWorld.html
-[CppChernoCompiler]: https://www.youtube.com/watch?v=3tIqpEmWMLI
-[CppChernoLinker]: https://www.youtube.com/watch?v=H4s55GgAg0I
[Cherno]: https://www.youtube.com/channel/UCQ-W1KE9EYfdxhL6S4twUNw
[lesson-DynVsSt]: ./sec02LinkingLibraries.html
[header-only-wiki]: https://en.wikipedia.org/wiki/Header-only
diff --git a/05libraries/sec03LinkingLibraries.md b/06tooling/sec03LinkingLibraries.md similarity index 59% rename from 05libraries/sec03LinkingLibraries.md rename to 06tooling/sec03LinkingLibraries.md index e94975780..1bee1ac86 100644 --- a/05libraries/sec03LinkingLibraries.md +++ b/06tooling/sec03LinkingLibraries.md @@ -4,7 +4,7 @@ title: Linking Libraries
## Linking libraries
-From the [first lecture][lesson-first] we've seen that:
+So far in the course we've seen that:
* Code is split into functions/classes
* Related functions get grouped into libraries
And that the End User of a library needs access to:
-* Header file = declarations (and implementation if header only)
-* Object code / library file = implementations
+* Header files
+ - These contain required declarations
+ - May also include implementation if the library is _header only_. This is relatively common for heavily templated code that must be so general that the templates cannot be explicitly instantiated in source files in the library.
+* Object code / library file
+ - This is usually where the implementation is found.
Pre-compiled libraries can usually be added to our projects via two mechanisms, i.e., static or dynamic linking.
Let's see their differences when using one or the other:
@@ -26,13 +29,6 @@ static or dynamic linking. Let's see their differences when using one or the oth
* Current translation unit then does not depend on that library (i.e., only have to distribute the executable to run your program).
-Find how to create static libraries on the following videos:
-
-- Cave of Programming's C++ Tutorial [Static Creating Libraries][CPPCoPStatic] (MacOS + Eclipse)
-- The Cherno's C++ series [Using Libraries in C++ (Static Linking)][CPPChernoStacic] (Windows + VS2017)
-- iFocus's [How to create a static library][ProgLinIF_static] (Linux + CLI)
-
-
### Dynamic Linking
* Windows (.dll), Mac (.dylib), Linux (.so)
@@ -48,17 +44,12 @@
recompilation of the executables that use it (if the interfaces haven't changed).
-You can see some examples on how to create dynamic libraries at:
-
-- iFocus's [How to use a Dynamic Library][ProgLinIF_dyanmic] (Linux + CLI)
-- The Cherno's C++ series [Using Dynamic Libraries in C++][CPPChernoDynamic] (Windows + VS2017)
### Dynamic Loading
[Dynamic loading][DynamicLoading-wiki] is a third mechanism that we are not covering here. It's normally used for plugins.
-To load the libraries you need using system calls with `dlopen` (\*nix) or
+You load the libraries you need using system calls: `dlopen` (Linux or Mac) or
`LoadLibrary` (Windows). This allows for dynamically discovering function names
and variables.
## Linking in practice
Though you can do all the linking manually as seen in the [previous
-page][lesson-LibBasics], as your project grows it's better to use some tool that
-automate the process for you. Check this short video about [how to add a library using CMake][CPPVoBCMakeAddLib].
-
+page][lesson-LibBasics], as your project grows it's better to use a build tool like CMake.
+
+We'll explore building static and shared libraries in the exercises in class, but here are some key pointers:
+- The [CMake Tutorial](https://cmake.org/cmake/help/latest/guide/tutorial/index.html) in their official docs has chapters on creating libraries as part of your CMake projects. This is a good starting point for creating an internal library.
+  - You can declare a library `STATIC` or `SHARED` when you add it.
+- You can compile a library without creating an executable if you want a library to be a separate project: simply use the [`add_library`](https://cmake.org/cmake/help/latest/command/add_library.html) command! You don't need to have an `add_executable`.
+- You should probably set the `CMAKE_LIBRARY_OUTPUT_DIRECTORY` variable in your top level CMake so that your compiled library is easy to find, similar to how our executables are placed in `CMAKE_RUNTIME_OUTPUT_DIRECTORY`.
+- You can also [import a library into a project](https://cmake.org/cmake/help/latest/command/add_library.html#imported-libraries). If you have compiled your library as a separate project and you want to use it in an executable, you'll need to import it.
+  - You'll want to set the [`IMPORTED_LOCATION`](https://cmake.org/cmake/help/latest/prop_tgt/IMPORTED_LOCATION.html) property in `set_target_properties` to the location of your compiled (shared or static) library file.
+  - Make sure to set the executable's include path so it can find the headers for your imported library!
+- In practice we often use CMake's [find_package](https://cmake.org/cmake/help/latest/command/find_package.html) command to find libraries that are installed in standard locations, so that the same CMake files will work on multiple people's computers, which may have libraries in slightly different locations. This saves them having to edit their CMake files to provide an exact location unless necessary.
### Space Comparison
If you have many executables linking a common set of libraries:
* Static
* Code gets copied - each executable becomes bigger
* Doesn't require searching for libraries at run-time
+ * Code needs to be re-compiled to reflect changes in libraries
+ * Executables are independent of one another even if they use the same libraries.
* Dynamic
* Code gets referenced - smaller executable
* Requires finding libraries at run-time
-
-However, space is less of a concern these days!
-
-
-### For Scientists
-
-As a scientists, we want that our programs are:
-
-* Easy to use
-* Easy to distribute to collaborators
-
-Therefore, we tend to prefer static if possible for ease of deployment.
+ * Library code can be updated without re-compiling executables
+ * Executables which use the same shared library object are all affected by changes to that library, so you need to avoid breaking any interfaces or functionality that other programs rely on.
### Packaging
Some packaging systems available are:
* C++ package managers ([conan][ConanPack], [vcpkg][vcpkg], [buckaroo][buckPack])
* General package managers ([spack][SpackPack], [easybuild][ebPack], [conda][condaPack])
-Packaging is not easy, but there are some [things you can do to make package managers cry][make-PM-cry] 😭
+Packaging is not easy, but here is a humorous and enlightening talk about [things you can do to make package managers cry][make-PM-cry].
### Checking the dependencies
diff --git a/05libraries/sec04InstallingLibraries.md b/06tooling/sec04InstallingLibraries.md similarity index 90% rename from 05libraries/sec04InstallingLibraries.md rename to 06tooling/sec04InstallingLibraries.md index d6b64aae5..9f1cbbfde 100644 --- a/05libraries/sec04InstallingLibraries.md +++ b/06tooling/sec04InstallingLibraries.md @@ -14,9 +14,9 @@ Now that we understand what libraries are and how we use them in our code, let's
### Package managers (C++)
-Package managers like [conan][conan] can make installing C++ packages as simple as installing Python packages with `pip`. Take a look at the excellent [CMake tutorial with Conan](https://docs.conan.io/en/2.0/tutorial/consuming_packages/build_simple_cmake_project.html).
+Package managers like [conan](https://blog.conan.io/2023/02/22/Conan-2.0.html) can make installing C++ packages as simple as installing Python packages with `pip`. Take a look at the excellent [CMake tutorial with Conan](https://docs.conan.io/en/2.0/tutorial/consuming_packages/build_simple_cmake_project.html).
-### Package managers (\*nix)
+### Package managers (Linux / MacOS)
Installing libraries using a package manager (Linux/Mac) has some advantages:
diff --git a/05libraries/sec05Summary.md b/06tooling/sec05Summary.md similarity index 92% rename from 05libraries/sec05Summary.md rename to 06tooling/sec05Summary.md index a46784a44..ceab445e6 100644 --- a/05libraries/sec05Summary.md +++ b/06tooling/sec05Summary.md @@ -6,7 +6,7 @@ title: Summary
### No Magic Answer
-Managing libraries in C++ is more tricky than other languages.
+Managing libraries in C++ is trickier than in some other languages.
While package managers make it easier, you still need to understand what you're building. diff --git a/07performance/index.md b/07performance/index.md index 2b1a815c4..ece2108a5 100644 --- a/07performance/index.md +++ b/07performance/index.md @@ -8,19 +8,19 @@ This week we'll be introducing some concepts for producing high performance prog Even though parallelism can help us improve our throughput, single core optimisation is still vital for producing good performance on ordinary machines or for maximising the work that each core can do in a parallel program. This week we'll talk about: -1. [_Why_ and _when_ we should optimise](sec00Motivation) -2. [Complexity and algorithm analysis](sec01Complexity) +1. [_Why_ and _when_ we should optimise](sec00Motivation.html) +2. [Complexity and algorithm analysis](sec01Complexity.html) - How the time and space usage of algorithms scales with the size of the input. - Big-O notation. - How to determine complexity and examples with some common algorithms. - How does complexity impact our choices as software designers? -3. [Memory management and caching](sec02Memory) +3. [Memory management and caching](sec02Memory.html) - Memory bound problems. - How memory is structured in a typical machine. - Speed of different kinds of memory access. - Cache structure and operation. - Writing algorithms to effectively exploit the cache. -4. [Compiler Optimisation](sec03Optimisation) +4. [Compiler Optimisation](sec03Optimisation.html) - Automated optimisation by the compiler. - Compiler flags for optimisation. - Examples of optimisations, pros and cons. diff --git a/07performance/sec01Complexity.md b/07performance/sec01Complexity.md index 19f68f245..200f8cfd1 100644 --- a/07performance/sec01Complexity.md +++ b/07performance/sec01Complexity.md @@ -6,139 +6,139 @@ Estimated Reading Time: 45 minutes # Computational Complexity -Computational complexity is a major field in computer science, and we will only very briefly touch on the subject here. It is foundational in understanding computation as an idea, and underpins many areas of modern computing such as the theory of cryptography. For our purposes it will be enough to get a flavour of how we can study algorithms for their complexity properties, but you may wish to consult the reference texts for this week (or many other standard computer science texts) if you are interested in understanding complexity in a deeper and more rigorous way. +Computational complexity is a major field in computer science, and we will only very briefly touch on the subject here. It is foundational in understanding computation as an idea, and underpins many areas of modern computing such as the theory of cryptography. For our purposes it will be enough to get a flavour of how we can study algorithms for their complexity properties, but you may wish to consult the reference texts for this week (or many other standard computer science texts) if you are interested in understanding complexity in a deeper and more rigorous way. -**Complexity tells us how the time and space usage of solutions to a problem changes with respect to the size of its input.** +**Complexity tells us how the time and space usage of solutions to a problem changes with respect to the size of its input.** -Most commonly in numerical computing scenarios we are concerned with the time complexity of an algorithm, although there are occasions when you may have to worry about space complexity as well. 
+Most commonly in numerical computing scenarios we are concerned with the time complexity of an algorithm, although there are occasions when you may have to worry about space complexity as well.

## Intuitive Algorithm Analysis

-When analysing an algorithm for its complexity, we are interested in how the time (or space requirements) of an algorithm scale as the input gets larger. We can define this notion in a formal way (see below!) but you can understand the key aspects of algorithmic performance based on a few intuitive concepts.
+When analysing an algorithm for its complexity, we are interested in how the time (or space requirements) of an algorithm scale as the input gets larger. We can define this notion in a formal way (see below!) but you can understand the key aspects of algorithmic performance based on a few intuitive concepts.

In terms of the input getting larger, this could mean:

-- The size of a number $n$ for a function $f(n)$. A good example of this would be the time taken to find the prime factors of a number as the number gets larger.
-- The number of elements in a container. For example, the time to sort a list of $n$ elements, or the time taken to look up a key-value pair in a map / dictionary.
-- The number of dimensions of an $n$-dimensional space. For example in statistical sampling methods where we sample over many variables, we will be interested in how the algorithm performs as the number of dimensions increases.
+- The size of a number $n$ for a function $f(n)$. A good example of this would be the time taken to find the prime factors of a number as the number gets larger.
+- The number of elements in a container. For example, the time to sort a list of $n$ elements, or the time taken to look up a key-value pair in a map / dictionary.
+- The number of dimensions of an $n$-dimensional space. For example in statistical sampling methods where we sample over many variables, we will be interested in how the algorithm performs as the number of dimensions increases.
-nThe size of a matrix: this example is a little unusual. Sometimes, particularly for a square ($n \times n$) matrix, this is expressed just by $n$, even though the number of elements in the matrix (and therefore the total size of the input) is actually $n^2$. On algorithms designed to work on non-square $n \times m$ matrices, you may have complexity in terms of _both_ $n$ and $n$.
-
-  - Adding two square matrices of the same size together is usually described as $O(n^2)$ with $n$ referring tp just one dimension of the matrix, whereas adding two lists of the same size is usually described as $O(n)$ with $n$ referring to the total number of elements, even though in both cases there is one operation per element of data. This difference is purely because of the way the input size is labelled in these two cases, so watch out for what people mean by $n$ when they tell you something is $O(g(n))$!
+- The size of a matrix: this example is a little unusual. Sometimes, particularly for a square ($n \times n$) matrix, this is expressed just by $n$, even though the number of elements in the matrix (and therefore the total size of the input) is actually $n^2$. On algorithms designed to work on non-square $n \times m$ matrices, you may have complexity in terms of _both_ $n$ and $m$.
+  - Adding two square matrices of the same size together is usually described as $O(n^2)$ with $n$ referring to just one dimension of the matrix, whereas adding two lists of the same size is usually described as $O(n)$ with $n$ referring to the total number of elements, even though in both cases there is one operation per element of data. This difference is purely because of the way the input size is labelled in these two cases, so watch out for what people mean by $n$ when they tell you something is $O(g(n))$!
+  - The general matrix case for addition would usually be written $O(nm)$.

-The "time" for an algorithm is based on the number of elementary steps that the algorithm has to undertake.
+The "time" for an algorithm is based on the number of elementary steps that the algorithm has to undertake.

-We generally write the complexity in terms of "Big-O" notation, e.g. $T(n)$ (the time as a function of $n$) is $O(f(n))$. For example if our function printed out the elements of a list, the number of steps we take is proportional to the number of elements in the list (linear scaling), so $T(n)$ is $O(n)$.
+We generally write the complexity in terms of "Big-O" notation, e.g. $T(n)$ (the time as a function of $n$) is $O(f(n))$. For example if our function printed out the elements of a list, the number of steps we take is proportional to the number of elements in the list (linear scaling), so $T(n)$ is $O(n)$.

When talking about complexity, we only want to capture information about how the time or space scales as $n$ becomes large i.e. asymptotically as $n \rightarrow \infty$. As a result, we only care about dominant terms. Furthermore, we are only interested in the functional form of the scaling, not the absolute amount of time, so any constant factors are ignored.

-- Any cubic is $O(n^3)$, any quadratic is $O(n^2)$ etc. regardless of lower order polynomial terms.
-- $\log(n)$ is subdominant to $n$.
+- Any cubic is $O(n^3)$, any quadratic is $O(n^2)$ etc. regardless of lower order polynomial terms.
+- $\log(n)$ is subdominant to $n$.
- Constant overheads (additive constants to the time) are ignored as they don't scale with $n$.
-- A function is $O(1)$ if it doesn't scale with $n$ at all.
+- A function is $O(1)$ if it doesn't scale with $n$ at all.
- $O(1) < O(log(n)) < O(n) < O(n log(n)) < O(n^2) < O(n^3) < O(2^n)$
-- Algorithms whose time complexity is $O(p(n))$, where $p(n)$ is a polynomial, or better are called _polynomial time algorithms_.
+- Algorithms whose time complexity is $O(p(n))$, where $p(n)$ is a polynomial, or better are called _polynomial time algorithms_.

We can also understand algorithms made of smaller parts, for example:

-- If an algorithm calculates $f(n)$ which is $O(n^3)$ then $g(n)$ which is $O(n^2)$, then the complexity of the algorithm is $O(n^3)$ since caculating $g(n)$ will become subdominant.
-- If we make $n$ calls to a function $f(n)$, and $f(n)$ is $O(g(n))$, then the complexity is $O(n g(n))$. For example, making $n$ calls to a quadric-scaling function would lead to a cubic, i.e. $O(n^3)$, algorithm.
-  - Nested loops and recursions are key areas of your program to look at to see if complexity is piling up!
+- If an algorithm calculates $f(n)$ which is $O(n^3)$ and then $g(n)$ which is $O(n^2)$, then the complexity of the algorithm is $O(n^3)$ since calculating $g(n)$ will become subdominant.
+- If we make $n$ calls to a function $f(n)$, and $f(n)$ is $O(g(n))$, then the complexity is $O(n g(n))$. For example, making $n$ calls to a quadratic-scaling function would lead to a cubic, i.e. $O(n^3)$, algorithm.
+  - Nested loops and recursions are key areas of your program to look at to see if complexity is piling up!
- Recursions or other kinds of branching logic can lead to recurrence relations: the time to calculate a problem can be expressed in terms of the time to calculate a smaller problem. This recurrence relation is directly linked to the complexity:
-  - If $T(n) \sim 2 \times T(\frac{n}{2})$ then $T(n)$ is linear.
-  - If $T(n) \sim 4 \times T(\frac{n}{2})$ then $T(n)$ is quadric.
-  - If $T(n) \sim k + T(\frac{n}{2})$ for constant $k$ then $T(n)$ is logarithmic.
-  - See the [Master Theorem](https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)) for more on this if you're interested!
-  - This is related to our intuition about how the time scales with $n$: if we know that a algorithm is $O(n^2)$ for example, we then know that as the input gets sufficiently large, the time taken to calculate the output will quadruple as $n$ doubles.
+  - If $T(n) \sim 2 \times T(\frac{n}{2})$ then $T(n)$ is linear.
+  - If $T(n) \sim 4 \times T(\frac{n}{2})$ then $T(n)$ is quadratic.
+  - If $T(n) \sim k + T(\frac{n}{2})$ for constant $k$ then $T(n)$ is logarithmic.
+  - See the [Master Theorem](https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)) for more on this if you're interested!
+  - This is related to our intuition about how the time scales with $n$: if we know that an algorithm is $O(n^2)$ for example, we then know that as the input gets sufficiently large, the time taken to calculate the output will quadruple as $n$ doubles.
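To see one of these recurrences in action, consider summing a range by recursing on each half. This is our own illustrative sketch (not from the course repository): each call makes two half-size calls plus a constant amount of work, so $T(n) \sim 2 \times T(\frac{n}{2})$, which is linear.

```cpp
#include <iostream>
#include <vector>

// Recursively sum v[lo, hi) by splitting the range in half.
// Recurrence: T(n) ~ 2 T(n/2) + O(1), which solves to O(n).
double RangeSum(const std::vector<double> &v, std::size_t lo, std::size_t hi)
{
    if (hi - lo == 0) return 0.0;    // base case: empty range
    if (hi - lo == 1) return v[lo];  // base case: single element
    std::size_t mid = lo + (hi - lo) / 2;
    return RangeSum(v, lo, mid) + RangeSum(v, mid, hi);  // two half-size calls
}

int main()
{
    std::vector<double> v{1.0, 2.0, 3.0, 4.0, 5.0};
    std::cout << RangeSum(v, 0, v.size()) << "\n";  // prints 15
}
```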
## Algorithm Analysis: Sorting Algorithm Examples

-Let's take a look at two sorting algorithms and try to understand something about their complexity.
+Let's take a look at two sorting algorithms and try to understand something about their complexity.

### **Insertion Sort**

-Insertion sort is one of the easiest sorting methods. We'll sort from smallest to largest. We'll show an _out of place_ implementation because it is easier to visualise, but this can be performed _in place_ (i.e. changing the elements of the list as we go instead of putting them in a new list).
+Insertion sort is one of the easiest sorting methods. We'll sort from smallest to largest. We'll show an _out of place_ implementation because it is easier to visualise, but this can be performed _in place_ (i.e. changing the elements of the list as we go instead of putting them in a new list). A compact in-place version is sketched after the walkthrough below.

-1. It starts with an unsorted list, and in our case an empty buffer to place the elements into. This buffer will store the sorted list at the end and has the property that *it is correctly sorted at every step*. Elements are placed into the output list one by one. Since it starts empty, the first element can go straight in.
+1. It starts with an unsorted list, and in our case an empty buffer to place the elements into. This buffer will store the sorted list at the end and has the property that _it is correctly sorted at every step_. Elements are placed into the output list one by one. Since it starts empty, the first element can go straight in.

-![image](img/insert_sort_1.jpg)
+   ![image](img/insert_sort_1.jpg)

-2. When placing the next element we compare with each element in the list in turn (starting with the end of the list) to see if it is smaller or larger. If the element is smaller (like here) then we can insert our element.
+2. When placing the next element we compare with each element in the list in turn (starting with the end of the list) to see if it is smaller or larger. If the element is smaller (like here) then we can insert our element.

-![image](img/insert_sort_2.jpg)
+   ![image](img/insert_sort_2.jpg)

-3. We then try to insert the next element, again comparing with the last element of the output list. If we find a value that is bigger than the one we want to insert then we need to move along to the next element.
+3. We then try to insert the next element, again comparing with the last element of the output list. If we find a value that is bigger than the one we want to insert then we need to move along to the next element.

-![image](img/insert_sort_4.jpg)
+   ![image](img/insert_sort_4.jpg)

-4. Once we've found one smaller we know where to insert the new element. All the elements in the output list to the right will need to be shifted along to make room.
+4. Once we've found one smaller we know where to insert the new element. All the elements in the output list to the right will need to be shifted along to make room.

-![image](img/insert_sort_3.jpg)
+   ![image](img/insert_sort_3.jpg)

-5. In order to insert this in the right place, we have to move the larger element along one. (This would also mean moving along any elements to the right of this one!)
+5. In order to insert this in the right place, we have to move the larger element along one. (This would also mean moving along any elements to the right of this one!)

-![image](img/insert_sort_5.jpg)
+   ![image](img/insert_sort_5.jpg)

-6. We can then insert our input element in the space and move on to the next element in the unsorted list. Again we compare until we find a smaller value; in this case we get all the way to the start of the list. Now all of the other elements need to be shifted along in order to insert this element.
+6. We can then insert our input element in the space and move on to the next element in the unsorted list. Again we compare until we find a smaller value; in this case we get all the way to the start of the list. Now all of the other elements need to be shifted along in order to insert this element.

-![image](img/insert_sort_6.jpg)
+   ![image](img/insert_sort_6.jpg)

-7. We continue until the all elements have been inserted into the output list. Since the output list is always sorted, as soon as we have inserted all the elements we are done.
+7. We continue until all the elements have been inserted into the output list. Since the output list is always sorted, as soon as we have inserted all the elements we are done.
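As promised, here is a minimal in-place sketch of the same idea (our illustration, not a reference implementation from the course repository); the inner `while` loop does the comparing and shifting from steps 2-5 above.

```cpp
#include <iostream>
#include <vector>

// In-place insertion sort: the elements to the left of i are always sorted.
void InsertionSort(std::vector<double> &v)
{
    for (std::size_t i = 1; i < v.size(); ++i)
    {
        double value = v[i];
        std::size_t j = i;
        // Shift larger elements along to make room (steps 4-5 above).
        while (j > 0 && v[j - 1] > value)
        {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = value;  // insert into the gap
    }
}

int main()
{
    std::vector<double> v{3.0, 1.0, 4.0, 1.5, 2.0};
    InsertionSort(v);
    for (double x : v) std::cout << x << " ";  // 1 1.5 2 3 4
    std::cout << "\n";
}
```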
What is the time complexity of this? Well, the performance of this algorithm depends on the pattern of the input! There are always $n$ insertions but:

-- If the input is already sorted, then every step involves just one comparison. (In the case of an out of place algorithm, an insertion/copy; for an in place sort then this is skipped.) This is therefore linear in the size of the list i.e. $O(n)$.
-- If the output is reverse sorted, then every step involves comparing with every element of the output list and shifting every element as well! The output list grows by one each time so this is proportional to $1 + 2 + 3 + ... = \frac{n^2 + n}{2}$ and therefore $O(n^2)$.
+- If the input is already sorted, then every step involves just one comparison. (In the case of an out of place algorithm, an insertion/copy; for an in place sort then this is skipped.) This is therefore linear in the size of the list i.e. $O(n)$.
+- If the input is reverse sorted, then every step involves comparing with every element of the output list and shifting every element as well! The output list grows by one each time so this is proportional to $1 + 2 + 3 + ... = \frac{n^2 + n}{2}$ and therefore $O(n^2)$.

-These are the best case and worse case scenarios. In practice, the average case is still $O(n^2)$, and in general we are usually most concerned with our _worst case_ complexity. Nevertheless, if you have _nearly sorted_ data you can have good performance for an algorithm like this, and it is important to understand how your algorithms are impacted by patterns in the data that you are working on.
+These are the best case and worst case scenarios. In practice, the average case is still $O(n^2)$, and in general we are usually most concerned with our _worst case_ complexity. Nevertheless, if you have _nearly sorted_ data you can have good performance for an algorithm like this, and it is important to understand how your algorithms are impacted by patterns in the data that you are working on.

### **Merge Sort**

Merge sort is a "divide and conquer" algorithm: the idea is that we can easily build a sorted list by merging two shorter sorted lists each containing half of the elements.

-1. Merging two sorted lists is linear in the size of the list. At each step we compare the head of the two lists.
+1. Merging two sorted lists is linear in the size of the list. At each step we compare the heads of the two lists.

-![image](img/merge_sort_1.jpg)
+   ![image](img/merge_sort_1.jpg)

-2. Whichever is smaller goes in the output list and then we make the next compare the next pair and insert the smaller and so on.
+2. Whichever is smaller goes in the output list; we then compare the next pair of heads and insert the smaller, and so on.

-![image](img/merge_sort_2.jpg)
+   ![image](img/merge_sort_2.jpg)

-3. This is obviously done after $n-1$ comparisons and $n$ insertions so this part is $O(n)$ in the combined size of the lists.
+3. This is done after at most $n-1$ comparisons and $n$ insertions so this part is $O(n)$ in the combined size of the lists.

-Naturally a list with only one element is sorted, and so we can turn this into a recursive sorting algorithm using the list with one element as a base case.
+Naturally a list with only one element is sorted, and so we can turn this into a recursive sorting algorithm using the list with one element as a base case.

1. The list is recursively divided until you have single elements. Then each of these is merged pair-wise to make sorted lists of size 2. (Merging all the lists involves $O(n)$ operations just as discussed above.)

-![image](img/merge_sort_7.jpg)
+   ![image](img/merge_sort_7.jpg)

-2. These lists are merged pairwise at each level of recursion until we have the fully sorted list.
+2. These lists are merged pairwise at each level of recursion until we have the fully sorted list.

-![!image](img/merge_sort_9.jpg)
+   ![image](img/merge_sort_9.jpg)

-The complexity of this algorithm does not depend on any patterns in the list: a sorted, reverse sorted, or unsorted list still involves the same number of comparisons and merges.
+The complexity of this algorithm does not depend on any patterns in the list: a sorted, reverse sorted, or unsorted list still involves the same number of comparisons and merges.

-Each round of merging takes $O(n)$ operations, so we need to know how many rounds there are. We split the list in half at each level of recursion, and we can only do this $\sim log_2(n)$ times before we have lists of size 1, therefore the total complexity is $O(n log(n))$.
+Each round of merging takes $O(n)$ operations, so we need to know how many rounds there are. We split the list in half at each level of recursion, and we can only do this $\sim log_2(n)$ times before we have lists of size 1, therefore the total complexity is $O(n log(n))$. A recursive sketch in code follows the points below.

- $O(n log(n))$ is actually optimal asymptotic behaviour for comparison sorting a list!
-- $O(n log(n))$ is sometimes called "quasi-linear".
+- $O(n log(n))$ is sometimes called "quasi-linear".
- We can also approach this problem using a recurrence relation: $T(n) \sim O(n) + 2T(\frac{n}{2})$.
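Here is the promised sketch (ours, for illustration; a production merge sort would typically merge in place or reuse buffers rather than allocating new vectors at every level):

```cpp
#include <iostream>
#include <vector>

// Merge two sorted lists into one sorted list: O(n) in the combined size.
std::vector<double> Merge(const std::vector<double> &a, const std::vector<double> &b)
{
    std::vector<double> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())
        out.push_back(a[i] <= b[j] ? a[i++] : b[j++]);  // take the smaller head
    while (i < a.size()) out.push_back(a[i++]);         // drain the leftovers
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}

// Recursive merge sort: T(n) ~ O(n) + 2 T(n/2), i.e. O(n log(n)).
std::vector<double> MergeSort(const std::vector<double> &v)
{
    if (v.size() <= 1) return v;  // base case: already sorted
    std::size_t mid = v.size() / 2;
    std::vector<double> left(v.begin(), v.begin() + mid);
    std::vector<double> right(v.begin() + mid, v.end());
    return Merge(MergeSort(left), MergeSort(right));
}

int main()
{
    std::vector<double> v{3.0, 1.0, 4.0, 1.5, 2.0};
    for (double x : MergeSort(v)) std::cout << x << " ";  // 1 1.5 2 3 4
    std::cout << "\n";
}
```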
### *Comparison*

- Merge sort has better asymptotic behaviour than insertion sort in the average and worst case.
-- Insertion sort performance can depend strongly on the data.
-- Merge sort has the same best, average, and worst case complexity so is very predictable.
-- Merge sort has higher overheads due to all those recursive function calls.
-- As a result, we know that merge sort will eventually beat insertion sort as long as $n$ becomes large enough, but insertion sort may provide better performance for small lists.
-  - Another popular algorithm is _quicksort_, which has $O(n log(n))$ behaviour in the average case (and is usually faster than merge sort), but $O(n^2)$ behaviour in the worst case. Selecting the best algorithms is not always obvious!
+- Insertion sort performance can depend strongly on the data.
+- Merge sort has the same best, average, and worst case complexity so is very predictable.
+- Merge sort has higher overheads due to all those recursive function calls.
+- As a result, we know that merge sort will eventually beat insertion sort as long as $n$ becomes large enough, but insertion sort may provide better performance for small lists.
+  - Another popular algorithm is _quicksort_, which has $O(n log(n))$ behaviour in the average case (and is usually faster than merge sort), but $O(n^2)$ behaviour in the worst case. Selecting the best algorithms is not always obvious!

## The Complexity of a Problem: Matrix Multiplication

- As well as analysing the performance of a specific algorithm, one can look at the inherent complexity of a problem itself: with what asymptotic behaviour is it _possible_ to solve a problem? When discussing the instrinsic complexity of a problem, the complexity of best solution we have provides an upper bound since we know we can do it _at least that well_, although we don't know if we could do better. Getting more precise knowledge of the inherent complexity of many problems is an active area of research. (And if you can solve the $P=NP$ problem [you get $1,000,000!](https://en.wikipedia.org/wiki/Millennium_Prize_Problems))
+ As well as analysing the performance of a specific algorithm, one can look at the inherent complexity of a problem itself: with what asymptotic behaviour is it _possible_ to solve a problem? When discussing the intrinsic complexity of a problem, the complexity of the best solution we have provides an upper bound since we know we can do it _at least that well_, although we don't know if we could do better. Getting more precise knowledge of the inherent complexity of many problems is an active area of research. (And if you can solve the $P=NP$ problem [you get $1,000,000!](https://en.wikipedia.org/wiki/Millennium_Prize_Problems))

Let's take as an example the problem of matrix multiplication, an extremely common operation in scientific computing. What is the complexity of matrix multiplication? What algorithms are available to us and how do they get used in practice?

-### The Naïve Algorithm
+### The Naïve Algorithm

We can define the elements of a product of two matrices as

@@ -149,35 +149,35 @@ $C_{ij} = A_{ik} B_{kj}$.

The simplest way to implement this is to iterate over $i$ and $j$, and for each element in the product matrix you perform a dot product between the $i$-th row of $A$ and the $j$-th column of $B$ (which iterates over $k$).
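As a sketch (our illustration; `Matrix` is just an alias we introduce here, assuming square matrices stored as nested vectors of doubles):

```cpp
#include <iostream>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Naive product of two N x N matrices: three nested loops over i, j and k,
// each with N values, so N^3 multiplications in total.
Matrix MatMul(const Matrix &A, const Matrix &B)
{
    std::size_t N = A.size();
    Matrix C(N, std::vector<double>(N, 0.0));
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                C[i][j] += A[i][k] * B[k][j];  // dot product of row i with column j
    return C;
}

int main()
{
    Matrix A{{1, 2}, {3, 4}}, B{{5, 6}, {7, 8}};
    Matrix C = MatMul(A, B);
    std::cout << C[0][0] << " " << C[0][1] << "\n"   // 19 22
              << C[1][0] << " " << C[1][1] << "\n";  // 43 50
}
```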
- Assuming our matrices are $N \times N$: - - There are $N^2$ elements to do this for in the product matrix. - - Calculating each element requires $N$ multiplications. - - Total number of multiplications is therefore $N^3$. - - We can see this immediately because we have nested iterations over $i$, $j$, and $k$, each of which has $N$ values. - - Nothing else (e.g. memory read/writes) exceeds $N^3$ behaviour either. ($N^2$ writes, up to $N^3$ reads.) + - There are $N^2$ elements to do this for in the product matrix. + - Calculating each element requires $N$ multiplications. + - Total number of multiplications is therefore $N^3$. + - We can see this immediately because we have nested iterations over $i$, $j$, and $k$, each of which has $N$ values. + - Nothing else (e.g. memory read/writes) exceeds $N^3$ behaviour either. ($N^2$ writes, up to $N^3$ reads.) -So from a simple solution we can see that matrix multiplication is $O(n^3)$. It can't be _worse_ than asymptotically $n^3$, but it could be _better_, since we haven't shown that it is bounded from below by $n^3$. +So from a simple solution we can see that matrix multiplication is $O(n^3)$. It can't be _worse_ than asymptotically $n^3$, but it could be _better_, since we haven't shown that it is bounded from below by $n^3$. -We _can_ say that matrix multiplication must be bounded from below by $n^2$ (written as $\Omega(n^2)$ when it's a _lower_ bound; this is also defined formally below) since it needs to fill out $n^2$ values in the output matrix! +We _can_ say that matrix multiplication must be bounded from below by $n^2$ (written as $\Omega(n^2)$ when it's a _lower_ bound; this is also defined formally below) since it needs to fill out $n^2$ values in the output matrix! -### (Asymptotically) Better Matrix Multiplication +### (Asymptotically) Better Matrix Multiplication The most common improved matrix multiplication algorithm is the _Strassen algorithm_. We won't go into the details of how the linear algebra works here, but we simply note: - The Strassen algorithm divides each matrix into four (roughly equal) sub-matrices. -- Normally a matrix multiplication could be calculated using these sub-matrices by calculating 8 small matrix products. This wouldn't save time: the naïve method is $O(n^3)$ and each matrix has size $\sim \frac{n}{2}$, so each sub-matrix multiplication is 8 times as fast, but there are also 8 matrix multiplications to do! -- By combining some of these sub-matrices through additions and subtractions (which are $O(n^2)$) we can actually express the output matrix with just 7 small matrix products! (Again, some addition and subtraction required to build the output matrix after the products are calculated.) -- The additions and subtractions are negligible because there are a fixed number and they are $O(n^2)$ and therefore sub-dominant. +- Normally a matrix multiplication could be calculated using these sub-matrices by calculating 8 small matrix products. This wouldn't save time: the naïve method is $O(n^3)$ and each matrix has size $\sim \frac{n}{2}$, so each sub-matrix multiplication is 8 times as fast, but there are also 8 matrix multiplications to do! +- By combining some of these sub-matrices through additions and subtractions (which are $O(n^2)$) we can actually express the output matrix with just 7 small matrix products! (Again, some addition and subtraction required to build the output matrix after the products are calculated.) 
+- The additions and subtractions are negligible because there are a fixed number of them and they are $O(n^2)$ and therefore sub-dominant.
- Doing this recursively for sub-matrices as well leads to an algorithm which is $\sim O(n^{2.8...})$.
-  - Extra: To prove this you can write the number of operations as a recurrence relation $T(n) = 7T(\frac{n}{2}) + O(n^2)$ and apply the [Master Theorem](https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)). Again the recommended texts for this week are a good place to start!
+  - Extra: To prove this you can write the number of operations as a recurrence relation $T(n) = 7T(\frac{n}{2}) + O(n^2)$ and apply the [Master Theorem](https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)). Again the recommended texts for this week are a good place to start!

This may seem like a small gain, but it becomes increasingly important as matrices get large, and large matrix multiplication can be a bottleneck for many numerical codes! We should bear in mind:

-- This algorithm has more overheads than normal matrix multiplication. Asymptotic improvement does not mean it's better for all sizes of problems! For small matrices, the additional overheads of Strassen multiplication make it slower than the straightforward method. As a result, most implementations actually transition to regular matrix multiplication when the sub-matrices get to be small enough in size that this becomes more efficient. This kind of behavioural change is very common when optimising for performance.
-- The Strassen algorithm is less numerically stable than the simple method, although not so much so as to be an issue in typical applications. This is again a common trade off with code optimised for speed, and something to bear in mind if you know that you have to deal with unstable edge cases.
+- This algorithm has more overheads than normal matrix multiplication. Asymptotic improvement does not mean it's better for all sizes of problems! For small matrices, the additional overheads of Strassen multiplication make it slower than the straightforward method. As a result, most implementations actually transition to regular matrix multiplication when the sub-matrices get to be small enough in size that this becomes more efficient. This kind of behavioural change is very common when optimising for performance.
+- The Strassen algorithm is less numerically stable than the simple method, although not so much so as to be an issue in typical applications. This is again a common trade-off with code optimised for speed, and something to bear in mind if you know that you have to deal with unstable edge cases.
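To make the seven-products trick concrete, here is the single-level scheme written out on scalars. This is a hedged illustration only: in the real algorithm each entry below would be a sub-matrix and the scheme would be applied recursively.

```cpp
#include <array>
#include <iostream>

// One level of Strassen's scheme on a 2x2 product: 7 multiplications
// instead of 8, at the cost of extra additions and subtractions.
// Layout of each array: {a11, a12, a21, a22}.
std::array<double, 4> Strassen2x2(const std::array<double, 4> &A,
                                  const std::array<double, 4> &B)
{
    double m1 = (A[0] + A[3]) * (B[0] + B[3]);
    double m2 = (A[2] + A[3]) * B[0];
    double m3 = A[0] * (B[1] - B[3]);
    double m4 = A[3] * (B[2] - B[0]);
    double m5 = (A[0] + A[1]) * B[3];
    double m6 = (A[2] - A[0]) * (B[0] + B[1]);
    double m7 = (A[1] - A[3]) * (B[2] + B[3]);
    return {m1 + m4 - m5 + m7,   // c11
            m3 + m5,             // c12
            m2 + m4,             // c21
            m1 - m2 + m3 + m6};  // c22
}

int main()
{
    std::array<double, 4> A{1, 2, 3, 4}, B{5, 6, 7, 8};
    for (double c : Strassen2x2(A, B)) std::cout << c << " ";  // 19 22 43 50
    std::cout << "\n";
}
```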
-So is this the best known performance for matrix multiplication? Not quite! The best known algorithm for matrix multiplication is $\sim O(n^{2.37...})$, but this is unusable in practice. It is an example of what is called a _galactic algorithm_; algorithms which have better asymptotic behaviour, but whose overheads are so large that they actually perform worse for any input that could physically fit on any plausible machine. It is a good reminder that asymptotic behaviour isn't everything!
+So is this the best known performance for matrix multiplication? Not quite! The best known algorithm for matrix multiplication is $\sim O(n^{2.37...})$, but this is unusable in practice. It is an example of what is called a _galactic algorithm_: an algorithm with better asymptotic behaviour, but whose overheads are so large that it actually performs worse for any input that could physically fit on any plausible machine. It is a good reminder that asymptotic behaviour isn't everything!

## Formal Definition: "Big O" Notation

@@ -188,47 +188,47 @@ $f(n)$ is $O(g(n))$ if and only if there exists some minimum value $n_0$ and som

Let's break down what this means:

- **A function $f(n)$ is $O(g(n))$ if, as $n$ tends to infinity, $f(n)$ is bounded from above by $g(n)$ multiplied by an arbitrary constant.**
-- Because we only require that the comparison holds for all inputs larger than some arbitrarily large value $n_0$, we are looking at _asymptotic behaviour_. We say that $g(n)$ is an **asymptotic upper bound** on $f(n)$.
-  - Practically speaking this means we are generally only interested in _leading terms_. For a quadratic $an^2 + bn + c$, the linear and constant terms will always be overwhelmed by the quadratic term in the asymptotic case, and so are irrelevant. All quadratics are $O(n^2)$, regardless of the values of $a$, $b$, and $c$ (as long as $a \ne 0$).
-  - For example: $\mid an^2 + bn + c \mid \le (a+1)n^2$ will always hold for $n$ large enough so that $n^2 > \mid bn + c \mid$, which we know must exist for any coefficients $b$ and $c$.
+- Because we only require that the comparison holds for all inputs larger than some arbitrarily large value $n_0$, we are looking at _asymptotic behaviour_. We say that $g(n)$ is an **asymptotic upper bound** on $f(n)$.
+  - Practically speaking this means we are generally only interested in _leading terms_. For a quadratic $an^2 + bn + c$, the linear and constant terms will always be overwhelmed by the quadratic term in the asymptotic case, and so are irrelevant. All quadratics are $O(n^2)$, regardless of the values of $a$, $b$, and $c$ (as long as $a \ne 0$).
+  - For example: $\mid an^2 + bn + c \mid \le (a+1)n^2$ will always hold for $n$ large enough so that $n^2 > \mid bn + c \mid$, which we know must exist for any coefficients $b$ and $c$.
- We have an arbitrary constant factor $\alpha$, so constant factors in $f(n)$ are irrelevant. $n^2$ and $50n^2$ are both $O(n^2)$. We are only interested in the **way that the output scales**, not the actual size. This will be a very important point to remember when applying algorithms in practice!
- For this definition to make sense and be useful, the function $g(n)$ must be asymptotically non-negative.
-- This can also be written treating $O(g(n))$ as the set of functions that asymptotically bounded by $g(n)$.
-  - $f(n) \in O(g(n)) \iff \exists m, \alpha . \forall n > m . \mid f(n) \mid \le \alpha g(n)$.
-  - Texts can vary quite a bit in notation e.g. $f(n) \in g(n)$ or $f(n) = g(n)$ (a bit of an abuse of notation since it is not an equivalence relation, but fairly common in practice), and in mathematical formality.
+- This can also be written treating $O(g(n))$ as the set of functions that are asymptotically bounded by $g(n)$.
+  - $f(n) \in O(g(n)) \iff \exists m, \alpha . \forall n > m . \mid f(n) \mid \le \alpha g(n)$.
+  - Texts can vary quite a bit in notation e.g. $f(n) \in O(g(n))$ or $f(n) = O(g(n))$ (a bit of an abuse of notation since it is not an equivalence relation, but fairly common in practice), and in mathematical formality.

-We can also write $f(n)$ is $\Omega(g(n))$ if $g(n)$ is an **asymptotic lower bound** of $f(n)$. This is a reciprocal relationship with $O(n)$ i.e.
+We can also write $f(n)$ is $\Omega(g(n))$ if $g(n)$ is an **asymptotic lower bound** of $f(n)$. This is a reciprocal relationship with $O$, i.e.

-$f(n)$ is $O(g(n))$ if and only if $g(n)$ is $\Omega(f(n))$.
+$f(n)$ is $O(g(n))$ if and only if $g(n)$ is $\Omega(f(n))$.

There is also a stronger condition: $f(n)$ is $\Theta(g(n))$ if and only if $f(n)$ is $O(g(n))$ **and** $g(n)$ is $O(f(n))$.

-- This means that $g(n)$ is an **asymptotically tight bound** on $f(n)$, because as $n$ tends to infinity $f(n)$ is bounded by a constant multiple of $g(n)$, and $g(n)$ is bounded by a constant multiple of $f(n)$.
-- Put another way, there are two constants $\alpha$ and $\beta$ for which, as $n$ tends to infinity, $f(n) \ge \alpha g(n)$ and $f(n) \le \beta g(n)$. As such, $f(n)$ is bounded from above and below by multiples of $g(n)$.
-  - $f(n)$ is $\Theta(g(n))$ if and only if $f(n)$ is $O(g(n))$ and $f(n)$ is $\Omega(g(n))$.
+- This means that $g(n)$ is an **asymptotically tight bound** on $f(n)$, because as $n$ tends to infinity $f(n)$ is bounded by a constant multiple of $g(n)$, and $g(n)$ is bounded by a constant multiple of $f(n)$.
+- Put another way, there are two constants $\alpha$ and $\beta$ for which, as $n$ tends to infinity, $f(n) \ge \alpha g(n)$ and $f(n) \le \beta g(n)$. As such, $f(n)$ is bounded from above and below by multiples of $g(n)$.
+  - $f(n)$ is $\Theta(g(n))$ if and only if $f(n)$ is $O(g(n))$ and $f(n)$ is $\Omega(g(n))$.

This relationship is symmetric: $f(n)$ is $\Theta(g(n))$ if and only if $g(n)$ is $\Theta(f(n))$.

-Asymptotically tight bounds are particularly useful for understanding the behaviour as a function scales.
-Take our quadratic example from before: $an^2 + bn + c \in O(n^2)$ is clearly true, but looking at the definition of $O(g(n))$ we can also say that $an^2 + bn + c \in O(n^3)$, since $n^3$ is asymptotically larger than any quadratic function and therefore acts as an upper bound. The same would be true of many functions which grow faster than quadratics! Likewise $O(n^2)$ includes anything that grows slower than a quadratic, such as a linear function. To say that our quadratic is $\Theta(n^2)$ is much stricter, as it says that our function grows as fast as $n^2$ and no faster than $n^2$ (up to multiplicative constants).
+Asymptotically tight bounds are particularly useful for understanding the behaviour as a function scales.
+Take our quadratic example from before: $an^2 + bn + c \in O(n^2)$ is clearly true, but looking at the definition of $O(g(n))$ we can also say that $an^2 + bn + c \in O(n^3)$, since $n^3$ is asymptotically larger than any quadratic function and therefore acts as an upper bound. The same would be true of many functions which grow faster than quadratics! Likewise $O(n^2)$ includes anything that grows slower than a quadratic, such as a linear function. To say that our quadratic is $\Theta(n^2)$ is much stricter, as it says that our function grows as fast as $n^2$ and no faster than $n^2$ (up to multiplicative constants).
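Spelled out in the same quantifier style as the set definition of $O(g(n))$ above, the tight-bound condition reads:

$$f(n) \in \Theta(g(n)) \iff \exists m, \alpha, \beta \ . \ \forall n > m \ . \ \alpha g(n) \le \mid f(n) \mid \le \beta g(n)$$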
## Why use Big-O Notation for Algorithms and Computational Problems?

-For algorithmic analysis, the functions that we are interested in are the time and space usage of an algorithm as a function of its input size.
+For algorithmic analysis, the functions that we are interested in are the time and space usage of an algorithm as a function of its input size.

-- Time usage is usually understood in terms of the number of "steps" that an algorithm needs to reach a result. How exactly that translates into time in the real world depends on how long each kind of operation takes to do (e.g. memory read, comparison, additions etc.), but these are multiplicative factors.
-  - Note that sometimes things which might appear to be a simple step are more complex. For example, if performing additions with arbitrary precision integers then the time it takes to perform the addition will vary with the size of the number! If using fixed precision then this is not an issue because you know that e.g. a standard `int` is 4 bytes, and so even if the addition is optimised in some way to add smaller numbers quicker they are still bounded by the time it would take to operate on 4 bytes.
-- Space is usually a bit easier to understand as we can reason more directly about the amount of information that we have to store.
-  - When analysing space complexity we do not include the input itself, just any memory that must be allocated for working on.
+- Time usage is usually understood in terms of the number of "steps" that an algorithm needs to reach a result. How exactly that translates into time in the real world depends on how long each kind of operation takes to do (e.g. memory read, comparison, additions etc.), but these are multiplicative factors.
+  - Note that sometimes things which might appear to be a simple step are more complex. For example, if performing additions with arbitrary precision integers then the time it takes to perform the addition will vary with the size of the number! If using fixed precision then this is not an issue because you know that e.g. a standard `int` is 4 bytes, and so even if the addition is optimised in some way to add smaller numbers quicker they are still bounded by the time it would take to operate on 4 bytes.
+- Space is usually a bit easier to understand as we can reason more directly about the amount of information that we have to store.
+  - When analysing space complexity we do not include the input itself, just any additional memory that must be allocated to work on it.
- What we mean by the "input size" can be ambiguous. Traditionally it can be more rigorously defined in terms of tape size on a Turing machine (which we won't have time to cover!) or bits on a computer, but in practice people may typically reason with the kinds of intuitive values mentioned before which would correspond with the input size.
-Big-O, $\Omega$, and $\Theta$ all capture information about algorithm performance without knowing too much detail about how it is physically performed on a computer: things like the exact amount of time for particular operations, the differences in how memory is divided up for reading and writing, and so on get absorbed into multiplicative factors or additive overheads. Big-O notation captures something more fundamental about the way that problems scale. Even things like modern CPUs doing multiple arithmetic operations in parallel don't affect the computational complexity of an algorithm, since there are still a fixed number of operations than can happen concurrently and therefore this can't contribute more than a constant factor.
+Big-O, $\Omega$, and $\Theta$ all capture information about algorithm performance without knowing too much detail about how it is physically performed on a computer: things like the exact amount of time for particular operations, the differences in how memory is divided up for reading and writing, and so on get absorbed into multiplicative factors or additive overheads. Big-O notation captures something more fundamental about the way that problems scale. Even things like modern CPUs doing multiple arithmetic operations in parallel don't affect the computational complexity of an algorithm, since there are still a fixed number of operations that can happen concurrently and therefore this can't contribute more than a constant factor.

-Take for example a trivial summation example:
+Take for example this trivial summation:

```cpp=
double SumVector(const vector<double> &v)
@@ -242,20 +242,19 @@ double SumVector(const vector<double> &v)
}
```

-- We're interested here in how we scale with the number of elements in the list, so we'll call this $n$.
-- There is one addition operation for each element in the list, so $n$ operations total. Time complexity is $\Theta(n)$ i.e. _linear_.
-- Regardless of the size of the list, we only allocate one `double` (`sum`) for this function, so the space complexity is $\Theta(1)$ i.e. _constant_.
+- We're interested here in how we scale with the number of elements in the list, so we'll call this $n$.
+- There is one addition operation for each element in the list, so $n$ operations total. Time complexity is $\Theta(n)$ i.e. _linear_.
+- Regardless of the size of the list, we only allocate one `double` (`sum`) for this function, so the space complexity is $\Theta(1)$ i.e. _constant_.

For **algorithmic analysis** we can often determine asymptotically tight bounds because we know exactly how an algorithm will behave.

## Summary of Complexity in Practice

-- It's good to be aware of the complexity bounds on problems and algorithms that you will be working with, like matrix multiplication, matrix inversion, data lookups in different kinds of structures etc.
-  - Functions in the standard C++ library will often have known complexity given in the documentation, for example accessing an [ordered map](https://cplusplus.com/reference/map/map/operator[]/) is $O(log(n))$ and accessing an [unordered map](https://cplusplus.com/reference/unordered_map/unordered_map/operator[]/) is $O(1)$ in the average case and $O(n)$ in the worst case.
-- Be aware when you are writing code of how your program scales. Think about things like nested loops and recursions and how their contributions combine.
-- Algorithms with high complexity can become bottlenecks for your programs if their inputs become large.
-- More efficient algorithms can often be a trade-off between time and accuracy or other desirable properties.
+- It's good to be aware of the complexity bounds on problems and algorithms that you will be working with, like matrix multiplication, matrix inversion, data lookups in different kinds of structures etc.
+  - Functions in the standard C++ library will often have known complexity given in the documentation, for example accessing an [ordered map](https://cplusplus.com/reference/map/map/operator[]/) is $O(log(n))$ and accessing an [unordered map](https://cplusplus.com/reference/unordered_map/unordered_map/operator[]/) is $O(1)$ in the average case and $O(n)$ in the worst case. (A short example of the two containers follows this list.)
+- Be aware when you are writing code of how your program scales. Think about things like nested loops and recursions and how their contributions combine.
+- Algorithms with high complexity can become bottlenecks for your programs if their inputs become large.
+- More efficient algorithms can often be a trade-off between time and accuracy or other desirable properties.
- Algorithms with good asymptotic performance often perform less well on small problems; don't waste time or sacrifice accuracy to use these kinds of methods on small data where they won't benefit!
- Some algorithms have good average case complexity but very bad performance in special cases. Make sure that you're aware of any patterns that you expect in your data and that your chosen algorithm is not impacted by them.
- **Well implemented methods often switch between algorithms depending on the size and nature of the data input to get the best performance.**
\ No newline at end of file
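To make the map example concrete, here is a minimal comparison of the two standard containers (our illustration; the keys and values are arbitrary):

```cpp
#include <iostream>
#include <map>
#include <string>
#include <unordered_map>

int main()
{
    // Tree-based map: lookups are O(log(n)) in the number of keys.
    std::map<std::string, int> ordered{{"alice", 1}, {"bob", 2}};

    // Hash-based map: lookups are O(1) on average, O(n) in the worst case.
    std::unordered_map<std::string, int> hashed{{"alice", 1}, {"bob", 2}};

    std::cout << ordered["bob"] << " " << hashed["bob"] << "\n";  // 2 2
}
```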
diff --git a/07performance/sec02Memory.md b/07performance/sec02Memory.md
index e879acdd5..a717b9a24 100644
--- a/07performance/sec02Memory.md
+++ b/07performance/sec02Memory.md
@@ -6,16 +6,16 @@ Estimated Reading Time: 30 minutes

# Memory

-Managing memory efficiently can be an important part of achieving peak performance. In this section we'll talk a bit about the basic model of how data is stored and accessed, and what this means for software development.
+Managing memory efficiently can be an important part of achieving peak performance. In this section we'll talk a bit about the basic model of how data is stored and accessed, and what this means for software development.

## Memory Bound Problems

When considering the efficiency of solving a problem on a computer, two classifications can sometimes be useful:

-- **Compute bound** problems are those for which the main work or primary bottleneck is the number of compute steps required to complete the algorithm. For example, performing lots of arithmetic operations on a piece of data.
-- **Memory bound** problems are those for which our main concern is the time spent accessing (reading or writing) memory. For example, copying or overwriting a large piece of data.
+- **Compute bound** problems are those for which the main work or primary bottleneck is the number of compute steps required to complete the algorithm. For example, performing lots of arithmetic operations on a piece of data.
+- **Memory bound** problems are those for which our main concern is the time spent accessing (reading or writing) memory. For example, copying or overwriting a large piece of data.

-A straight-forward example of a memory bound problem would be a matrix transposition, $M^T_{ij} = M_{ji}$. This problem doesn't require any direct calculations to be done on the elements themselves, just to read the elements from one location and place them in another.
+A straightforward example of a memory bound problem would be a matrix transposition, $M^T_{ij} = M_{ji}$. This problem doesn't require any direct calculations to be done on the elements themselves, just to read the elements from one location and place them in another.

To keep things simple, let's look at this "out of place" matrix transpose:

@@ -33,63 +33,62 @@ void Transpose(vector<vector<double>> &A, vector<vector<double>> &B)
}
```

-- "Out of place" means that the result is placed in a new matrix rather than over-writing the original.
+- "Out of place" means that the result is placed in a new matrix rather than over-writing the original.
- This algorithm is almost entirely composed of memory read/write operations!
-- In order to understand how to optimise a memory bound problem like this, we have to first understand the structure of memory in our machine.
+- In order to understand how to optimise a memory bound problem like this, we have to first understand the structure of memory in our machine.
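The diff above leaves the body of `Transpose` unchanged, so it is not shown; for reference, the loop under discussion looks roughly like this (a sketch assuming the `vector<vector<double>>` signature in the hunk header, with `B` pre-sized to the transposed shape - the actual body in the repository may differ in details):

```cpp
#include <vector>

using std::vector;

// Out-of-place transpose: fills B row by row (sequential writes), which
// means it reads A column by column (strided reads) - the access pattern
// analysed later in this section. Assumes B is pre-sized to the
// transposed shape of A.
void Transpose(vector<vector<double>> &A, vector<vector<double>> &B)
{
    for (std::size_t i = 0; i < B.size(); ++i)
    {
        for (std::size_t j = 0; j < B[i].size(); ++j)
        {
            B[i][j] = A[j][i];
        }
    }
}
```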
## The Memory Hierarchy Data is stored in a computer in different forms and different places. Generally, the bigger your memory store, the slower it is to access! This trade-off is typical for developing on many architectures from PCs to specialised accelerated hardware. On a typical computer you're likely to have: - **Persistent Memory** - - This is the largest store of data and includes things like your hard disk. Writing to this kind of memory has the highest overheads, but the data remains intact without power and is typically on the scale of tens of GB or more. + - This is the largest store of data and includes things like your hard disk. Writing to this kind of memory has the highest overheads, but the data remains intact without power and is typically on the scale of tens of GB or more. - **System Memory** - - This includes on-chip RAM and ROM. - - ROM is permanent, and read-only, and generally not of much interest to software developers since it handles things like instructions for basic I/O and loading the operating system: things that we don't (and can't) mess with! - - RAM is generally volatile memory, meaning that it requires power to be maintained: if you turn off your computer you will typically lose what is in RAM. Usually on the scale of a few GB. Contains the current program and data in use. - - Stack: Stack memory is usually $\lesssim 10$ MB, assigned by the operating system when the program is launched. Stack memory cannot be increased while the program is running. Contains data for currently open scopes i.e. the function currently being executed and any hierarchy of functions which are calling it and therefore have not terminated yet. Very large pieces of data or very deep call trees (e.g. excessively deep recursion) can cause a _stack overflow_, where the stack runs out of memory. Stack memory is generally faster than Heap memory. - - Heap: The Heap is a larger pool of memory, also assigned by the operating system at the program launch, but the heap size allocated to a program can grow dynamically as long as there is enough space left in RAM. Memory access tends to be slower than for the stack, but can be used for larger datasets. - - Cache: Very small pieces of memory designed to be very fast. Cache structure is hardware dependent, but three levels of caching is typical, ranging from kB to MB (with the smallest cache layer being fastest to access). Cache memory stores chunks of data from locations accessed recently in memory. + - This includes on-chip RAM and ROM. + - ROM is permanent, and read-only, and generally not of much interest to software developers since it handles things like instructions for basic I/O and loading the operating system: things that we don't (and can't) mess with! + - RAM is generally volatile memory, meaning that it requires power to be maintained: if you turn off your computer you will typically lose what is in RAM. Usually on the scale of a few GB. Contains the current program and data in use. + - Stack: Stack memory is usually $\lesssim 10$ MB, assigned by the operating system when the program is launched. Stack memory cannot be increased while the program is running. Contains data for currently open scopes i.e. the function currently being executed and any hierarchy of functions which are calling it and therefore have not terminated yet. Very large pieces of data or very deep call trees (e.g. excessively deep recursion) can cause a _stack overflow_, where the stack runs out of memory. Stack memory is generally faster than Heap memory. 
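As a tiny illustration of the stack/heap distinction (our example; the sizes are arbitrary):

```cpp
#include <vector>

void Example()
{
    double small[16];                  // fixed-size local array: lives on the stack
    std::vector<double> big(1000000);  // the vector's element storage is on the heap
    small[0] = 1.0;
    big[0] = 1.0;
}   // stack storage is reclaimed automatically here; the vector frees its
    // heap allocation in its destructor

int main() { Example(); }
```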
This structure has implications for our understanding of memory bound problems:

-- Problems which use very large datasets may be more likely to be memory bound, as larger data stores are less efficient to access. Accesses to large stores should be minimised, and data being worked on should be moved to faster memory.
+- Problems which use very large datasets may be more likely to be memory bound, as larger data stores are less efficient to access. Accesses to large stores should be minimised, and data being worked on should be moved to faster memory.
- Algorithms which jump around erratically in memory also result in slower memory accesses. Architectures are usually optimised for sequential or localised accesses to some extent, for example memory addresses close to those recently accessed are more likely to be in the cache.
-- Working on a piece of data as fully as possible before accessing another location will limit the number of memory accesses over all.
+- Working on a piece of data as fully as possible before accessing another location will limit the number of memory accesses overall.

-## Cache structure
+## Cache structure

-It's not always possible to limit the number of memory accesses that we make, but we may be able to make choices about our memory access patterns to maximise our usage of our fastest memory. In this case, we'll consider an example where our object is stored entirely in on-chip RAM, but we want to make effective use of the cache. First, we'll need to understand a bit about how the cache works.
+It's not always possible to limit the number of memory accesses that we make, but we may be able to make choices about our memory access patterns to maximise our usage of our fastest memory. In this case, we'll consider an example where our object is stored entirely in on-chip RAM, but we want to make effective use of the cache. First, we'll need to understand a bit about how the cache works.

-- When data is requested, the cache is checked first to see if it is present. If it is, it can take the data straight from the cache, which is much faster than going into RAM.
-- If there are multiple cache levels, it searches from the smallest (and fastest) cache to the largest (and slowest).
-- If it's not is any cache (a cache "miss") it will fetch the value from RAM.
-- When data is looked up in system memory, that data is stored in the cache.
-  - If the cache is full or there is data already in the location that the system wants to cache the new data, then some data in the cache is overwritten. (Where the data is cached and therefore which data is overwritten depends on the cache-mapping strategy and will be hardware dependent.)
-- Data is added to the cache in blocks of a fixed size (which is hardware dependent). If the variable we wanted to read is smaller than this block size then some neighbouring data will end up in the cache as well.
-  - If we read an element from an array or vector for example, which store data contiguously, that means that some surrounding elements will also end up in the cache.
-  - We can then read close by elements from the cache quickly without system memory accesses until you reach the limits of the copied block!
+- When data is requested, the cache is checked first to see if it is present. If it is, it can take the data straight from the cache, which is much faster than going into RAM.
+- If there are multiple cache levels, it searches from the smallest (and fastest) cache to the largest (and slowest).
+- If it's not in any cache (a cache "miss") it will fetch the value from RAM.
+- When data is looked up in system memory, that data is stored in the cache.
+  - If the cache is full or there is data already in the location that the system wants to cache the new data, then some data in the cache is overwritten. (Where the data is cached and therefore which data is overwritten depends on the cache-mapping strategy and will be hardware dependent.)
+- Data is added to the cache in blocks of a fixed size (which is hardware dependent). If the variable we wanted to read is smaller than this block size then some neighbouring data will end up in the cache as well.
+  - If we read an element from an array or vector, for example, both of which store their data contiguously, some surrounding elements will also end up in the cache.
+  - We can then read nearby elements from the cache quickly without system memory accesses until we reach the limits of the copied block!

-Taking advantage of these blocks of memory in the cache is the key to writing efficient memory bound algorithms: if we use them wisely we can avoid a lot of calls to system memory and replace them with much quicker calls to the cache.
+Taking advantage of these blocks of memory in the cache is the key to writing efficient memory bound algorithms: if we use them wisely we can avoid a lot of calls to system memory and replace them with much quicker calls to the cache.

-### Using the Cache effectively
+### Using the Cache effectively

We know now that:

-- Reading the same memory address (e.g. accessing the same variable), or reading nearby memory addresses (e.g. elements in a vector) is faster than jumping around in memory.
-  - This suggests that we should break problems down into sizes that will fit in the cache, and then work on them until we don't need that data any more before moving on (if we can).
-- The structure and size of the cache, and the size of the blocks loaded into the cache from memory, are all system dependent.
-  - This suggests that over-optimising for the cache is a bad idea: if we design code especially for the specifications of the cache on our machine, it will not be very portable to other machines! We should try to make algorithms that will exploit the cache well but are ideally not dependent on the exact size.
+- Reading the same memory address (e.g. accessing the same variable), or reading nearby memory addresses (e.g. elements in a vector) is faster than jumping around in memory. (The short timing sketch after this list demonstrates the difference.)
+  - This suggests that we should break problems down into sizes that will fit in the cache, and then work on them until we don't need that data any more before moving on (if we can).
+- The structure and size of the cache, and the size of the blocks loaded into the cache from memory, are all system dependent.
+  - This suggests that over-optimising for the cache is a bad idea: if we design code especially for the specifications of the cache on our machine, it will not be very portable to other machines! We should try to make algorithms that will exploit the cache well but are ideally not dependent on the exact size.
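Here is the timing sketch referred to above: it sums the same row-major matrix twice, once walking along rows (sequential addresses) and once down columns (stride-$n$ jumps). The absolute numbers are machine dependent, but the row-wise loop is typically several times faster (our illustration; `n` is arbitrary):

```cpp
#include <chrono>
#include <iostream>
#include <vector>

int main()
{
    const std::size_t n = 4096;
    std::vector<double> m(n * n, 1.0);  // an n x n matrix in row major order
    double sum = 0.0;

    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)      // row-wise: sequential addresses,
        for (std::size_t j = 0; j < n; ++j)  // so each cached block is fully used
            sum += m[i * n + j];
    auto rowTime = std::chrono::steady_clock::now() - start;

    start = std::chrono::steady_clock::now();
    for (std::size_t j = 0; j < n; ++j)      // column-wise: stride-n jumps,
        for (std::size_t i = 0; i < n; ++i)  // likely a cache miss per element
            sum += m[i * n + j];
    auto colTime = std::chrono::steady_clock::now() - start;

    std::cout << sum << "\n"
              << "row-wise:    " << std::chrono::duration<double>(rowTime).count() << " s\n"
              << "column-wise: " << std::chrono::duration<double>(colTime).count() << " s\n";
}
```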
An algorithm which exploits the cache but which does not depend on the exact details of the cache is called a _cache oblivious algorithm_. Some good patterns for cache oblivious algorithms include:

- Tiling: breaking the problem into small chunks.
-- Recursion can be a good way to make your solution cache oblivious. Recursion expresses the solution in terms of solutions to smaller sub-problems, down to a base case. The cache will start to be used effectively once the size of the sub-problems start to fit inside the cache, which means you don't have to tune the algorithm to the size of the cache to take advantage of it.
+- Recursion can be a good way to make your solution cache oblivious. Recursion expresses the solution in terms of solutions to smaller sub-problems, down to a base case. The cache will start to be used effectively once the size of the sub-problems start to fit inside the cache, which means you don't have to tune the algorithm to the size of the cache to take advantage of it.
-- Stencil algorithms are algorithms which calculate a value at a data point based on the values around it in a grid (common in simulations of e.g. flows) and fit naturally into efficient memory structures provided the stencil moves sensibly through memory.
+- Stencil algorithms are algorithms which calculate a value at a data point based on the values around it in a grid (common in simulations of e.g. flows) and fit naturally into efficient memory structures provided the stencil moves sensibly through memory.
- Rearrange data in memory to fit your access patterns. For example a matrix may be stored with elements in the same row next to each other (row major order) _or_ with elements in the same column next to each other (column major order). Accessing memory in sequence will take advantage of your cache well regardless of the size of your cache.
-
-## Efficiently Cached Algorithm Example: Matrix Transposition
+## Efficiently Cached Algorithm Example: Matrix Transposition

Let's take a look again at our example of a memory bound problem, matrix transposition, and see how it can be impacted by good and bad use of the cache. Let's start with our simple matrix transpose code and see how it might behave:

@@ -106,61 +105,62 @@ void Transpose(vector<vector<double>> &A, vector<vector<double>> &B)
}
}
```

-We'll assume that our matrices are in row major order, so rows in each matrix are contiguous in memory, and we will be focusing just on reading the data from the source matrix, and ignoring writing the operations to the output matrix, since the output matrix will be filled in order so that part of the algorithm is already cache efficient. (If they were in column major order the logic would be the same except exchanging write for read: reading the source matrix would be cache efficient, but writing the output matrix woudl be inefficient.)
-This is an illustrative example using a single cache of very small capacity; we won't concern ourselves with the exact cache-mapping strategy since this varies, but will just fill in our cache in order. In the diagrams _red_ blocks will be blocks in system memory but not in the cache, and _blue_ blocks are data which are also stored in the cache.
+We'll assume that our matrices are in row major order, so rows in each matrix are contiguous in memory, and we will be focusing just on reading the data from the source matrix, ignoring the writes to the output matrix, since the output matrix will be filled in order so that part of the algorithm is already cache efficient. (If they were in column major order the logic would be the same except exchanging write for read: reading the source matrix would be cache efficient, but writing the output matrix would be inefficient.)
+
+This is an illustrative example using a single cache of very small capacity; we won't concern ourselves with the exact cache-mapping strategy since this varies, but will just fill in our cache in order. In the diagrams _red_ blocks will be blocks in system memory but not in the cache, and _blue_ blocks are data which are also stored in the cache.

1. The first element we read is `A[0][0]`. Our cache is empty at the moment so this results in a cache miss.

-![image](img/CacheTranspose1.png)
+   ![image](img/CacheTranspose1.png)

-2. The block of data containing `A[0][0]` is therefore read from RAM and copied into the cache, which now also contains `A[0][1]` and `A[0][2]` etc.
+2. The block of data containing `A[0][0]` is therefore read from RAM and copied into the cache, which now also contains `A[0][1]` and `A[0][2]` etc.

-![image](img/CacheTranspose2.png)
+   ![image](img/CacheTranspose2.png)

-3. The next value we read is `A[1][0]`. This also results in a cache miss if the matrix is too large for more than one row to fit in a single block in the cache (which is likely as cache blocks are very small).
+3. The next value we read is `A[1][0]`. This also results in a cache miss if the matrix is large enough that a single cache block cannot hold more than one row (which is likely, as cache blocks are very small).

-![image](img/CacheTranspose3.png)
+   ![image](img/CacheTranspose3.png)

4. The block containing `A[1][0]`, `A[1][1]` ... is copied to the cache.

-![image](img/CacheTranspose4.png)
+   ![image](img/CacheTranspose4.png)

-5. This sequence of cache misses and copies continues and eventually the cache is filled.
+5. This sequence of cache misses and copies continues and eventually the cache is filled.

-![image](img/CacheTranspose5.png)
+   ![image](img/CacheTranspose5.png)

-6. When we try to read the next element, we once again have a cache miss, but now in order to add it into the cache we must replace an earlier entry.
+6. When we try to read the next element, we once again have a cache miss, but now in order to add it into the cache we must replace an earlier entry.

-![image](img/CacheTranspose6.png)
+   ![image](img/CacheTranspose6.png)

-7. Eventually we will have read through the entire first column, and will start on the second column to read `A[0][1]`. This was added into our cache in step 2, but if the matrix is sufficiently large (or if there are clashes because of the cache-mapping strategy) then by the time we return to read this value it will have been overwritten in the cache, resulting in yet another cache miss!
+7. Eventually we will have read through the entire first column, and will start on the second column to read `A[0][1]`. This was added into our cache in step 2, but if the matrix is sufficiently large (or if there are clashes because of the cache-mapping strategy) then by the time we return to read this value it will have been overwritten in the cache, resulting in yet another cache miss!
-![image](img/CacheTranspose7.png)
+   ![image](img/CacheTranspose7.png)

-8. This process continues on for the whole matrix, in this case missing the cache and making a call to system memory for every single element. Since this problem is clearly memory bound, this will have a large impact on the performance of this algorithm by slowing down all of our memory accesses.
+8. This process continues for the whole matrix, in this case missing the cache and making a call to system memory for every single element. Since this problem is clearly memory bound, slowing down all of our memory accesses like this has a large impact on the performance of the algorithm.

-![image](img/CacheTranspose8.png)
+   ![image](img/CacheTranspose8.png)

We can solve this problem by dividing our matrix up into smaller sub-matrices which do fit into the cache. In this toy example where we only have four slots in our cache, we'll just transpose the $4 \times 4$ sub-matrix $A_{0 ... 3, 0...3}$. (In reality you can store more than this in a cache but then the diagram would get very cluttered indeed!)

1. The algorithm will start as before, with a series of cache misses.

-![image](img/CacheTranspose9.png)
+   ![image](img/CacheTranspose9.png)

2. However in this case we stop moving down the column before we overwrite any existing matrix data in our cache. So when we come to read `A[0][1]` it is still present in the cache! In fact the rest of this small matrix is cached, so we proceed with 12 cache hits after our first four cache misses: a major improvement in memory performance!

-![image](img/CacheTranspose10.png)
+   ![image](img/CacheTranspose10.png)

-3. We can then repeat this process for each small sub matrix within our main matrix and achieve the same ratio of hits to misses throughout.
+3. We can then repeat this process for each small sub-matrix within our main matrix and achieve the same ratio of hits to misses throughout.

## Optional note: Virtual Memory and Paging

-Memory addresses used by pointers in C++ are in fact pointers to _virtual memory_: an abstract model of memory, but not the _physical_ memory addresses themselves. This is important because the memory used by a program is actually set by the operating system (remember that your program is assigned stack and heap memory at the start), so in order for our program to work regardless of what memory space we're given we can't refer to explicit physical memory addresses. Instead it has to refer to these virtual memory addresses which are then translated into the real memory addresses by the OS. This can have some consequences because addresses which are contiguous in _virtual memory_ are not necessarily contiguous in physical memory!
+Memory addresses used by pointers in C++ are in fact pointers to _virtual memory_: an abstract model of memory, but not the _physical_ memory addresses themselves. This is important because the memory used by a program is actually set by the operating system (remember that your program is assigned stack and heap memory at the start), so in order for our program to work regardless of what memory space we're given, we can't refer to explicit physical memory addresses. Instead our program refers to virtual memory addresses, which are then translated into the real memory addresses by the OS. This can have some consequences, because addresses which are contiguous in _virtual memory_ are not necessarily contiguous in physical memory!
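+
+We can only ever observe these virtual addresses from within our program. A small illustrative sketch (the printed values will of course be system dependent):
+
+```cpp
+#include <iostream>
+#include <vector>
+
+int main()
+{
+    std::vector<double> v(4, 0.0);
+
+    // These are virtual addresses: consecutive elements are contiguous in
+    // virtual memory, but the OS may back the underlying pages with
+    // non-contiguous physical memory.
+    for (std::size_t i = 0; i < v.size(); ++i)
+    {
+        std::cout << "&v[" << i << "] = " << &v[i] << "\n";
+    }
+}
+```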
-Memory (in RAM or on disk) is generally _paged_, which means stored in blocks of a particular size (4kB is common). Pages in virtual memory can be translated into pages in physical memory, with some overhead to resolve the page location and usually some latency to access it (which will depend on the kind of memory you are accessing). If a an area of virtual memory, for example storage of a vector, crosses more than one page, these pages may not be contiguous in physical memory (even if they are in virtual memory).
+Memory (in RAM or on disk) is generally _paged_, which means stored in blocks of a particular size (4kB is common). Pages in virtual memory can be translated into pages in physical memory, with some overhead to resolve the page location and usually some latency to access it (which will depend on the kind of memory you are accessing). If an area of virtual memory, for example the storage of a vector, crosses more than one page, these pages may not be contiguous in physical memory (even if they are in virtual memory).

-If your data is not well aligned with the pages, then you can end up doing unnecessary additional work to resolve extra pages. Similar to how cache efficient algorithms work, some algorithms (such as B-trees, see the _Introduction to Algorithms_ book in the recommended texts for a great discussion of these!)) which deal with very large data on disk will work with one page at a time to minimise hopping from page to page. Sometimes alignment is even more important, as some accelerated devices require memory to be aligned with the pages in order to be streamed to / from the device. If the memory is not aligned, it can be copied into a new, aligned memory location which is expensive for large datasets. Page resolutions can also be made more efficient if we can force memory to be allocated contiguously in _physical memory_, which can also be useful for streaming to such devices.
+If your data is not well aligned with the pages, then you can end up doing unnecessary additional work to resolve extra pages. Similar to how cache efficient algorithms work, some algorithms which deal with very large data on disk (such as B-trees; see the _Introduction to Algorithms_ book in the recommended texts for a great discussion of these!) will work with one page at a time to minimise hopping from page to page. Sometimes alignment is even more important, as some accelerated devices require memory to be aligned with the pages in order to be streamed to / from the device. If the memory is not aligned, it can be copied into a new, aligned memory location, which is expensive for large datasets. Page resolution can also be made more efficient if we can force memory to be allocated contiguously in _physical memory_, which can also be useful for streaming to such devices.

-If strictly necessary, these memory properties can be forced by using OS specific commands, although standard C++ does have methods for declaring aligned memory.
-If you are interested, [here is an example for FPGAs](https://xilinx.github.io/Vitis-Tutorials/2022-1/build/html/docs/Hardware_Acceleration/Introduction/runtime_sw_design.html), a kind of accelerated hardware, with a discussion of these concepts and how to address them.
\ No newline at end of file
+If strictly necessary, these memory properties can be forced by using OS-specific commands, although standard C++ does have methods for declaring aligned memory.
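+
+For instance, here is a minimal sketch of the standard C++ (C++17) tools for requesting aligned memory: the `alignas` specifier and `std::aligned_alloc`. (The 64-byte alignment used here is just an illustrative choice, matching a typical cache line.)
+
+```cpp
+#include <cstdlib>
+#include <iostream>
+
+int main()
+{
+    // Stack data can be over-aligned with the alignas specifier.
+    alignas(64) double tile[8][8] = {};
+
+    // For heap data, std::aligned_alloc (C++17) returns a pointer whose
+    // address is a multiple of the requested alignment. Note that the total
+    // size must itself be a multiple of the alignment.
+    double *buffer = static_cast<double *>(
+        std::aligned_alloc(64, 1024 * sizeof(double)));
+
+    if (buffer != nullptr)
+    {
+        std::cout << "tile starts at " << &tile[0][0]
+                  << ", buffer starts at " << buffer << "\n";
+        std::free(buffer); // memory from aligned_alloc is released with free
+    }
+}
+```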
+If you are interested, [here is an example for FPGAs](https://xilinx.github.io/Vitis-Tutorials/2022-1/build/html/docs/Hardware_Acceleration/Introduction/runtime_sw_design.html), a kind of accelerated hardware, with a discussion of these concepts and how to address them.
diff --git a/08openmp/01_parallel_programmning.md b/08openmp/01_parallel_programming.md
similarity index 100%
rename from 08openmp/01_parallel_programmning.md
rename to 08openmp/01_parallel_programming.md
diff --git a/08openmp/03_fractal_example.md b/08openmp/03_fractal_example.md
index 41c595182..72a6addc7 100644
--- a/08openmp/03_fractal_example.md
+++ b/08openmp/03_fractal_example.md
@@ -1,5 +1,5 @@
---
-title: "Example 10: Generating a fractal"
+title: "Example: Generating a fractal"
---

## Example 10: Generating a fractal
diff --git a/08openmp/04_cache_performance.md b/08openmp/04_cache_performance.md
new file mode 100644
index 000000000..0749cbc23
--- /dev/null
+++ b/08openmp/04_cache_performance.md
@@ -0,0 +1,184 @@
+---
+title: "Cache Performance in Shared Memory"
+---
+
+# Cache Performance in Shared Memory
+
+The need for cache efficiency hasn't gone away just because we've started parallelising things; in fact, it may be more important than ever! Generally for distributed systems we just need to worry about the cache efficiency of each process in isolation, but if memory is shared then that means our cache gets shared too. The way that our cache behaves when shared is a little different though, so we'll need to re-think how we do things a bit.
+
+As always with the memory system, things will be system dependent, but on a typical CPU:
+
+- General RAM and the largest cache level will be shared between cores which share memory, i.e. there is just one physical RAM and one large physical cache (in my case just the L3 cache) which is accessed by all cores. (Not all cores necessarily access memory with equal bandwidth or latency though.)
+- Each core will have its own copy of the smallest cache level(s); in my case these are the L1 and L2 caches.
+- This keeps access to the small caches quick, but it also creates the need for consistency between the copies of the caches.
+
+  - If I have two cores $C_1$ and $C_2$ which both store a copy of variable `x` in their L1 cache, then when they want to read `x` from memory they just read it from the cache and not from RAM. Likewise when they want to _write_ to `x`, they write to the cache but not RAM.
+  - If $C_1$ changes `x`, it will change the value of `x` in _its own cache_, but not in the $C_2$ cache.
+  - In order for $C_2$ to read the correct value of `x` after the update, it has to find out about the change somehow.
+  - This mechanism will be system dependent, but typically it will involve something being written to a special shared area (possibly part of the L3 cache) when $C_1$ updates `x`. $C_2$ needs to check this, and if `x` has been changed it needs to get the new value of `x`, which will need to be copied over, incurring additional overheads.
+
+- Remember that the cache stores data in blocks of a given size, called "cache lines". The cache lines on my machine for example are 64 bytes or 8 doubles wide.
+- If two L1 caches on different cores both store the same cache line, and one of the cores changes _any value in that line_, then **the entire cache line is invalidated for the other core**.
+
+  - This is extremely important as it means that even if the two cores never operate on the same values, if the values that they operate on are next to each other in memory and stored in the same cache line, then they will still invalidate one another's caches and cause lookups to have to be made.
+
+As a very simple example, let's look at a possible manual implementation of the reduction code shown in last week's class. I'll do this example with no compiler optimisations, to prevent the compiler optimising away any memory read/write operations, so we have full control! As a reminder, the basic code looks like this:
+
+```cpp
+#include <iostream>
+#include <vector>
+#include <cmath>
+#include "timer.hpp"
+#include <omp.h>
+
+using namespace std;
+
+int main() {
+    const int num_threads = omp_get_max_threads();
+
+    double sum = 0.;
+
+    Timer timer;  // Timer comes from the course's timer.hpp helper
+
+#pragma omp parallel for reduction(+:sum)
+    for(long i = 0; i < 100000000; ++i) {
+        sum += 1.0;  // each thread accumulates; OpenMP combines the partial sums
+    }
+
+    cout << "Computed sum = " << sum << " using " << num_threads << " threads" << endl;
+
+    return 0;
+}
+```
diff --git a/10parallel_algorithms/WorkDepth.md b/10parallel_algorithms/WorkDepth.md
new file mode 100644
--- /dev/null
+++ b/10parallel_algorithms/WorkDepth.md
+---
+title: "Work Depth Models"
+---
+
+# Work Depth Models
+
+We can characterise a parallel algorithm by its _work_ $W$, the total number of operations it performs, and its _depth_ $D$, the length of the longest chain of sequential dependencies, i.e. the number of steps it must take even with unlimited parallel processing power. The dependencies between computations can be visualised as a circuit diagram:
+
+![image](images/SimpleCircuit.png)
+
+> In the example above, if the computation `b` takes a long time then the computations `c`, `d`, and `e` could all complete and the computation `f` will be waiting for `b` even though the other chains are longer. Usually, however, we are dealing with chains of computations representing similar operations (such as arithmetic operations), so the limiting factor becomes the longest chain; this is not always the case though, and the circuit diagram just represents the dependencies of each piece of the computation.
+
+As with our analysis of serial algorithms, it is more useful to use Big-O notation to look at how parallelism _scales_, in which case constant factors become irrelevant.
+
+Let's look at some concrete examples to understand what we mean. We will look at three related operations on lists: _map_, _reduce_, and _scan_.
+
+## Map
+
+Map applies a (pure) function to every element of a list:
+
+$\text{map} (f, [x_0, ..., x_n]) = [f(x_0), ..., f(x_n)]$.
+
+We can see that every element of the output is independent of every other, since each only depends on one element of input.
+
+![image](images/MapCircuit.png)
+
+- $W \in O(n)$
+  - The work is just proportional to the number of elements because we do one computation per element.
+- $D \in O(1)$
+  - The depth does not scale with the size of the input because all elements can be processed in parallel.
+
+Constant depth is a feature of so called "embarrassingly parallel" problems, where all computations are independent and you can just throw more computing power at them to speed them up. (With some caveats, but this is an idealised algorithm analysis!)
+
+## Reduce
+
+Reduce is used to calculate things like sums or products of lists. It applies an associative binary operation to the first pair of elements, then takes the result and applies the binary operation to that result and the next element, and so on until the end of the list, at which point we're left with a single value. For a binary operator $\bigoplus$:
+
+$\text{Reduce}\,(\bigoplus, [x_0, x_1, ..., x_n]) = x_0 \bigoplus x_1 \bigoplus x_2 \bigoplus ... \bigoplus x_n$,
+
+or, in terms of a binary function $f$:
+
+$\text{Reduce}\,(\bigoplus, [x_0, x_1, ..., x_n]) = f(f(...f(f(x_0, x_1), x_2)...), x_n)$.
+
+- Calculating a reduction with $+$ gives the sum of a list, and with $\times$ gives the product of a list.
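+
+As an aside, both of these patterns are available in the C++ standard library, and a minimal sketch may make them concrete: `std::transform` is a map, while C++17's `std::reduce` is a reduction which, unlike the strictly left-to-right `std::accumulate`, is allowed to re-associate the operator.
+
+```cpp
+#include <algorithm>
+#include <functional>
+#include <iostream>
+#include <numeric>
+#include <vector>
+
+int main()
+{
+    std::vector<double> xs{1.0, 2.0, 3.0, 4.0, 5.0};
+
+    // Map: apply a pure function to every element independently.
+    std::vector<double> squares(xs.size());
+    std::transform(xs.begin(), xs.end(), squares.begin(),
+                   [](double x) { return x * x; });
+
+    // Reduce: std::reduce may evaluate the operator pairwise, which is
+    // exactly the freedom that a tree-shaped parallel evaluation needs.
+    double sum = std::reduce(xs.begin(), xs.end(), 0.0);
+    double product = std::reduce(xs.begin(), xs.end(), 1.0, std::multiplies<>{});
+
+    std::cout << "sum = " << sum << ", product = " << product << "\n";
+}
+```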
+
+The simplest approach to a serial reduction would look like this:
+
+```cpp
+// total must be initialised to the identity element of f
+for(const auto &x : v)
+{
+    total = f(total, x);
+}
+```
+
+- This has a **loop dependency**, where each iteration of the loop depends on the result of the previous one.
+- This is why a plain `parallel for` doesn't work well. Instead OpenMP gives us a `reduction` clause. Why does that work?
+
+The trick is that our binary operator is **associative**:
+
+- This means we can change the order in which we apply the operator.
+- We can't change the order of the elements in the list though; that's _commutativity_.
+
+The loop dependency is a consequence of the way that the code was written, not of the problem itself. With an associative operator, we can instead choose to do pairwise operations (e.g. summing a list by pairwise additions), as shown in the diagram below.
+
+![image](images/ReduceCircuit.png)
+
+- $W \in O(n)$
+  - The work is linear in the size of the input list because there are still exactly $n-1$ applications of the operator.
+- $D \in O(\log n)$
+  - The depth is $\log n$ because the number of operations to perform halves at each level of the diagram.
+
+This kind of tree diagram is a common data dependency pattern, and places some limitations on the speed-up of our algorithm compared to our embarrassingly parallel problem. Even with infinite processors, we still can't do better than $O(\log n)$ serial computations!
+
+We can also see as we move down the tree that we have fewer operations to do in parallel at each stage. Depending on the size of this tree and the number of processors that you have, this means processing power may end up sitting idle, which could be reallocated to other tasks elsewhere while this computation is still going on. (This isn't really going to be the case for something as rapid as an addition, but for workflows with similar tree-like structures where computations take a long time, you can end up with resources sitting idle for significant amounts of time as you move down the tree.)
+
+> Floating point `+` and `*` are, unlike real $+$ and $\times$, actually non-associative due to the representation issues that we talked about in week 7. We actually have fewer errors when performing additions pairwise than when performing additions sequentially, because if the inputs are each of similar size then the partial sums will all remain of similar size as well. So this pairwise approach is both better for parallelism **and** better for precision!
+
+> You may also see `foldl` and `foldr`. These are variations for non-associative operators:
+> $\text{foldl} \, (\bigoplus, [x_0, x_1, ..., x_n]) = (...((x_0 \bigoplus x_1) \bigoplus x_2) \bigoplus ... \bigoplus x_n)$
+> $\text{foldr} \, (\bigoplus, [x_0, x_1, ..., x_n]) = (x_0 \bigoplus (x_1 \bigoplus (x_2 \bigoplus (... \bigoplus x_n))))$
+> These data dependencies don't parallelise well because the order of application of the binary operator is fixed!
+
+## Scan
+
+Scan is similar to reduce, but instead of returning a single scalar result it returns a list containing the cumulative results, for example a cumulative sum or cumulative product. Applying a scan with $+$ to the list $[1,2,3,4,5]$ for example would give $[1,3,6,10,15]$.
+
+**Note that we have to be more careful of our order of operations now, since we need to find not just the total, but also the well-defined sequence of sums from left to right.**
+
+This is another case where it _looks_ like we have a severe loop dependency which would force us to work in series, but we can in fact parallelise this algorithm as well, _at the cost of repeating some work_.
+
+- It is not uncommon for parallel algorithms to have to repeat work in order to increase the independence of computations, to allow for them to be done in parallel. This is where a more detailed work-depth analysis can be useful, to understand the trade-offs that we expect to see.
+
+To find a parallel solution to this problem, let's start by using the same pairwise operations that we used for the reduction. We'll look at a summation with 8 elements, and rearrange the diagram so that the outputs that are already computed are clearly shown.
+
+![image](images/PartialScan.png)
+
+- Inputs are marked $x_0 ... x_7$
+- Outputs are marked $S_0 ... S_7$
+
+We can see that the output indices 0, 1, 3, and 7 are computed as part of the pairwise sum process, but that there are gaps in our cumulative sum which still need to be computed. The gaps also get progressively larger; in fact they double in size each time. (We can see the spacing double clearly by noting that, in general, it is the indices $2^i - 1$ that are calculated by the pairwise sums.)
+
+The trick to this algorithm is to work out how to fill in the gaps. Let's start by filling in index 2, the first gap in our sequence.
+
+![image](images/PartialScan2.png)
+
+- We need to add one more addition, which reuses our result for $S_1$ and adds $x_2$.
+
+Now let's try filling in the bigger gap between $S_3$ and $S_7$. We don't want to fill this in sequentially, because then we'll just be back in the territory of serial algorithms and we'll end up with $O(n)$ depth. Instead, we want to construct a tree similar to the one we had before for the pairwise sum, but now with an increasing number of nodes at each step.
+
+![image](images/PartialScan3.png)
+
+- Note that when calculating $S_6$ we've re-used the result that we computed for $S_5$.
+- We can also spot a pattern here: there is a tree structure in the lower half of this graph for each gap that gets filled.
+  - At the lowest level, half of the outputs have an additional computation node.
+  - Above that, only half again need an extra node.
+  - This pattern continues up for larger gaps to fill in.
+
+So to do the parallel scan we find that we have 2 tree structures:
+
+- Each does $O(n)$ work
+- Each has depth $O(\log n)$
+
+As a result this algorithm has:
+
+- $W \in O(n)$
+- $D \in O(\log n)$
+
+So we can substantially improve the depth (and therefore time) over a serial algorithm with sufficient processing power, but we do approximately _double the total work_ of the serial approach.
+
+- As a result of the extra work done, having 2 processors tackle this job (using this approach) is unlikely to be very effective: the time you save doing things in parallel would be roughly cancelled out by all the duplicate work you're doing.
+- With four processors we might expect to get a benefit of roughly a factor of 2 on a large list (so most of our time is spent doing parallel computations), because we'll be doing twice as much work with four times the processing power.
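+
+For completeness, the cumulative-sum operation itself is available in the C++ standard library; here is a minimal sketch using C++17's `std::inclusive_scan` (which, like `std::reduce`, is permitted to re-associate the additions, and so can in principle be evaluated in parallel):
+
+```cpp
+#include <iostream>
+#include <numeric>
+#include <vector>
+
+int main()
+{
+    std::vector<int> xs{1, 2, 3, 4, 5};
+    std::vector<int> sums(xs.size());
+
+    // Computes the running totals [1, 3, 6, 10, 15].
+    std::inclusive_scan(xs.begin(), xs.end(), sums.begin());
+
+    for (int s : sums) {
+        std::cout << s << " ";
+    }
+    std::cout << "\n";
+}
+```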
+
+# Approaches to Parallel Algorithms
+
+## Divide and Conquer
+
+We've talked a little about divide and conquer algorithms in the past, for example in our discussion of _merge sort_ in week 7. These algorithms express the solution of a problem in terms of the results from some independent sub-problems. This naturally leads to a parallelised approach with a tree-like structure.
+
+- We can once again use recurrence relations and the Master Theorem from complexity theory to understand the depth of these algorithms.
+
+## Pipelining
+
+Pipelining is an approach to task-based parallelism where we process a series of data, starting on each piece of data as it arrives.
+
+- Data is provided serially.
+  - It could come from an input stream, e.g. a series of real-time measurements.
+  - It could be produced by some inherently serial computation that can't be parallelised.
+  - Even when reading from memory we can only read a small amount at a time, so sometimes pipelining can be an effective tactic to overlap the memory read/write overheads with the computation itself.
+- We start processing each piece of data as it arrives.
+- We process data in parallel, but pieces of data will be at different stages of the computation.
+  - The "pipeline" refers to the fact that the data is being pushed through a series of computations, with each piece of data following another rather than overlapping.
+
+The simplest scenario is when the processing of different pieces of data is independent. In order to visualise the benefits of pipelining, it is better to use a different kind of diagram, which takes time into account explicitly.
+
+We'll split time into discrete units (for high performance applications this may be individual CPU clock cycles!); in the diagram below time runs along the horizontal direction, with each unit of time being shown in the boxes at the top of the diagram. For this example we'll say that we **receive one piece of data in each unit of time**.
+
+![image](images/Pipeline.png)
+
+- The coloured boxes represent computations which can be performed. Boxes that overlap in the vertical direction happen concurrently.
+- `Get` represents retrieving data from our stream. Because data input is sequential these can't overlap with one another, but they _can_ overlap with data processing.
+- Each piece of data read is fed through the series of functions $f$, $g$, $h$. These overlap in time for different data elements but with an offset.
+
+The time to process a stream of $N$ pieces of data is:
+
+$T = t_0 + (N-1) \times I$
+
+where:
+
+- $t_0$ is the time to process one piece of data. In our example, this is the time for `Get`, $f$, $g$, and $h$, which we can see from the diagram is 9 of our time units.
+- $I$ is the "initiation interval", which is the time delay between starting to process one piece of data and beginning on the next. In this example it is 1, which is the ideal, although there are times when data dependencies can force this to be larger than 1.
+
+We can see how to arrive at this formula: thanks to the overlap of our computations, after the first piece of data finishes (taking $t_0$), we only need another $I$ units of time for each of the $N-1$ remaining pieces of data.
+
+- Pipelines can be implemented using threads on multi-core CPUs.
+- Pipelines are a key form of data parallelism for devices such as ASICs and FPGAs.
+- They are an extremely efficient way of moving from serial processes into parallel computations.
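+
+As a quick sanity check of this formula, here is a tiny sketch using the example's values of $t_0 = 9$ and $I = 1$ (the helper name is just for illustration):
+
+```cpp
+#include <iostream>
+
+// Total time, in abstract time units, to stream N items through a pipeline
+// with end-to-end latency t0 and initiation interval I.
+long pipelineTime(long N, long t0, long I)
+{
+    return t0 + (N - 1) * I;
+}
+
+int main()
+{
+    // One item costs the full latency; each further item costs only I.
+    std::cout << pipelineTime(1, 9, 1) << "\n";    // 9
+    std::cout << pipelineTime(1000, 9, 1) << "\n"; // 1008: roughly 1 unit per item
+}
+```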
+
+### Comparison with Batching
+
+Another approach to processing data in parallel is to load as much data as you can parallelise over, and then process this data all at the same time, instead of using the staggered approach of pipelining. You'd then end up with a workflow that looks like the one below:
+
+![image](images/Batch1.png)
+
+This is the kind of approach that is used for devices such as GPUs.
+
+If you have enough parallel channels to process _all_ of your elements in parallel, then this gives the same total execution time as the perfect pipelining solution: just the time to load all data plus the time to process one element.
+
+The difference between these approaches comes when you have more data than you have parallel channels.
+
+- Perfect pipelining only requires as many parallel channels as there can be data elements overlapping. If we look at our pipelining diagram above, processing one element (`Get`, $f$, $g$, and $h$) takes 9 units of time, and we get a new element every time unit, so at most 9 data processes will ever overlap before the first channel is free to receive data again. This means that, _no matter how many data elements are in our stream_, we can achieve our best pipelining performance with just 9 parallel channels in this example.
+  - This makes pipelines extremely powerful for optimising parallel performance with minimal resources, especially when dealing with very long or potentially indefinite streams.
+- When batching, if there are not enough channels to process everything in one parallel batch, then we need to run multiple processing steps, which will add time. If we have $P$ parallel channels and $N$ data elements, we will have to spend at least $\frac{N}{P} t_0$ time to process our data.
+  - Depending on the device you use, you may be able to keep the time down by reading the next batch of data while processing the current one.
+
+**One should note however that devices such as GPUs manage to be very fast by having both a very large number of parallel channels and very fast hardware.
These kinds of considerations can be more important than pure parallelism, which only analyses what can be done at the same time: always consider the time things would take in the real world!**
\ No newline at end of file
diff --git a/10parallel_algorithms/images/Batch1.png b/10parallel_algorithms/images/Batch1.png
new file mode 100644
index 000000000..9eb020b2e
Binary files /dev/null and b/10parallel_algorithms/images/Batch1.png differ
diff --git a/10parallel_algorithms/images/MapCircuit.png b/10parallel_algorithms/images/MapCircuit.png
new file mode 100644
index 000000000..9e9ffd3b1
Binary files /dev/null and b/10parallel_algorithms/images/MapCircuit.png differ
diff --git a/10parallel_algorithms/images/PartialScan.png b/10parallel_algorithms/images/PartialScan.png
new file mode 100644
index 000000000..f9189c53e
Binary files /dev/null and b/10parallel_algorithms/images/PartialScan.png differ
diff --git a/10parallel_algorithms/images/PartialScan2.png b/10parallel_algorithms/images/PartialScan2.png
new file mode 100644
index 000000000..751588e58
Binary files /dev/null and b/10parallel_algorithms/images/PartialScan2.png differ
diff --git a/10parallel_algorithms/images/PartialScan3.png b/10parallel_algorithms/images/PartialScan3.png
new file mode 100644
index 000000000..c8290d2ff
Binary files /dev/null and b/10parallel_algorithms/images/PartialScan3.png differ
diff --git a/10parallel_algorithms/images/Pipeline.png b/10parallel_algorithms/images/Pipeline.png
new file mode 100644
index 000000000..8f79e013e
Binary files /dev/null and b/10parallel_algorithms/images/Pipeline.png differ
diff --git a/10parallel_algorithms/images/ReduceCircuit.png b/10parallel_algorithms/images/ReduceCircuit.png
new file mode 100644
index 000000000..0ee816c20
Binary files /dev/null and b/10parallel_algorithms/images/ReduceCircuit.png differ
diff --git a/10parallel_algorithms/images/SimpleCircuit.png b/10parallel_algorithms/images/SimpleCircuit.png
new file mode 100644
index 000000000..601e77d22
Binary files /dev/null and b/10parallel_algorithms/images/SimpleCircuit.png differ
diff --git a/10parallel_algorithms/index.md b/10parallel_algorithms/index.md
new file mode 100644
index 000000000..228bb8791
--- /dev/null
+++ b/10parallel_algorithms/index.md
@@ -0,0 +1,9 @@
+---
+title: "Week 10: Work Depth Models and Parallel Strategies"
+---
+
+## Week 10: Overview
+
+This week we'll take a deeper look at how we quantify how parallelisable algorithms are, and discuss some broad strategies for tackling parallel problems. These approaches apply to both shared and distributed memory systems.
+
+1. [Work Depth Models](WorkDepth.html)
diff --git a/_config.yml b/_config.yml
index 3fc1a0f4b..b0e827188 100644
--- a/_config.yml
+++ b/_config.yml
@@ -1,8 +1,8 @@
contact: http://www.ucl.ac.uk/research-it-services/partner-with-us
-title: "PHAS0100: Research Computing with C++"
+title: "COMP0210: Research Computing with C++"
includeleftnav: True
-short: PHAS0100
+short: COMP0210

defaults:
  - scope: {path: ""}
@@ -34,11 +34,6 @@ idio:
  include: [ _static, _sources]
  exclude:
    - 05libraries/sec05CMake*.md
-    - 02cpp1/sec04*.md
-    - 02cpp1/sec05*.md
-    - 02cpp1/sec06*.md
-    - 02cpp1/sec07*.md
-    - 02cpp1/sec03TemplatesIntro.md
    - 03cpppatterns
    - 04HPC
    - 05OpenMP
diff --git a/index.md b/index.md
index c43c61316..9071713a2 100644
--- a/index.md
+++ b/index.md
@@ -19,7 +19,7 @@ We have found in previous years that C++ is no longer commonly taught at undergr
* Arrays and structures
* Basic object oriented design (classes, inheritance, polymorphism)

-This could be obtained through online resources such as the the C++ Essential Training course by Bill Weinman on [LinkedIn Learning](https://www.ucl.ac.uk/isd/linkedin-learning) (accessable using your UCL single sign-on) or via a variety of C++ courses in college, such as [MPHYGB24](https://moodle.ucl.ac.uk).
+This could be obtained through online resources such as the C++ Essential Training course by Bill Weinman on [LinkedIn Learning](https://www.ucl.ac.uk/isd/linkedin-learning) (accessible using your UCL single sign-on) or via a variety of C++ courses in college, such as [MPHYGB24](https://moodle.ucl.ac.uk).

* Eligibility: This course designed for UCL post-graduate students but with agreement of their course tutor a limited number of undegraduate students can also take it.

@@ -27,4 +27,4 @@ This could be obtained through online resources such as the the C++ Essential Tr
Members of doctoral training schools, or Masters courses who offer this module as part of their programme should register through their course organisers.

-This course may not be audited without the prior permission of the course organiser Dr. Jamie Quinn as due to the practical nature of the lectures there is a cap on the total number of students who can enrol.
+This course may not be audited without the prior permission of the course organiser Dr. Jamie Quinn as, due to the practical nature of the lectures, there is a cap on the total number of students who can enrol.