Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation episode #3

Merged
merged 1 commit into from
Jun 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,7 @@ contact: '[email protected]'

# Order of episodes in your lesson
episodes:
- 2-documentation.md
- 3-testing.md
- 4-ci.md

Expand Down
247 changes: 247 additions & 0 deletions episodes/2-documentation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,247 @@
---
title: Document your research software
teaching: 25
exercises: 30
---

::::::::::::::::::::::::::::::::::::::: objectives

- Know what makes a good documentation
- Learn what tools can be used for writing documentation
- Be able to motivate a balanced decision: sometimes READMEs are absolutely enough

::::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::: questions

- What can I do to make my code more easily understandable?
- What information should go into comments?
- What are docstrings and what information should go into docstrings?

::::::::::::::::::::::::::::::::::::::::::::::::::

:::instructor
Use [these slides](https://lyashevska.github.io/ds-cr-slides/documentation/) as
a guidance.

The main purpose of this lesson is to make sure participants understand that DOCUMENTATION IS IMPORTANT. The goal is more to trigger participants then to teach them all the different ways one could document a project. It is good to communicate this (and that this will give more time for the other parts of the workshop).

:::

## Why we teach this lesson

Specific motivations:

- Code documentation becomes quickly unmanageable if not part of the source code.
- It helps people to quickly use your code thus reducing the time spent to explain over and again to new users.
- It helps people to collaborate.
- It improves the design of your code.

## What makes a good documentation?

::: challenge
### Exercise: Think of good and bad examples
Write down your thoughts in the collaborative documents.
Respond with emojis :+1: :scream_cat: to your colleagues' answers.
- Think of projects of which you like the documentation. What do you like about them?
- Think of projects for which you don’t like the documentation. What don’t you like about them? Are you missing anything?

**NB: You can choose a mature library with lots of users for this exercise, but try to also think of less mature projects you had to collaborate on, or papers you had to reproduce.**

:::: solution
- It is important to document code
- Think about the people reading your documentation (your audience)
- Depending on the purpose and state of the project documentation needs to meet different criteria. 
- For most scientific projects, in-code documentation and a well thought out README file is enough.
- Documentation should be tracked with corresponding code
::::
:::

## Types of documentation
There are different types of documentation:

- README files
- in-code documentation
- API documentation
- Tutorials

We will discuss a few of them in more depth.

## Writing good README files
The README file is the first thing a user/collaborator sees. It should include:

- A descriptive project title
- Motivation (why the project exists)
- How to setup
- Copy-pastable quick start code example
- Link or instructions for contributing
- Recommended citation

::: challenge
### Exercise README: Draft or improve a README for one of your recent projects (in breakout rooms)

Try to draft a brief README or review a README which you have written for one of your projects.

You can work individually, but you could also discuss whether anything can be improved on your neighbour's README file(s).

Think about the user (which can be a future you) of your project, what does this user need to know to use or contribute to the project? And how do you make your project attractive to use or contribute to?

(Optional): Try the https://hemingwayapp.com/ to analyse your README file and make your writing bold and clear.
:::

## In-code documentation

In-code documentation:
- Makes code more understandable
- Explains decisions we made

### When not to use in-code documentation:
- When the code is self-explanatory
- To replace good variable/function names
- To replace version control
- To keep old (zombie) code around

### Readable code vs commented code
```python
# convert from degrees celsius to fahrenheit
def convert(d):
return d * 5 / 9 + 32
```

vs

```python
def celsius_to_fahrenheit(degrees):
return degrees * 5 / 9 + 32
```


::::::::::::::::::::::::::::::::::::::: challenge

## Writing good comments - In-code-1: Comments

Let's take a look at two example comments (comments in Python start with `#`):

**Comment A**

```python
# now we check if temperature is below -50
if temperature < -50:
print("ERROR: temperature is too low")
```

**Comment B**

```python
# we regard temperatures below -50 degrees as measurement errors
if temperature < -50:
print("ERROR: temperature is too low")
```

**Which of these comments is more useful? Can you explain why?**

::::::::::::::: solution

## Solution

+ **Comment A** describes **what** happens in this piece of code. This can be
useful for somebody who has never seen Python or a program, but for somebody
who has, it can feel like a redundant commentary.
+ **Comment B** is probably more useful as it describes **why** this piece of code
is there, i.e. its **purpose**.

:::::::::::::::::::::::::


::::::::::::::::::::::::::::::::::::::::::::::::::

## What are "docstrings" and how can they be useful?

Here is function `fahrenheit_to_celsius` which converts temperature in
Fahrenheit to Celsius.

The first set of examples uses **regular comments**:

```python
# This function converts a temperature in Fahrenheit to Celsius.
def fahrenheit_to_celsius(temp_f: float) -> float:
temp_c = (temp_f - 32.0) * (5.0/9.0)
return temp_c
```

The second set uses **docstrings or similar concepts**. Please compare the two
(above and below):

```py
def fahrenheit_to_celsius(temp_f: float) -> float:
"""
Converts a temperature in Fahrenheit to Celsius.

Parameters
----------
temp_f : float
The temperature in Fahrenheit.

Returns
-------
float
The temperature in Celsius.
"""

temp_c = (temp_f - 32.0) * (5.0/9.0)
return temp_c
```

Docstrings can do a bit more than just comments:
- Tools can generate help text automatically from the docstrings.
- Tools can generate documentation pages automatically from code.

It is common to write docstrings for functions, classes, and modules.

Good docstrings describe:
- What the function does
- What goes in (including the type of the input variables)
- What goes out (including the return type)

**Naming is documentation**:
Giving explicit, descriptive names to your code segments (functions, classes,
variables) already provides very useful and important documentation. In
practice you will find that for simple functions it is unnecessary to add a
docstring when the function name and variable names already give enough
information.

## User/API documentation
* What if a README file is not enough?
* How do I easily create user documentation?

### Tools
You can use the following tools to generate user or API documentation:

#### Sphinx (documentation generator)
- creates nicely-formatted HTML pages out of .md or .rst files
- programming language independent

#### Github pages (deploy your documentation)
- set up inside your GitHub repository
- automatically deploys your Sphinx-generated documentation

::: instructor
You can show the example documentation deployed on GitHub pages here: https://esciencecenter-digital-skills.github.io/good-practices-documentation-example/

Then, you can show that this content comes from simple markdown files, like: https://github.com/esciencecenter-digital-skills/good-practices-documentation-example/blob/main/doc/another-feature.md?plain=1

In addition, you can explain that with a few settings you can automatically generate documentation from docstrings. You can give https://nanopub.readthedocs.io/en/latest/reference/client.html as an example.

:::


:::::::::::::::::::::::::::::::::::::::: keypoints

- Depending on the purpose and state of the project documentation needs to meet different criteria.
- Good README files provide a good landing place for anyone that is new to your project
- Comments should describe the why for your code not the what.
- Writing docstrings can be a good way to write documentation while you type
code since it also makes it possible
to query that information from outside the code or to auto-generate
documentation pages.
::::::::::::::::::::::::::::::::::::::::::::::::::
Loading