layout | subtitle |
---|---|
lesson | Lesson Design |
- Audience
- Graduate students (PhD students, postdocs, technicians) in numerate disciplines from cosmology to economics
- Who have used the terminal or at least attended shell-novice
- Who are literate in at least one programming language (Python, C/C++, Fortran, ...)
- Constraints
- One full day 09:00-17:00
- 06:30 teaching time
- 1:00 for lunch
- 0:30 total for two coffee breaks
- Learners use native installs on their own machines (ssh session)
- May use logins to local cluster
- Dependence on other Carpentry modules
- In particular, shell-novice
- Exercises will mostly not be "write this code from scratch"
- Want lots of short exercises that can reliably be finished in allotted time
- So use MCQs, fill-in-the-blanks, Parsons Problems, "tweak this code", etc.
- Lesson materials
- Notes for instructors and self-study will be written in Markdown
- We've tried writing/maintaining lessons as Notebooks...
- Learners will be provided with one Notebook per episode containing exercises
- Get learners to the stage described in the "Software" section of
"[Good Enough Practices in Scientific Computing][good-enough]".
- Goals
- Make it easy for people (including your future self) to understand and (re)use your code
- Modular, comprehensible, reusable, and testable all come together
- Rules
- Every analysis step is represented textually (complete with parameter values)
- Every program or script has a brief explanatory comment at the start
- Programs of all kinds (including "scripts") are broken into functions
- No duplication
- Functions and variables have meaningful names
- Dependencies and requirements are explicit (e.g., a requirements.txt file)
- This rule is not covered in this lesson
- Commenting/uncommenting are not routinely used to control program behavior
- Run a simple example or test data set to check whether the code works at all and whether it gives a known correct output for a simple known input
- Submit code to a reputable DOI-issuing repository upon submission of paper, just like data
- This rule is not covered in this lesson
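As a minimal sketch of how several of the rules above might look together in one small script (an explanatory comment at the start, functions with meaningful names, no duplication, and a check against a simple known input). The temperature-conversion task and every name in it are hypothetical and are not taken from the lesson materials:

```python
"""Convert temperature readings from Fahrenheit to Celsius and report their mean.

Hypothetical example script: the task, function names, and readings are
invented for illustration and do not come from the lesson materials.
"""


def fahrenheit_to_celsius(temp_f):
    """Return the Celsius equivalent of a Fahrenheit temperature."""
    return (temp_f - 32.0) * 5.0 / 9.0


def mean_celsius(readings_f):
    """Return the mean of a list of Fahrenheit readings, expressed in Celsius."""
    # Reuse the conversion function instead of repeating the formula.
    readings_c = [fahrenheit_to_celsius(temp_f) for temp_f in readings_f]
    return sum(readings_c) / len(readings_c)


def check_known_values():
    """Check the code against simple inputs whose correct outputs are known."""
    assert fahrenheit_to_celsius(32.0) == 0.0      # freezing point of water
    assert fahrenheit_to_celsius(212.0) == 100.0   # boiling point of water
    assert mean_celsius([32.0, 212.0]) == 50.0


if __name__ == "__main__":
    # Run the quick check first, then the (made-up) analysis itself.
    check_known_values()
    print(mean_celsius([50.0, 68.0]))
```

Running the check before the analysis keeps program behavior under the control of code rather than of commented-out lines.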
- Goals
- Enable them to make sense of other online tutorials and resources
- Midpoint: plot bar chart showing average GDP per continent
- Final: debug and extend a short multi-function program to handle data laid out differently
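The midpoint task could be sketched roughly as follows. This is an illustration under stated assumptions, not the lesson's solution: it uses only the standard library for the averaging step instead of the data frames the lesson targets, the column names are guesses, and the rows are made-up placeholders rather than real GDP figures. The plotting helper additionally assumes matplotlib is installed.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows, NOT real data. The column names
# ("country", "continent", "gdp_per_capita") are assumptions too.
SAMPLE_CSV = """country,continent,gdp_per_capita
Aland,Northia,1000.0
Borland,Northia,3000.0
Cland,Southia,500.0
"""


def mean_gdp_by_continent(csv_file):
    """Return {continent: mean GDP per capita} for the rows in csv_file."""
    gdp_values = defaultdict(list)
    for row in csv.DictReader(csv_file):
        gdp_values[row["continent"]].append(float(row["gdp_per_capita"]))
    return {
        continent: sum(values) / len(values)
        for continent, values in gdp_values.items()
    }


def plot_means(means, output_path):
    """Save a bar chart of the per-continent means (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display needed (e.g. over ssh)
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(means.keys()), list(means.values()))
    ax.set_xlabel("Continent")
    ax.set_ylabel("Mean GDP per capita")
    fig.savefig(output_path)
    plt.close(fig)


means = mean_gdp_by_continent(io.StringIO(SAMPLE_CSV))
```

Calling `plot_means(means, "gdp_by_continent.png")` (a hypothetical file name) would then write the chart to disk. Keeping the averaging separate from the plotting means the numbers can be checked without producing a figure.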
How do I...
- ...read, analyze, and visualize a tabular data set?
- ...process multiple data sets?
- ...tell if my program is working correctly?
- ...fix it when it's not?
- ...find and use software other people have written instead of writing my own?
- Run code interactively
- Run code saved in a file
- Write single-condition `if` statements
- Convert between basic data types (integer, float, string)
- Call built-in functions
- Use `help` and online documentation
- Import a library using an alias
- Call something from an imported library
- Read tabular data into an array or data frame
- Do collective operations on arrays and data frames
- Create simple plots of data in arrays and data frames
- Interpret common error messages
- Track down bugs by running small tests of program modules
- Write non-recursive functions taking a fixed number of named parameters
- Create literate programs in the Jupyter Notebook
- That a program is a piece of lab equipment that implements an analysis
- Needs to be validated/calibrated before/during use
- Makes analysis reproducible, reviewable, shareable
- That programs are written for people, not for computers
- Meaningful variable names
- Modularity for readability as well as re-use
- No duplication
- Document purpose and use
- That there is no magic: the programs they use are no different in principle from those they build
- How to assign values to variables
- What integers, floats, strings, and data frames are
- How to trace the execution of a `for` loop
- How to create and index lists
- How to trace the execution of `if`/`else` statements
- The difference between defining and calling a function
- What a call stack is
- Where to find documentation on standard libraries
- How to find out what else scientific Python offers
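Several of the concepts above (defining versus calling a function, tracing a `for` loop, and the call stack) can be illustrated with a short hand-traced example. The function names and input list below are hypothetical and chosen only for this sketch:

```python
def square(value):
    # Defining a function only stores it under a name; the body does not run yet.
    return value * value


def sum_of_squares(values):
    total = 0
    for value in values:
        # While square() is running, the call stack holds, outermost first:
        # the top-level module code, sum_of_squares, square.
        total = total + square(value)
    return total


# Calling the function is what runs its body.
# Hand trace for [1, 2, 3]: total goes 0 -> 1 -> 5 -> 14.
result = sum_of_squares([1, 2, 3])
```

Tracing the loop on paper before running it gives learners a prediction to compare against the actual value of `result`.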