Analysis essentials

This is the source material for the analysis essentials website, a series of lessons that help high-energy physics analysts become more comfortable working with the shell, version control, and programming.

The lessons introduce the basics of the bash shell, the git version control system, and the Python programming language. They are developed for and taught during the Starterkit, and aim to teach students enough to be able to follow the experiment-specific lessons that are taught afterwards.

Contributions to the lessons are highly encouraged. Please see the contributing guide for details on how to participate.

{% callout "HSF Software Training" %}

This training module is part of the HSF Software Training Center, a series of training modules that teaches HEP newcomers the software skills they need as they enter the field and, in parallel, instills best practices for writing software.

{% endcallout %}

Prerequisites

There are two options for running these lessons. Running locally should be preferred on Linux and macOS, as it is faster and makes it easier to save your work. On Windows it is likely easier to use Binder; however, care is needed to prevent notebooks from being lost when the server is shut down.

Local

This tutorial uses Python 3.11 and requires some packages. It is recommended to use mambaforge to install the correct packages. Note: mamba works like conda, and the two commands can be used interchangeably. The "forge" in the name refers to conda-forge, the community-maintained open-source channel that hosts a large number of packages.

To install Conda/mamba you will need to do the following:

  • Install mamba according to the instructions here
  • To add mamba/conda to your shell, follow the instructions after the installation and execute
mamba init
  • To stop the base environment from being activated automatically (you should almost never work in it), run
conda config --set auto_activate_base false

Now to use your first Conda/Mamba environment:

  • To make sure that you install all of the packages needed in the tutorial, create the environment from the environment.yml file (make sure that environment.yml is in the current directory):
mamba env create -f environment.yml
  • Alternatively, create an environment with the main packages listed explicitly:
mamba create -n analysis-essentials python=3.11 jupyterlab ipython matplotlib uproot numpy pandas scikit-learn scipy tensorflow xgboost hep_ml wget
  • Activate your environment by doing: mamba activate analysis-essentials
  • You can install additional packages by doing: mamba install package_name
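
Once the environment is activated, a quick check like the following can confirm that the interpreter and the main packages are available. This is a minimal sketch and not part of the lesson material: the `missing_modules` helper is hypothetical, and the import names (for example `sklearn` for scikit-learn and `IPython` for ipython) are assumed from each package's usual import name. `wget` is left out because it is assumed here to be a command-line tool rather than a Python module.

```python
import importlib.util
import sys

# Import names assumed to correspond to the packages installed above.
EXPECTED_MODULES = [
    "jupyterlab", "IPython", "matplotlib", "uproot", "numpy", "pandas",
    "sklearn", "scipy", "tensorflow", "xgboost", "hep_ml",
]


def missing_modules(names):
    """Return the names that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The tutorial targets Python 3.11; warn rather than fail on other versions.
    if sys.version_info[:2] != (3, 11):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Warning: running Python {found}, expected 3.11")

    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Saved under a name such as check_env.py (a placeholder), it can be run with python check_env.py inside the activated environment. Any module it reports can then be added by installing the corresponding package with mamba install.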

You will also need Jupyter to run the examples in this tutorial. Jupyter was already installed by the previous command and can be run by following the instructions here. Note: Jupyter needs Python, which the environment created above provides.

Binder

Click this button: Binder

Usage

You should now be able to use the tutorial.

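  • Clone the repository with git, change into it, and start JupyterLab (with the analysis-essentials environment activated):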
git clone https://github.com/hsf-training/analysis-essentials.git
cd analysis-essentials
jupyter lab

This should open a Jupyter webpage with the current directory displayed. Navigate to one of the lessons to start the tutorial.
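
If the Jupyter page does not show the lessons, it may help to check that the command was started from inside the cloned repository. The sketch below assumes the lesson folder names listed in the table of contents at the end of this README, and that each folder contains a README.md. The names `LESSON_DIRS` and `find_lessons` are hypothetical.

```python
import shutil
from pathlib import Path

# Lesson folders as listed in the table of contents of this README.
LESSON_DIRS = ["python", "advanced-python", "shell", "shell-extras", "snakemake", "git"]


def find_lessons(root="."):
    """Return the lesson folders under root that contain a README.md."""
    root = Path(root)
    return [name for name in LESSON_DIRS if (root / name / "README.md").is_file()]


if __name__ == "__main__":
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("jupyter") is None:
        print("jupyter was not found on PATH; is the environment activated?")

    lessons = find_lessons()
    if lessons:
        print("Lessons found:", ", ".join(lessons))
    else:
        print("No lessons found; run this from inside the analysis-essentials folder.")
```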

If you have any problems or questions, you can open an issue on this repository.

.. toctree::
    :maxdepth: 3
    :includehidden:
    :caption: Contents:

    python/README.md
    advanced-python/README.md
    shell/README.md
    shell-extras/README.md
    snakemake/README.md
    git/README.md
    CONTRIBUTING.md
    CONDUCT.md
    LICENSE.md